Why is dataclasses.asdict(obj) significantly slower than obj.__dict__ in Python?
I am using Python 3.6 with the dataclasses backport package from ericvsmith. It appears that calling dataclasses.asdict(my_dataclass) is around 10x slower than accessing my_dataclass.__dict__:
import dataclasses
from dataclasses import dataclass

@dataclass
class MyDataClass:
    a: int
    b: int
    c: str
I tested this with the following code:
%%time
_ = [MyDataClass(1, 2, "A" * 1000).__dict__ for _ in range(1_000_000)]
CPU times: user 631 ms, sys: 249 ms, total: 880 ms
Wall time: 880 ms
And the following with dataclasses.asdict():
%%time
_ = [dataclasses.asdict(MyDataClass(1, 2, "A" * 1000)) for _ in range(1_000_000)]
CPU times: user 11.3 s, sys: 328 ms, total: 11.6 s
Wall time: 11.7 s
Is this expected behavior? When should I use dataclasses.asdict(obj) instead of obj.__dict__?
Note: Using __dict__.copy() doesn’t make a significant difference:
%%time
_ = [MyDataClass(1, 2, "A" * 1000).__dict__.copy() for _ in range(1_000_000)]
CPU times: user 922 ms, sys: 48 ms, total: 970 ms
Wall time: 970 ms
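For anyone who wants to reproduce this outside a notebook (%%time only works in IPython/Jupyter), here is a minimal sketch of the same three measurements using timeit from the standard library. The helper names via_dict, via_dict_copy, via_asdict and benchmark are made up for illustration, and the default iteration count is lower than the 1,000,000 used above, so absolute numbers will not match mine:

```python
import dataclasses
import timeit
from dataclasses import dataclass


@dataclass
class MyDataClass:
    a: int
    b: int
    c: str


def via_dict():
    # Attribute access only: returns the instance's own dict, no copy.
    return MyDataClass(1, 2, "A" * 1000).__dict__


def via_dict_copy():
    # Shallow copy of the instance dict.
    return MyDataClass(1, 2, "A" * 1000).__dict__.copy()


def via_asdict():
    # Field-by-field conversion performed by the dataclasses module.
    return dataclasses.asdict(MyDataClass(1, 2, "A" * 1000))


def benchmark(number=100_000):
    """Return total seconds spent on `number` calls of each variant."""
    return {
        fn.__name__: timeit.timeit(fn, number=number)
        for fn in (via_dict, via_dict_copy, via_asdict)
    }


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.3f} s")
```

Each variant also constructs the instance inside the timed call, as in the cells above, so the construction cost is included in all three numbers.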
What factors contribute to the performance difference when converting a Python dataclass to a dict?
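One thing that can be checked directly is that the two are not equivalent operations: obj.__dict__ hands back the instance's existing attribute dict, while dataclasses.asdict() is documented to build a new dict by recursing into nested dataclasses, lists, tuples and dicts, and to copy other values with copy.deepcopy(). A minimal sketch of that difference, where Inner and Outer are hypothetical classes used only for illustration and not part of my actual code:

```python
import dataclasses
from dataclasses import dataclass, field
from typing import List


@dataclass
class Inner:
    x: int


@dataclass
class Outer:
    inner: Inner
    tags: List[str] = field(default_factory=list)


obj = Outer(Inner(1), ["a"])

shallow = obj.__dict__          # the instance's own attribute dict, no copy
deep = dataclasses.asdict(obj)  # new dict built by walking the fields

# __dict__ is the live mapping: nested dataclasses stay as objects.
assert shallow is obj.__dict__
assert isinstance(shallow["inner"], Inner)

# asdict() converts nested dataclasses to dicts and rebuilds containers.
assert deep == {"inner": {"x": 1}, "tags": ["a"]}
assert deep["tags"] is not obj.tags

# Mutating the asdict() result leaves the original instance untouched.
deep["tags"].append("b")
assert obj.tags == ["a"]
```

For a flat dataclass like MyDataClass the resulting dicts compare equal, so this sketch only shows that asdict() does extra per-field work; it does not measure how much of the timing gap that work accounts for.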