Hacker News

It's been some time since I last benchmarked defaultdict, but the last time I did (circa 3.6 or earlier?), it was considerably slower than judicious use of setdefault.

quietbritishjim 3 days ago

One case where defaultdict may come out ahead is when the default value is expensive to construct and rarely needed. With setdefault, the default is computed on every call, even when the key is already present:

    d.setdefault(k, computevalue())

defaultdict takes a factory function, so it's only called if the key is not already present:

    d = defaultdict(computevalue)

This applies to some extent even if the default value is just an empty dictionary (as it often is in my experience). You can pass dict as the factory function in that case.
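A minimal sketch of that difference, assuming a stand-in computevalue that just counts its own calls (the counter and the key "k" are made up for illustration):

```python
from collections import defaultdict

calls = 0

def computevalue():
    """Hypothetical expensive default; counts how often it runs."""
    global calls
    calls += 1
    return []

# setdefault: the argument is evaluated before the call, hit or miss.
d = {}
for _ in range(5):
    d.setdefault("k", computevalue())
setdefault_calls = calls

calls = 0
# defaultdict: the factory runs only when the key is missing.
dd = defaultdict(computevalue)
for _ in range(5):
    dd["k"]
defaultdict_calls = calls

print(setdefault_calls, defaultdict_calls)  # 5 1
```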

But I have never benchmarked!

masklinn 3 days ago

> if the default value is expensive to construct and rarely needed:

I'd say "or" rather than "and": defaultdict has a higher overhead when it has to initialise the default (especially if the setdefault call doesn't need a function call to build its default), but because the factory is only a fallback from the normal dict lookup, a hit is essentially free. As a result, defaultdict edges out setdefault with either very high redundancy and a cheap default, or low redundancy and a costly default.
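To illustrate the fallback, here is a rough sketch of the same mechanism using the `__missing__` hook that dict subclasses can define. FallbackDict is a hypothetical stand-in written for this example, not how defaultdict is actually implemented in C:

```python
from collections import defaultdict

class FallbackDict(dict):
    """Hypothetical stand-in that mimics defaultdict(list) via __missing__."""

    def __missing__(self, key):
        # dict.__getitem__ calls this only when the key is absent.
        value = self[key] = []
        return value

fd = FallbackDict()
fd["a"].append(1)  # miss: __missing__ creates and stores the list
fd["a"].append(2)  # hit: plain dict lookup, __missing__ is not involved

dd = defaultdict(list)
dd["a"].append(1)
dd["a"].append(2)
```

Both end up holding the same single list under "a"; the fallback only ran on the first access.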

For the most extreme case of the former,

    d = {}
    for i in range(N):
        d.setdefault(0, [])

versus

    d = defaultdict(list)
    for i in range(N):
        d[0]

has defaultdict edge ahead at N=11 on my machine (561ns for setdefault versus 545ns for defaultdict). And that's despite the [] literal in the setdefault version being quite a bit cheaper than the list() call that defaultdict makes.
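For anyone who wants to try this locally, a sketch of one way to time the two loops with timeit. The function names, repeat counts and the wrapping in functions are my own assumptions, so the absolute numbers include call overhead and will differ from the figures quoted above:

```python
import timeit
from collections import defaultdict

N = 11  # loop count from the comment above

def with_setdefault(n=N):
    d = {}
    for i in range(n):
        d.setdefault(0, [])
    return d

def with_defaultdict(n=N):
    d = defaultdict(list)
    for i in range(n):
        d[0]
    return d

# Best of several repeats, reported as time per call.
runs = 20_000
t_set = min(timeit.repeat(with_setdefault, number=runs, repeat=5)) / runs
t_def = min(timeit.repeat(with_defaultdict, number=runs, repeat=5)) / runs
print(f"setdefault:  {t_set * 1e9:.0f} ns")
print(f"defaultdict: {t_def * 1e9:.0f} ns")
```

Varying N should show where the crossover sits on a given machine and Python version.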
