json.dump(x,f) is much slower than f.write(json.dumps(x)) #129711

Description

@wjmelements

Bug report

Bug description:

Experimentally, I measured a huge performance improvement when I switched my code from

json.dump(x, f, **kwargs)

to

f.write(json.dumps(x, **kwargs))

Method

I wrote essentially the same contents to a series of files sequentially and measured the total time taken. The JSON contents had 1, 300, and 400 entries per level, and 1, 5, and 6 levels of depth. There is quite a bit of variance here, but this wasn't what I set out to measure in the first place; I discovered the difference by chance, so forgive the lack of precision. I also no longer have the source code, because I wasn't originally planning to report this.
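For illustration, something like the sketch below reproduces the kind of comparison described above. This is a reconstruction, not the original script; the payload shape, sizes, and helper names are made up for the example.

```python
# Hypothetical reconstruction of the benchmark; payload shape, sizes,
# and function names are assumptions, not the original script.
import json
import os
import tempfile
import time


def make_payload(width, depth):
    # Build a nested dict with `width` entries per level and `depth` levels.
    if depth == 0:
        return "x"
    return {str(i): make_payload(width, depth - 1) for i in range(width)}


def time_writes(payload, n_files, use_dump):
    # Write the same payload to n_files files and return the total time in µs.
    start = time.perf_counter()
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(n_files):
            with open(os.path.join(tmp, f"{i}.json"), "w") as f:
                if use_dump:
                    json.dump(payload, f)
                else:
                    f.write(json.dumps(payload))
    return (time.perf_counter() - start) * 1e6


payload = make_payload(20, 3)  # adjust width/depth to vary the file size
for n in (1, 2, 4, 8):
    t_dump = time_writes(payload, n, use_dump=True)
    t_dumps = time_writes(payload, n, use_dump=False)
    print(f"{n} files: dump={t_dump:.0f} µs  dumps={t_dumps:.0f} µs")
```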

Results

| File Size | Consecutive Files | dump (µs) | dumps (µs) |
| --- | --- | --- | --- |
| 74 | 1 | 508 | 581 |
| 74 | 2 | 520 | 541 |
| 74 | 4 | 1153 | 1151 |
| 74 | 8 | 1930 | 1750 |
| 39184 | 1 | 6363 | 1086 |
| 39184 | 2 | 11261 | 1821 |
| 39184 | 4 | 38126 | 3521 |
| 39184 | 8 | 80411 | 6466 |
| 468218 | 1 | 82821 | 11921 |
| 468218 | 2 | 150234 | 38017 |
| 468218 | 4 | 302357 | 42137 |
| 468218 | 8 | 573450 | 78545 |

Conclusion

A cursory look at the CPython source suggests that the slow part is json.dump() sequentially writing the chunks yielded by iterencode(). The chunks are quite small, so a large document results in a very large number of write calls.
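For context, here is a simplified sketch of the relevant difference (not the exact stdlib source): json.dump() streams the encoder output and calls fp.write() once per yielded chunk, whereas the dumps-then-write pattern issues a single write of the fully encoded string.

```python
import json


def dump_sketch(obj, fp, **kwargs):
    # Roughly what json.dump() does: stream the encoder output and call
    # fp.write() once per chunk. The chunks can be very small (a key, a
    # value, a separator), so large documents mean many write() calls.
    for chunk in json.JSONEncoder(**kwargs).iterencode(obj):
        fp.write(chunk)


def write_dumps(obj, fp, **kwargs):
    # The faster pattern from this report: encode everything into one
    # string, then issue a single write() call.
    fp.write(json.dumps(obj, **kwargs))
```

Batching chunks before calling fp.write() would reduce the number of write calls, which is essentially what the dumps-then-write workaround achieves by encoding the whole object first.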

CPython versions tested on:

3.10

Operating systems tested on:

macOS

Labels

extension-modules, performance, stdlib, type-bug