More efficient JSON decoders and encoders for records #762

Merged
jdegoes merged 5 commits into zio:main from plokhotnyuk:more-efficient-json-codecs-for-records
Jan 2, 2025

Contributor

@plokhotnyuk plokhotnyuk commented Dec 31, 2024

BEWARE: Case classes with more than 22 fields are parsed and serialized through Schema.GenericRecord, which under the hood converts from/to ListMap (the most inefficient Scala collection, with O(n) complexity for lookups and inserts).
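
To make the cost mentioned above concrete, here is a minimal sketch that compares a by-name lookup in a ListMap with a lookup through a precomputed index into an array. It is an illustration only, not zio-schema code and not necessarily how this PR works; `ListMapSketch`, `fieldNames`, `asListMap`, `indexOf`, and `values` are hypothetical names.

```scala
import scala.collection.immutable.ListMap

object ListMapSketch {
  // Hypothetical record with 30 fields, i.e. above the 22-field limit mentioned above.
  val fieldNames: Vector[String] = Vector.tabulate(30)(i => s"f$i")

  // ListMap preserves insertion order, but each `get` walks the underlying list: O(n).
  val asListMap: ListMap[String, Any] =
    ListMap.from[String, Any](fieldNames.map(n => n -> n.length))

  // Alternative for illustration: resolve name -> index once, then read an array slot.
  val indexOf: Map[String, Int] = fieldNames.zipWithIndex.toMap
  val values: Array[Any] = Array.tabulate[Any](fieldNames.length)(i => fieldNames(i).length)

  def viaListMap(name: String): Option[Any] = asListMap.get(name)

  def viaArray(name: String): Option[Any] = indexOf.get(name).map(i => values(i))
}
```

Both lookups return the same value; the difference is only in how many elements are visited per call, which matters when every field of every decoded record goes through it.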

Below are benchmark results for real-world message samples using JDK 21 on an Intel® Core™ i7-11800H CPU @ 2.3GHz (max 4.6GHz):

Scala 2.13

Before:

Benchmark                                   Mode  Cnt        Score        Error  Units
GeoJSONWriting.zioJson                     thrpt    5    11509.636 ±    426.210  ops/s
GeoJSONWriting.zioSchemaJson               thrpt    5    16104.759 ±    209.080  ops/s
GitHubActionsAPIReading.zioJson            thrpt    5   238538.723 ±   8177.055  ops/s
GitHubActionsAPIReading.zioSchemaJson      thrpt    5   222514.430 ±   5547.097  ops/s
GitHubActionsAPIWriting.zioJson            thrpt    5   342839.086 ±   7157.601  ops/s
GitHubActionsAPIWriting.zioSchemaJson      thrpt    5   323509.502 ±   4936.931  ops/s
GoogleMapsAPIPrettyPrinting.zioJson        thrpt    5    17059.393 ±    581.105  ops/s
GoogleMapsAPIPrettyPrinting.zioSchemaJson  thrpt    5    16573.563 ±    337.655  ops/s
GoogleMapsAPIReading.zioJson               thrpt    5    13982.431 ±    341.561  ops/s
GoogleMapsAPIReading.zioSchemaJson         thrpt    5    13842.595 ±    578.194  ops/s
GoogleMapsAPIWriting.zioJson               thrpt    5    24972.722 ±    468.115  ops/s
GoogleMapsAPIWriting.zioSchemaJson         thrpt    5    25915.911 ±    295.172  ops/s
OpenRTBReading.zioJson                     thrpt    5   114007.985 ±   1431.479  ops/s
OpenRTBReading.zioSchemaJson               thrpt    5    79211.901 ±    822.182  ops/s
TwitterAPIReading.zioJson                  thrpt    5    10678.123 ±    148.329  ops/s
TwitterAPIReading.zioSchemaJson            thrpt    5     7268.961 ±    216.017  ops/s
TwitterAPIWriting.zioSchemaJson            thrpt    5    14045.225 ±    230.728  ops/s

After:

Benchmark                                   Mode  Cnt        Score        Error  Units
GeoJSONWriting.zioJson                     thrpt    5    11555.041 ±    508.988  ops/s
GeoJSONWriting.zioSchemaJson               thrpt    5    15670.732 ±    280.056  ops/s
GitHubActionsAPIReading.zioJson            thrpt    5   226589.943 ±   6620.085  ops/s
GitHubActionsAPIReading.zioSchemaJson      thrpt    5   225792.266 ±   2472.357  ops/s
GitHubActionsAPIWriting.zioJson            thrpt    5   342785.015 ±   8113.470  ops/s
GitHubActionsAPIWriting.zioSchemaJson      thrpt    5   322699.803 ±  11433.594  ops/s
GoogleMapsAPIPrettyPrinting.zioJson        thrpt    5    17699.092 ±    257.323  ops/s
GoogleMapsAPIPrettyPrinting.zioSchemaJson  thrpt    5    18509.454 ±    246.413  ops/s
GoogleMapsAPIReading.zioJson               thrpt    5    13736.513 ±    512.711  ops/s
GoogleMapsAPIReading.zioSchemaJson         thrpt    5    14022.238 ±    361.512  ops/s
GoogleMapsAPIWriting.zioJson               thrpt    5    25187.089 ±    466.850  ops/s
GoogleMapsAPIWriting.zioSchemaJson         thrpt    5    26927.483 ±    303.798  ops/s
OpenRTBReading.zioJson                     thrpt    5   114339.425 ±   1126.331  ops/s
OpenRTBReading.zioSchemaJson               thrpt    5    94564.696 ±   2510.726  ops/s
TwitterAPIReading.zioJson                  thrpt    5     9604.793 ±    247.095  ops/s
TwitterAPIReading.zioSchemaJson            thrpt    5    13850.582 ±    174.689  ops/s
TwitterAPIWriting.zioSchemaJson            thrpt    5    16088.669 ±    927.031  ops/s

Scala 3.6.2

Before:

Benchmark                                   Mode  Cnt        Score       Error  Units
GeoJSONWriting.zioJson                     thrpt    5    12221.286 ±   522.041  ops/s
GeoJSONWriting.zioSchemaJson               thrpt    5    15297.340 ±   641.081  ops/s
GitHubActionsAPIReading.zioJson            thrpt    5   271564.817 ± 18327.367  ops/s
GitHubActionsAPIReading.zioSchemaJson      thrpt    5   268534.801 ±  5660.602  ops/s
GitHubActionsAPIWriting.zioJson            thrpt    5   368410.078 ±  4260.637  ops/s
GitHubActionsAPIWriting.zioSchemaJson      thrpt    5   338656.400 ±  1350.642  ops/s
GoogleMapsAPIPrettyPrinting.zioJson        thrpt    5    18925.725 ±   568.680  ops/s
GoogleMapsAPIPrettyPrinting.zioSchemaJson  thrpt    5    18316.578 ±   686.262  ops/s
GoogleMapsAPIReading.zioJson               thrpt    5    17410.773 ±   499.557  ops/s
GoogleMapsAPIReading.zioSchemaJson         thrpt    5    20159.010 ±   452.471  ops/s
GoogleMapsAPIWriting.zioJson               thrpt    5    27655.787 ±  1071.832  ops/s
GoogleMapsAPIWriting.zioSchemaJson         thrpt    5    24778.901 ±   888.373  ops/s
OpenRTBReading.zioJson                     thrpt    5   140866.407 ±  1125.914  ops/s
OpenRTBReading.zioSchemaJson               thrpt    5    73574.917 ±   943.538  ops/s
TwitterAPIReading.zioJson                  thrpt    5    15041.247 ±   284.596  ops/s
TwitterAPIReading.zioSchemaJson            thrpt    5     7441.366 ±    91.045  ops/s
TwitterAPIWriting.zioSchemaJson            thrpt    5    12892.386 ±   830.567  ops/s

After:

Benchmark                                   Mode  Cnt        Score        Error  Units
GeoJSONWriting.zioJson                     thrpt    5    11031.272 ±     60.439  ops/s
GeoJSONWriting.zioSchemaJson               thrpt    5    14985.481 ±    186.642  ops/s
GitHubActionsAPIReading.zioJson            thrpt    5   272805.697 ±   2739.527  ops/s
GitHubActionsAPIReading.zioSchemaJson      thrpt    5   274473.034 ±   6134.873  ops/s
GitHubActionsAPIWriting.zioJson            thrpt    5   370507.931 ±   5288.682  ops/s
GitHubActionsAPIWriting.zioSchemaJson      thrpt    5   337849.886 ±   9830.916  ops/s
GoogleMapsAPIPrettyPrinting.zioJson        thrpt    5    19294.778 ±    651.552  ops/s
GoogleMapsAPIPrettyPrinting.zioSchemaJson  thrpt    5    17828.131 ±    694.554  ops/s
GoogleMapsAPIReading.zioJson               thrpt    5    16806.799 ±    562.611  ops/s
GoogleMapsAPIReading.zioSchemaJson         thrpt    5    19931.393 ±   1104.529  ops/s
GoogleMapsAPIWriting.zioJson               thrpt    5    27761.813 ±   1087.806  ops/s
GoogleMapsAPIWriting.zioSchemaJson         thrpt    5    26460.697 ±    694.238  ops/s
OpenRTBReading.zioJson                     thrpt    5   143912.206 ±   2833.240  ops/s
OpenRTBReading.zioSchemaJson               thrpt    5    91562.192 ±    829.948  ops/s
TwitterAPIReading.zioJson                  thrpt    5    14458.395 ±    116.581  ops/s
TwitterAPIReading.zioSchemaJson            thrpt    5    10843.616 ±     66.909  ops/s
TwitterAPIWriting.zioSchemaJson            thrpt    5    14920.064 ±    702.144  ops/s

@plokhotnyuk plokhotnyuk requested a review from a team as a code owner December 31, 2024 11:51
@plokhotnyuk plokhotnyuk force-pushed the more-efficient-json-codecs-for-records branch 3 times, most recently from 38ad7c5 to 2ccd5a9 Compare January 2, 2025 09:00

  case (key, value) =>
-   if (first)
-     first = false
+   if (first) first = false

Contributor

Please avoid unrelated format changes

Contributor Author

It is my attempt to make this piece of code look similar to the other copies of it in JsonCodec.scala

  else {
    out.write(',')
-   if (indent.isDefined) pad(indent_, out)
+   if (doPrettyPrint) pad(indent_, out)

Contributor

I think the common case is not pretty printing. Should that not be the first branch?

Contributor Author

Branch prediction for the common (non-pretty-printing) case is a smaller issue compared to the overhead of the .isDefined and .isEmpty virtual calls and the subsequent equality comparison with None.
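
The idea in the comment above can be sketched as follows: evaluate the Option once before the loop and branch on a plain Boolean inside it. This is a simplified illustration under stated assumptions, not the code from JsonCodec.scala; `PrettyPrintSketch`, `writeAll`, and this version of `pad` are hypothetical, while `first`, `indent`, `indent_`, and `doPrettyPrint` mirror the names in the diff.

```scala
object PrettyPrintSketch {
  // Hypothetical stand-in for an encoder loop that joins already-encoded items.
  def writeAll(items: List[String], indent: Option[Int]): String = {
    val out = new java.lang.StringBuilder
    // Inspect the Option once, outside the loop, instead of once per element.
    val doPrettyPrint = indent.isDefined
    val indent_ = if (doPrettyPrint) indent.get else 0
    var first = true
    items.foreach { item =>
      if (first) first = false
      else {
        out.append(',')
        // Inside the loop only a local Boolean is tested.
        if (doPrettyPrint) pad(indent_, out)
      }
      out.append(item)
    }
    out.toString
  }

  // Hypothetical padding helper: a newline followed by two spaces per indent level.
  private def pad(level: Int, out: java.lang.StringBuilder): Unit = {
    out.append('\n')
    var i = 0
    while (i < level) {
      out.append("  ")
      i += 1
    }
  }
}
```

For example, `PrettyPrintSketch.writeAll(List("1", "2", "3"), None)` yields `1,2,3`, and passing `Some(1)` puts each subsequent item on its own indented line.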

  } else {
    val schema = field.schema match {
-     case l @ Schema.Lazy(_) => l.schema
+     case l: Schema.Lazy[_] => l.schema

Contributor

Does this make a difference?

Contributor Author

@plokhotnyuk plokhotnyuk Jan 2, 2025

Yes, it does. It avoids the .unapply call in Scala 3, which allocates a redundant instance of Option[Tuple1].

Contributor Author

In the last commit, besides formatting, I removed the 2 redundant variables that the Scala 3 compiler generated at that place. Now the bytecode decompiled to Java looks cleaner for that piece of code:

                        Schema schema = field.schema();
                        if (schema instanceof Schema.Lazy) {
                            Schema.Lazy l = (Schema.Lazy)schema;
                            schema = l.schema();
                        }
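
For readers who want to compare the two pattern forms from this thread side by side, here is a small self-contained sketch. `Node`, `Leaf`, and `Lazy` are hypothetical, non-generic stand-ins for Schema and Schema.Lazy, not the zio-schema definitions; both functions return the same result, and the difference discussed above is only in what the compiler emits for the match.

```scala
object LazyPatternSketch {
  // Hypothetical, simplified stand-ins; the real Schema.Lazy is generic.
  sealed trait Node
  final case class Leaf(name: String) extends Node
  final case class Lazy(thunk: () => Node) extends Node {
    lazy val node: Node = thunk()
  }

  // Extractor pattern: the match is expressed through the case class extractor
  // (per the discussion above, Scala 3 may emit an `unapply` call here).
  def forceWithExtractor(n: Node): Node = n match {
    case l @ Lazy(_) => l.node
    case other       => other
  }

  // Type pattern: only a runtime type test and a cast are needed,
  // which corresponds to the `instanceof` check in the decompiled Java above.
  def forceWithTypePattern(n: Node): Node = n match {
    case l: Lazy => l.node
    case other   => other
  }
}
```

Because the behavior is identical, switching to the type pattern is a safe change whenever the bound fields of the extractor pattern are not used.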

@jdegoes jdegoes merged commit bb5dbed into zio:main Jan 2, 2025
25 checks passed
@plokhotnyuk plokhotnyuk deleted the more-efficient-json-codecs-for-records branch January 3, 2025 07:16