@tigrannajaryan
Collaborator

Instead of storing stefz files in git, we now generate them before running tests. The files are fairly large, and git stores a large delta every time the format changes.

The files are now deleted from git. The otlp2stef tool is used to generate stefz files from OTLP .pb files, which don't change.
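
As a rough illustration of the generation step, here is a minimal sketch. It is not the contents of gentestfiles.sh. The function name `gen_test_files`, the `CONVERT` variable, and its `<input> <output>` calling convention are all hypothetical; the real otlp2stef arguments are not shown in this PR, so none are assumed. The `.pb` source extension follows the description above.

```shell
#!/bin/sh
# Sketch only: regenerate derived test files from checked-in sources.
# CONVERT is a placeholder for the real converter invocation; it is an
# assumption, not the actual otlp2stef command line.

gen_test_files() {
    src_dir=$1   # directory holding the checked-in source files
    out_dir=$2   # git-ignored directory that receives generated files

    # Refuse to run without a converter command.
    : "${CONVERT:?set CONVERT to the converter command}"

    mkdir -p "$out_dir"
    for src in "$src_dir"/*.pb; do
        [ -e "$src" ] || continue          # no sources: nothing to do
        name=$(basename "$src" .pb)
        # Hypothetical calling convention: CONVERT <input> <output>
        $CONVERT "$src" "$out_dir/$name.stefz" || return 1
    done
}
```

Because the output directory is ignored by git, rerunning the function after a format change rewrites the generated files without producing any diff in the repository.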

We also use a script to check format compatibility. The script checks out an old version of otlp2stef, generates files with it, checks out the current version of the code, and runs the new test code on the old files.
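
The four steps of the compatibility check described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of genoldtestfiles.sh: the `run` and `compat_check` functions, the `GEN_CMD` and `TEST_CMD` variables, and the `DRY_RUN` switch are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch only: check that current code can read files written by an
# older version. GEN_CMD and TEST_CMD are placeholders for the real
# generation and test commands, which this sketch does not assume.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

compat_check() {
    old_ref=$1   # commit or tag that contains the old converter
    new_ref=$2   # commit or branch with the code under test

    run git checkout "$old_ref" &&   # 1. switch to the old version
    run $GEN_CMD &&                  # 2. generate files in the old format
    run git checkout "$new_ref" &&   # 3. switch back to the current code
    run $TEST_CMD                    # 4. run the new tests on the old files
}
```

The sketch relies on the generated files living in a git-ignored directory, so they survive the second checkout. Calling `compat_check` with `DRY_RUN=1` prints the steps without touching the working tree.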


Copilot AI left a comment

Pull Request Overview

This PR refactors the test file management approach by generating .stefz test files on-demand instead of storing them in git, addressing issues with large binary files and frequent format changes.

  • Removes .stefz files from git tracking and generates them from .zst source files using the otlp2stef tool
  • Adds compatibility testing scripts to validate format compatibility between versions
  • Updates file paths across the codebase to reference the new generated test files location

Reviewed Changes

Copilot reviewed 9 out of 12 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| makefile | Updates build-ci target to use new benchmarks build-ci command |
| java/src/test/java/tests/ReadWriteTest.java | Updates test file path to reference generated directory |
| java/src/main/java/net/stef/benchmarks/STEF.java | Updates benchmark file path to reference generated directory |
| benchmarks/testdata/.gitignore | Ignores the generated test files directory |
| benchmarks/scripts/gentestfiles.sh | New script to generate .stefz files from .zst sources |
| benchmarks/scripts/genoldtestfiles.sh | New script for compatibility testing with old format versions |
| benchmarks/readwrite_test.go | Updates test file paths to reference generated directory |
| benchmarks/makefile | Adds test file generation targets and compatibility testing |
| .github/workflows/build-and-test.yml | Adds test file generation steps to CI workflow |

@github-actions

Benchmark Result

Benchmark diff with base branch
goos: linux
goarch: amd64
pkg: github.com/splunk/stef/benchmarks
cpu: AMD EPYC 7763 64-Core Processor                
                                                 │ bench-main.txt │           bench-new.txt            │
                                                 │     sec/op     │    sec/op     vs base              │
SerializeNative/STEF/none-4                          14.26m ±  8%   14.16m ±  4%       ~ (p=0.485 n=6)
SerializeNative/STEFU/none-4                         35.99m ±  2%   35.71m ±  1%       ~ (p=0.240 n=6)
DeserializeNative/STEF/none-4                        2.763m ±  1%   2.775m ±  2%       ~ (p=0.589 n=6)
DeserializeNative/STEFU/none-4                       8.651m ±  1%   8.626m ±  1%       ~ (p=0.589 n=6)
SerializeFromPdata/STEF/none-4                       230.6m ±  6%   243.5m ±  4%  +5.57% (p=0.015 n=6)
SerializeFromPdata/STEFU/none-4                      35.95m ±  1%   36.29m ±  3%       ~ (p=0.394 n=6)
DeserializeToPdata/STEF/none-4                       42.93m ±  3%   43.59m ±  3%       ~ (p=0.240 n=6)
DeserializeToPdata/STEFU/none-4                      63.69m ±  3%   63.70m ±  2%       ~ (p=0.937 n=6)
STEFReaderRead-4                                     2.751m ±  1%   2.746m ±  1%       ~ (p=0.394 n=6)
STEFSerializeMultipart/astronomy-otelmetrics-4        4.113 ±  8%    4.108 ±  7%       ~ (p=0.485 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4     95.52m ± 21%   77.25m ± 22%       ~ (p=0.132 n=6)
ReadSTEF-4                                           3.010m ±  1%   2.813m ±  1%  -6.52% (p=0.002 n=6)
ReadSTEFZ-4                                          4.580m ±  1%   4.712m ±  2%  +2.88% (p=0.002 n=6)
ReadSTEFZWriteSTEF-4                                 9.780m ±  4%   8.860m ±  1%  -9.41% (p=0.002 n=6)
geomean                                              25.15m         24.64m        -2.04%

                                                 │ bench-main.txt │           bench-new.txt            │
                                                 │   sec/point    │  sec/point    vs base              │
SerializeNative/STEF/none-4                          213.3n ±  8%   211.7n ±  4%       ~ (p=0.455 n=6)
SerializeNative/STEFU/none-4                         538.3n ±  2%   534.0n ±  1%       ~ (p=0.197 n=6)
DeserializeNative/STEF/none-4                        41.32n ±  1%   41.50n ±  2%       ~ (p=0.589 n=6)
DeserializeNative/STEFU/none-4                       129.3n ±  1%   129.0n ±  1%       ~ (p=0.459 n=6)
SerializeFromPdata/STEF/none-4                       3.450µ ±  6%   3.641µ ±  4%  +5.54% (p=0.015 n=6)
SerializeFromPdata/STEFU/none-4                      537.6n ±  1%   542.8n ±  3%       ~ (p=0.422 n=6)
DeserializeToPdata/STEF/none-4                       642.1n ±  3%   652.0n ±  3%       ~ (p=0.240 n=6)
DeserializeToPdata/STEFU/none-4                      952.6n ±  3%   952.8n ±  2%       ~ (p=0.937 n=6)
STEFReaderRead-4                                     41.15n ±  1%   41.07n ±  1%       ~ (p=0.416 n=6)
STEFSerializeMultipart/astronomy-otelmetrics-4       5.227µ ±  8%   5.221µ ±  7%       ~ (p=0.485 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4    121.40n ± 21%   98.19n ± 22%       ~ (p=0.121 n=6)
ReadSTEF-4                                           45.04n ±  1%   42.11n ±  1%  -6.52% (p=0.002 n=6)
ReadSTEFZ-4                                          68.54n ±  1%   70.52n ±  2%  +2.87% (p=0.002 n=6)
ReadSTEFZWriteSTEF-4                                 146.4n ±  4%   132.6n ±  1%  -9.43% (p=0.002 n=6)
geomean                                              264.5n         259.1n        -2.04%

                                                 │ bench-main.txt │            bench-new.txt             │
                                                 │      B/op      │     B/op      vs base                │
SerializeNative/STEF/none-4                          3.558Mi ± 0%   3.560Mi ± 0%       ~ (p=0.394 n=6)
SerializeNative/STEFU/none-4                         7.128Mi ± 0%   7.128Mi ± 0%       ~ (p=0.589 n=6)
DeserializeNative/STEF/none-4                        911.5Ki ± 0%   911.5Ki ± 0%       ~ (p=1.000 n=6) ¹
DeserializeNative/STEFU/none-4                       1.564Mi ± 0%   1.564Mi ± 0%       ~ (p=0.455 n=6)
SerializeFromPdata/STEF/none-4                       166.0Mi ± 0%   166.0Mi ± 0%       ~ (p=0.818 n=6)
SerializeFromPdata/STEFU/none-4                      7.128Mi ± 0%   7.129Mi ± 0%       ~ (p=0.310 n=6)
DeserializeToPdata/STEF/none-4                       29.88Mi ± 0%   29.88Mi ± 0%       ~ (p=0.485 n=6)
DeserializeToPdata/STEFU/none-4                      36.63Mi ± 0%   36.63Mi ± 0%       ~ (p=1.000 n=6)
STEFReaderRead-4                                     911.5Ki ± 0%   911.5Ki ± 0%       ~ (p=1.000 n=6) ¹
STEFSerializeMultipart/astronomy-otelmetrics-4       3.961Gi ± 0%   3.961Gi ± 0%       ~ (p=0.937 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4     20.73Mi ± 0%   20.73Mi ± 0%       ~ (p=0.284 n=6)
ReadSTEF-4                                           911.0Ki ± 0%   911.5Ki ± 0%  +0.05% (p=0.002 n=6)
ReadSTEFZ-4                                          10.19Mi ± 0%   10.19Mi ± 0%  +0.00% (p=0.002 n=6)
ReadSTEFZWriteSTEF-4                                 13.61Mi ± 0%   13.61Mi ± 0%  +0.00% (p=0.002 n=6)
geomean                                              11.08Mi        11.08Mi       +0.01%
¹ all samples are equal

                                                 │ bench-main.txt │            bench-new.txt            │
                                                 │   allocs/op    │  allocs/op   vs base                │
SerializeNative/STEF/none-4                           2.894k ± 0%   2.898k ± 1%       ~ (p=0.485 n=6)
SerializeNative/STEFU/none-4                          1.080k ± 0%   1.079k ± 0%       ~ (p=0.719 n=6)
DeserializeNative/STEF/none-4                          669.0 ± 0%    669.0 ± 0%       ~ (p=1.000 n=6) ¹
DeserializeNative/STEFU/none-4                         729.0 ± 0%    729.0 ± 0%       ~ (p=1.000 n=6) ¹
SerializeFromPdata/STEF/none-4                        256.4k ± 0%   256.4k ± 0%       ~ (p=0.846 n=6)
SerializeFromPdata/STEFU/none-4                       1.081k ± 0%   1.081k ± 0%       ~ (p=0.273 n=6)
DeserializeToPdata/STEF/none-4                        622.8k ± 0%   622.8k ± 0%       ~ (p=1.000 n=6) ¹
DeserializeToPdata/STEFU/none-4                       811.5k ± 0%   811.5k ± 0%       ~ (p=1.000 n=6) ¹
STEFReaderRead-4                                       669.0 ± 0%    669.0 ± 0%       ~ (p=1.000 n=6) ¹
STEFSerializeMultipart/astronomy-otelmetrics-4        14.44M ± 0%   14.44M ± 0%       ~ (p=0.937 n=6)
STEFDeserializeMultipart/astronomy-otelmetrics-4      3.921k ± 0%   3.920k ± 0%       ~ (p=0.545 n=6)
ReadSTEF-4                                             669.0 ± 0%    669.0 ± 0%       ~ (p=1.000 n=6) ¹
ReadSTEFZ-4                                            702.0 ± 0%    702.0 ± 0%       ~ (p=1.000 n=6) ¹
ReadSTEFZWriteSTEF-4                                  1.681k ± 0%   1.682k ± 0%       ~ (p=0.061 n=6)
geomean                                               8.216k        8.216k       +0.01%
¹ all samples are equal
Benchmark result
benchstat bench-new.txt
goos: linux
goarch: amd64
pkg: github.com/splunk/stef/benchmarks
cpu: AMD EPYC 7763 64-Core Processor                
                                                 │ bench-new.txt │
                                                 │    sec/op     │
SerializeNative/STEF/none-4                         14.16m ±  4%
SerializeNative/STEFU/none-4                        35.71m ±  1%
DeserializeNative/STEF/none-4                       2.775m ±  2%
DeserializeNative/STEFU/none-4                      8.626m ±  1%
SerializeFromPdata/STEF/none-4                      243.5m ±  4%
SerializeFromPdata/STEFU/none-4                     36.29m ±  3%
DeserializeToPdata/STEF/none-4                      43.59m ±  3%
DeserializeToPdata/STEFU/none-4                     63.70m ±  2%
STEFReaderRead-4                                    2.746m ±  1%
STEFSerializeMultipart/astronomy-otelmetrics-4       4.108 ±  7%
STEFDeserializeMultipart/astronomy-otelmetrics-4    77.25m ± 22%
ReadSTEF-4                                          2.813m ±  1%
ReadSTEFZ-4                                         4.712m ±  2%
ReadSTEFZWriteSTEF-4                                8.860m ±  1%
geomean                                             24.64m

                                                 │ bench-new.txt │
                                                 │   sec/point   │
SerializeNative/STEF/none-4                         211.7n ±  4%
SerializeNative/STEFU/none-4                        534.0n ±  1%
DeserializeNative/STEF/none-4                       41.50n ±  2%
DeserializeNative/STEFU/none-4                      129.0n ±  1%
SerializeFromPdata/STEF/none-4                      3.641µ ±  4%
SerializeFromPdata/STEFU/none-4                     542.8n ±  3%
DeserializeToPdata/STEF/none-4                      652.0n ±  3%
DeserializeToPdata/STEFU/none-4                     952.8n ±  2%
STEFReaderRead-4                                    41.07n ±  1%
STEFSerializeMultipart/astronomy-otelmetrics-4      5.221µ ±  7%
STEFDeserializeMultipart/astronomy-otelmetrics-4    98.19n ± 22%
ReadSTEF-4                                          42.11n ±  1%
ReadSTEFZ-4                                         70.52n ±  2%
ReadSTEFZWriteSTEF-4                                132.6n ±  1%
geomean                                             259.1n

                                                 │ bench-new.txt │
                                                 │     B/op      │
SerializeNative/STEF/none-4                         3.560Mi ± 0%
SerializeNative/STEFU/none-4                        7.128Mi ± 0%
DeserializeNative/STEF/none-4                       911.5Ki ± 0%
DeserializeNative/STEFU/none-4                      1.564Mi ± 0%
SerializeFromPdata/STEF/none-4                      166.0Mi ± 0%
SerializeFromPdata/STEFU/none-4                     7.129Mi ± 0%
DeserializeToPdata/STEF/none-4                      29.88Mi ± 0%
DeserializeToPdata/STEFU/none-4                     36.63Mi ± 0%
STEFReaderRead-4                                    911.5Ki ± 0%
STEFSerializeMultipart/astronomy-otelmetrics-4      3.961Gi ± 0%
STEFDeserializeMultipart/astronomy-otelmetrics-4    20.73Mi ± 0%
ReadSTEF-4                                          911.5Ki ± 0%
ReadSTEFZ-4                                         10.19Mi ± 0%
ReadSTEFZWriteSTEF-4                                13.61Mi ± 0%
geomean                                             11.08Mi

                                                 │ bench-new.txt │
                                                 │   allocs/op   │
SerializeNative/STEF/none-4                          2.898k ± 1%
SerializeNative/STEFU/none-4                         1.079k ± 0%
DeserializeNative/STEF/none-4                         669.0 ± 0%
DeserializeNative/STEFU/none-4                        729.0 ± 0%
SerializeFromPdata/STEF/none-4                       256.4k ± 0%
SerializeFromPdata/STEFU/none-4                      1.081k ± 0%
DeserializeToPdata/STEF/none-4                       622.8k ± 0%
DeserializeToPdata/STEFU/none-4                      811.5k ± 0%
STEFReaderRead-4                                      669.0 ± 0%
STEFSerializeMultipart/astronomy-otelmetrics-4       14.44M ± 0%
STEFDeserializeMultipart/astronomy-otelmetrics-4     3.920k ± 0%
ReadSTEF-4                                            669.0 ± 0%
ReadSTEFZ-4                                           702.0 ± 0%
ReadSTEFZWriteSTEF-4                                 1.682k ± 0%
geomean                                              8.216k

@tigrannajaryan tigrannajaryan marked this pull request as ready for review September 12, 2025 20:14
@tigrannajaryan tigrannajaryan merged commit 830d672 into main Sep 16, 2025
10 checks passed
@tigrannajaryan tigrannajaryan deleted the tigran/genstefzfiles branch September 16, 2025 14:35