docs/kb/compression/fsicasestudy.md
In this document, we compare compression algorithms using a popular financial dataset from the New York Stock Exchange (NYSE). There are [three key metrics](../file-compression.md#performance) to evaluate compression algorithms.
1. **Compression ratio**
1. **Compression speed**
1. **Decompression speed**
These metrics impact **storage cost**, **data write time** and **query response times** respectively. Both compression and decompression speeds depend on the hardware, primarily on storage speed and compute (CPU) capacity. Our partner, Intel(R), provided access to **two systems with different storage characteristics** in its FasterLab, a facility dedicated to optimizing Financial Services Industry (FSI) solutions. The first system has fast local disks, while the second uses slower NFS storage. The next sections describe the results in detail.
## Compression ratios
**Compression ratio** measures the relative reduction in data size. It is calculated by dividing the uncompressed size by the compressed size. For example, a ratio of 4 indicates that the data consumes a quarter of the disk space after compression. In this document, we show the **relative sizes** after compression, which is the inverse of the compression ratio. Lower values indicate better compression. The numbers are percentages, so 25 corresponds to a compression ratio of 4. The block size parameter was set to 17, which translates to a logical block size of 128 KB.
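As a rough illustration (not part of the original benchmark), the snippet below writes a vector compressed with a logical block size of 2^17 = 128 KB using gzip at level 6, then derives the relative size from the file's compression statistics; the file path and data are made up for the example.

```q
/ write a vector compressed: logical block size 2^17=128 KB, gzip (algorithm 2), level 6
(`:/tmp/px;17;2;6) set 1000000?100f
stats:-21!`:/tmp/px                                      / per-file compression statistics
100*stats[`compressedLength]%stats`uncompressedLength    / relative size in percent
```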
`qipc` does not compress all columns by default. The conditions under which qipc applies compression are precisely [documented](https://code.kx.com/q/basics/ipc/#compression).
### Key Observations
* **`gzip` and `zstd` deliver the best overall ratios**.
* `gzip` significantly outperforms `zstd` for `Sequence_Number` (except at `zstd`'s highest levels).
* `gzip` levels 6–9 show minimal difference, but level 1 performs poorly on low-entropy columns.
* `qipc` has the worst compression ratio among the tested algorithms.
## Write speed, compression times
The typical bottleneck of data ingestion is persisting tables to storage. The write time determines the maximum ingestion rate.
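As a quick sketch (not part of the benchmark harness), write speed can be compared by timing uncompressed and compressed `set` calls with `\t`; the paths, data and compression parameters below are illustrative, with zstd assumed to be algorithm 5 as in kdb+ 4.1.

```q
v:10000000?100f                  / sample vector to persist
\t `:/tmp/u set v                / uncompressed write, elapsed ms
\t (`:/tmp/z;17;5;1) set v       / compressed write: 128 KB blocks, zstd, level 1
```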
Writing compressed data to storage involves three sequential steps:
### Key Observations
* **Compression typically slows down `set` operations.**
* Notable exceptions: `snappy` and `zstd` level 1 actually improve write speed for certain column types. For these columns, `zstd` provides significantly better compression ratios than `snappy`.
* The compression level has a substantial impact on compression time, even for algorithms like `lz4`; for example, `zstd` level 10 is considerably faster than level 22.
Let us see how compression performs with slower storage.
These results — smaller ratios compared to uncompressed set and more green cells — indicate that the performance benefits of compression are amplified on slower disks. Notably, only `zstd` at level 1 consistently outperforms uncompressed `set` across all columns, while other compression methods generally slow down the `set` operation.
### Scaling, syncing and appending
Because the `set` command is single-threaded, kdb+ systems often persist columns in parallel with `peach` when memory allows, as sketched below. In our case, the number of columns is smaller than the number of available cores, so parallelizing provided a clear speed advantage. Persisting all columns simultaneously took roughly the same time as persisting the largest column (`TradeID`). In real life, the writer process may have other responsibilities, such as ingesting new data or serving queries, which also compete for CPU.
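A minimal sketch of this pattern follows; the database path, table name and compression parameters are illustrative, and a real writedown would also enumerate symbol columns and write the `.d` file.

```q
/ persist each column of an in-memory table in parallel (requires secondary threads, -s)
/ compression parameters: 128 KB logical blocks, gzip (algorithm 2), level 6
writeCol:{[dir;t;c] (` sv dir,c;17;2;6) set t c}
writeCol[`:db/2022.03.31/trade;trade] peach cols trade
```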
Data persisted via `set` may remain in the OS buffer cache before being written to disk, risking data loss if the system crashes. The user can trigger the flush with the [fsync](https://man7.org/linux/man-pages/man2/fsync.2.html) system call. If kdb+ processes wrote several files simultaneously and a consistent state is desired, the fsync and [syncfs](https://linux.die.net/man/2/syncfs) system calls are used. These calls block the kdb+ process, so their execution time contributes to the write time. In our experiment, `fsync` times were marginal compared to `set`, especially on NFS. The network is the bottleneck for NFS, and the underlying storage system has plenty of time to flush the data.
While `set` is a common persistence method, intraday writedowns often use appends, implemented with [amend at](https://code.kx.com/q/ref/amend/#on-disk) like `.[file; (); ,; chunk]`. `set` also chunks large vector writes behind the scenes, which explains why our test showed no speed difference between the two methods, regardless of compression.
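A minimal sketch of an append to a compressed on-disk vector, with an illustrative path and random data; the append inherits the compression settings of the existing file.

```q
file:`:db/2022.03.31/trade/price
(file;17;2;6) set 1000?100f      / initial compressed write
.[file;();,;500?100f]            / append a chunk; the file stays compressed
```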
## Query response times
When data is stored in a compressed format, it must be decompressed before queries can be processed. The decompression speed directly impacts query execution time.
In the query test, we executed 14 distinct queries. The queries vary in their filtering, grouping and aggregation parameters. Some filters trigger sequential reads, while queries with several filtering constraints perform random reads. We also included queries with explicit parallel iteration ([peach](https://code.kx.com/q/ref/each/)) and with [as-of](https://code.kx.com/q/ref/aj/) joins. In the financial sector, data from multiple sources streams in simultaneously, and as-of joins play a critical role in joining the various tables.
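For reference, the elapsed time and memory need of a single query can be measured with `\ts` (time and space); the example below reuses one of the test queries and assumes the database is already loaded.

```q
/ returns elapsed time in ms and workspace used in bytes
\ts select from quote where date=2022.03.31, i<500000000
```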
The table below details each query’s performance metrics:
<div class="kx-perf-compact" markdown="1">
|**Query**|**Elapsed time (ms)**|**Storage read (KB)**|**Query memory need (KB)**|**Result memory need (KB)**|
|---|---:|---:|---:|---:|
|**select from quote where date=2022.03.31, i<500000000**|13379|36137692|44023417|39728448|
|**aj[`sym`time`ex; select from tradeNorm where date=2022.03.31, size>500000; select from quoteNorm where date=2022.03.31]**|6845|2820024|9797897|82|
The table below displays the second executions of the queries, that is, when data was sourced from memory (the page cache), so the impact of storage speed is smaller. Query `` select medMidSize: med (bsize + asize) % 2 from quoteNorm where date=2022.03.31, sym=`CIIG.W `` without compression executed in less than 1 msec, so we rounded up the execution time to 1 msec to avoid division by zero.
Observe the impact of the OS cache: higher ratios and more dark red cells.
### Key Observations
* **Compression slows queries**, especially for CPU-bound workloads (e.g., multiple aggregations using [multi-threaded primitives](https://code.kx.com/q/kb/mt-primitives/)). Some queries were 20× slower with compression.
* **OS caching amplifies slowdowns**: When data resides in memory, compression overhead becomes more pronounced. **Recommendation**: Avoid compression for frequently accessed ("hot") data.
Let us see how compression impacts query times if the data is stored on slower storage.
Compression improves performance when large datasets are read from slow storage. Thus, it is **recommended for cold tiers** (rarely accessed data).
## Summary
For an optimal balance of cost and query performance, we recommend a tiered storage strategy.
Not all tables require identical partitioning strategies. Frequently accessed tables may remain in the hot tier for extended durations. Conversely, even within a heavily queried table, certain columns might be seldom accessed. In such cases, **symbolic links can be used to migrate column files to the appropriate storage tier**.
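As an illustration of the symbolic-link approach, the sketch below relocates one rarely accessed column file to a cold tier and leaves a link in its place; the paths are made up and the commands are issued from q via `system`.

```q
/ move a cold column file to slower storage, then link it back into the HDB directory
system "mv /fastssd/db/2022.03.31/quote/asize /nfs/db/2022.03.31/quote/asize"
system "ln -s /nfs/db/2022.03.31/quote/asize /fastssd/db/2022.03.31/quote/asize"
```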
## Infrastructure
Tests were conducted on version 9.4 of Red Hat Enterprise Linux using kdb+ 4.1 (version 2025.01.17). Compression performance depends on the **compression library versions**, which are listed below:
* `zlib`: 1.2.11
* `lz4`: 1.9.3
* `snappy`: 1.1.8
* `zstd`: 1.5.1
Key specifications for the two systems:
1. Local block storage and Intel Xeon 6 efficient CPU
* **Storage**: NFS (version 4.2), mounted in sync mode, with read and write chunk sizes (`wsize` and `rsize`) of 1 MB. NFS caching was not set up, i.e. the `-o fsc` mount parameter was not set.
The tests ran on a single NUMA node, using local-node memory only. This was achieved by launching the kdb+ processes with `numactl -N 0 -m 0`.
## Data
We used [publicly available](https://ftp.nyse.com/Historical%20Data%20Samples/DAILY%20TAQ/) NYSE TAQ data for this analysis. Tables `quote` and `trade` were generated using the script [taq.k](https://github.com/KxSystems/kdb-taq). Table `quote` had 1.78 billion rows and consumed 180 GB of disk space uncompressed. Table `trade` was smaller: it contained 76 million rows and required 5.7 GB of space. All tables were parted by the instrument ID (column `Symbol`). The data corresponds to a single day in 2022.