Commit daaa60d

KXI-61765 typo fixes and restructuring
1 parent ddd3f58 commit daaa60d

docs/kb/compression/fsicasestudy.md

Lines changed: 50 additions & 51 deletions
@@ -8,49 +8,13 @@ date: February 2025

In this document, we compare compression algorithms using a popular financial dataset from the New York Stock Exchange (NYSE). There are [three key metrics](../file-compression.md#performance) to evaluate compression algorithms.

-1. Compression ratio
-1. Compression speed
-1. Decompression speed
+1. **Compression ratio**
+1. **Compression speed**
+1. **Decompression speed**

-These metrics impact **storage cost**, **data write time** and **query response times** respectively. Both compression and decompression speeds depend on the hardware - primarily on storage speed and the compute (CPU) capacity. Our partner, Intel(R), provided access to two systems with different storage characteristics in its FasterLab, a facility dedicated to optimization of Financial Services Industry (FSI) solutions. The first system has fast local disks, while the second system comes with a slower NFS storage. The next section describes these environments in detail.
+These metrics impact **storage cost**, **data write time** and **query response times** respectively. Both compression and decompression speeds depend on the hardware - primarily on storage speed and the compute (CPU) capacity. Our partner, Intel(R), provided access to **two systems with different storage characteristics** in its FasterLab, a facility dedicated to optimization of Financial Services Industry (FSI) solutions. The first system has fast local disks, while the second system comes with a slower NFS storage. The next sections describe the results in detail.

-
-## Infrastructure
-
-Tests were conducted on version 9.4 of Red Hat Enterprise Linux using kdb+ 4.1 (version 2025.01.17). Compression performance depends on the **compression library versions**, which are listed below:
-
-* `zlib`: 1.2.11
-* `lz4`: 1.9.3
-* `snappy`: 1.1.8
-* `zstd`: 1.5.1
-
-Key specifications for the two systems:
-
-1. Local block storage and Intel Xeon 6 efficient CPU
-    * **Storage**: Intel SSD D7-P5510 (3.84 TB), with interface PCIe 4.0 x4, NVMe
-    * **CPU**: Intel(R) Xeon(R) [6780E](https://www.intel.com/content/www/us/en/products/sku/240362/intel-xeon-6780e-processor-108m-cache-2-20-ghz/specifications.html) (**E**fficient series)
-        * Sockets: 2
-        * Cores per socket: 144
-        * Thread(s) per core: 1
-        * NUMA nodes: 2
-    * filesystem: ext4
-    * memory: 502GiB, DDR5 6400 MT/s, 8 channels
-1. NFS storage and Intel Xeon 6 performance CPU
-    * **Storage**: NFS (version 4.2), mounted in sync mode, with read and write chunk sizes (`wsize` and `rsize`) 1 MB. NFS cache was not set up, i.e `-o fsc` mount parameter was not set.
-    * Some network parameters:
-        * MTU: 1500
-        * TCP read/write buffer size (`/proc/sys/net/core/rmem_default`, `/proc/sys/net/core/wmem_default`): 212992
-    * **CPU**: Intel(R) Xeon(R) [6747P](https://www.intel.com/content/www/us/en/products/sku/241825/intel-xeon-6747p-processor-288m-cache-2-70-ghz/specifications.html) (**P**erformance series)
-        * Sockets: 2
-        * Cores per socket: 48
-        * Thread(s) per core: 2
-        * NUMA nodes: 4
-    * memory: 502GiB, DDR5 6400 MT/s, 8 channels
-
-The tests ran on a single NUMA node, i.e. kdb+ processes were launched with `numactl -N 0 -m 0`.
-
-
-# Compression ratio
+## Compression ratios

**Compression ratio** measures the relative reduction in data size. This ratio is calculated by dividing the uncompressed size by the compressed size. For example, a ratio of 4 indicates that the data consumes a quarter of the disk space after compression. In this document, we show the **relative sizes** after compression, which is the inverse of the compression ratio. Lower values indicate better compression. The numbers are percentages, so 25 corresponds to a compression ratio of 4. The block size parameter was set to 17, which translates to a logical block size of 128 KB.

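As a minimal illustration (not part of the benchmark; the path, sample vector and gzip settings below are assumptions), the relative size of a single compressed file can be derived in q from its compression statistics:

```q
/ write a vector compressed with gzip (algorithm 2) at level 6 and a logical
/ block size of 2 xexp 17 = 128 KB, then read back its compression statistics
v:10000000?100i                                   / sample data standing in for a real column
(`:/tmp/bsize;17;2;6) set v                       / (path; logical block size; algorithm; level)
s:-21!`:/tmp/bsize                                / compression statistics dictionary
100*s[`compressedLength]%s[`uncompressedLength]   / relative size in percent (inverse of the ratio)
```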
@@ -1956,7 +1920,7 @@ Table `trade`:

`qipc` does not compress all columns by default. The conditions under which `qipc` applies compression are precisely [documented](https://code.kx.com/q/basics/ipc/#compression).

-## Key Observations
+### Key Observations

* **`gzip` and `zstd` deliver the best overall ratios**.
* `gzip` significantly outperforms `zstd` for `Sequence_Number` (except at `zstd`'s highest levels).
@@ -1965,7 +1929,7 @@ Table `trade`:
* `gzip` levels 6–9 show minimal difference, but level 1 performs poorly on low-entropy columns.
* `qipc` has the worst compression ratio among the tested algorithms.

-# Write speed, compression times
+## Write speed, compression times
The typical bottleneck of data ingestion is persisting tables to storage. The write time determines the maximum ingestion rate.

Writing compressed data to storage involves three sequential steps:
@@ -2524,7 +2488,7 @@ The following tables provide a column-level breakdown. Green cells mark speed im
</tbody>
</table>

-## Key Observations
+### Key Observations
* **Compression typically slows down `set` operations.**
* Notable exceptions: `snappy` and `zstd` level 1 actually improve write speed for certain column types. For these columns, `zstd` provides significantly better compression ratios than `snappy`.
* The level has a substantial impact on compression time, even for algorithms like `lz4`; for example, `zstd` level 10 is considerably faster than level 22.
@@ -3103,14 +3067,14 @@ Let us see how compression performs with a slower storage.

These results — smaller ratios compared to uncompressed `set` and more green cells — indicate that the performance benefits of compression are amplified on slower disks. Notably, only `zstd` at level 1 consistently outperforms uncompressed `set` across all columns, while other compression methods generally slow down the `set` operation.

-## Scaling, syncing and appending
+### Scaling, syncing and appending
Because the `set` command is single-threaded, kdb+ systems often persist columns in parallel with `peach` when memory allows. In our case, the number of columns is smaller than the number of available cores, so parallelizing provided a clear speed advantage. Persisting all columns simultaneously took roughly the same time as persisting the largest column (`TradeID`). In real life, the writer process may have other responsibilities, such as ingesting new data or serving queries, and these also compete for CPU.

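A sketch of this pattern follows; the table, paths and compression settings are illustrative rather than those of the benchmark:

```q
/ persist each column of a table in parallel with peach, compressed with
/ zstd (algorithm 5) at level 1 and 128 KB logical blocks;
/ start q with -s to give peach worker threads
t:([] price:1000000?100e; size:1000000?1000; stop:1000000?0b)   / plain columns, no enumeration needed
dir:`:/tmp/hdb/2022.03.31/trade
writeCol:{[c] ((` sv dir,c);17;5;1) set t c}                    / (path; block size; algorithm; level) set column
writeCol peach cols t
```
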
Data persisted via `set` may remain in the OS buffer cache before being written to disk, risking data loss if the system crashes. The user can trigger a flush with the [fsync](https://man7.org/linux/man-pages/man2/fsync.2.html) system call. If kdb+ processes write several files simultaneously and a consistent state is desired, the fsync and [syncfs](https://linux.die.net/man/2/syncfs) system calls are used. These calls block the kdb+ process, so their execution time contributes to the write time. In our experiment, `fsync` times were marginal compared to `set`, especially on NFS. The network is the bottleneck for NFS, so the underlying storage system has plenty of time to flush the data.

While `set` is a common persistence method, intraday writedowns often use appends, implemented by [amend at](https://code.kx.com/q/ref/amend/#on-disk) like `.[file; (); ,; chunk]`. `set` also chunks large vector writes behind the scenes, which explains why our test showed no speed difference between the two methods, regardless of compression.

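A minimal sketch of a chunked append; the file path, chunk size and compression defaults below are assumptions:

```q
/ .z.zd sets default compression for files written without explicit parameters:
/ (logical block size; algorithm; level) - here zstd (5) level 1 with 128 KB blocks
.z.zd:17 5 1
file:`:/tmp/hdb/2022.03.31/trade/price
file set `float$()                               / initialise an empty column file
appendChunk:{[f;c] .[f; (); ,; c]}               / append one chunk with amend at
appendChunk[file] each 250000 cut 1000000?100e   / four appends of 250,000 floats each
```
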
-# Query response time
+## Query response times
When data is stored in a compressed format, it must be decompressed before processing queries. The decompression speed directly impacts query execution time.

In the query test, we executed 14 distinct queries. The queries vary in filtering, grouping and aggregation parameters. Some filters trigger sequential reads, and queries with several filtering constraints perform random reads. We included queries with explicit parallel iteration ([peach](https://code.kx.com/q/ref/each/)) and with [as-of joins](https://code.kx.com/q/ref/aj/) as well. Data in the financial sector streams in simultaneously, and as-of joins play a critical role in joining the various tables.
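
As an illustration of the parallel-iteration style, one of the per-symbol queries from the test can be fanned out over a list of symbols with `peach` (the symbols below are placeholders):

```q
/ run a per-symbol aggregation for several symbols in parallel with peach
syms:`AAPL`MSFT`IBM   / placeholder symbols
{select medMidSize: med (bsize + asize) % 2 from quoteNorm where date=2022.03.31, sym=x} peach syms
```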
@@ -3124,7 +3088,7 @@ The table below details each query’s performance metrics:

<div class="kx-perf-compact" markdown="1">

-|**query**|**elapsed time ms**|storage read (KB)**|**query memory need (KB)**|**result memory need (KB)** |
+|**Query**|**Elapsed time (ms)**|**Storage read (KB)**|**Query memory need (KB)**|**Result memory need (KB)**|
|---|---:|---:|---:|---:|
|**select from quote where date=2022.03.31, i<500000000**|13379|36137692|44023417|39728448|
|**aj[`sym`time`ex; select from tradeNorm where date=2022.03.31, size>500000; select from quoteNorm where date=2022.03.31]**|6845|2820024|9797897|82|
@@ -3972,7 +3936,7 @@ To isolate caching effects, we cleared the page cache (`echo 3 | sudo tee /proc/
</tbody>
</table>

-The table below displays the second executions of the queries, i.e., data was sourced from memory (page cache) hence the storage speed impact is smaller. Query `` select medMidSize: med (bsize + asize) % 2 from quoteNorm where date=2022.03.31, sym=`CIIG.W `` without compression executed in less than 1 msec, so we rounded up the execution time to 1 msec to avoid division by zero.
+The table below displays the second executions of the queries, that is, data was sourced from memory. Because the data came from the page cache, the impact of storage speed is smaller. Query `` select medMidSize: med (bsize + asize) % 2 from quoteNorm where date=2022.03.31, sym=`CIIG.W `` without compression executed in less than 1 msec, so we rounded up the execution time to 1 msec to avoid division by zero.

Observe the impact of the OS cache: higher ratios and more dark red cells.

@@ -4871,7 +4835,7 @@ Observe OS cache impact - higher ratios and more dark red cells.
</tbody>
</table>

-## Key Observations
+### Key Observations

* **Compression slows queries**, especially for CPU-bound workloads (e.g., multiple aggregations using [multi-threaded primitives](https://code.kx.com/q/kb/mt-primitives/)). Some queries were 20× slower with compression.
* **OS caching amplifies slowdowns**: When data resides in memory, compression overhead becomes more pronounced. **Recommendation**: Avoid compression for frequently accessed ("hot") data.
@@ -5848,7 +5812,7 @@ Let us see how compression impacts query times if the data is stored on a slower

Compression improves performance when large datasets are read from slow storage. Thus, it is **recommended for cold tiers** (rarely accessed data).

-# Summary
+## Summary

For an optimal balance of cost and query performance, we recommend a tiered storage strategy:

@@ -5861,7 +5825,42 @@ For an optimal balance of cost and query performance, we recommend a tiered stor

Not all tables require identical partitioning strategies. Frequently accessed tables may remain in the hot tier for extended durations. Conversely, even within a heavily queried table, certain columns might be seldom accessed. In such cases, **symbolic links can be used to migrate column files to the appropriate storage tier**.

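A hypothetical sketch of such a migration, issued from q via `system`; the tier paths are invented for illustration:

```q
/ move a rarely accessed column file to the cold tier and leave a symbolic
/ link behind so existing queries continue to resolve the original path
hot:"/data/hot/db/2022.03.31/quote/asize"
cold:"/data/cold/db/2022.03.31/quote/asize"
system "mv ",hot," ",cold
system "ln -s ",cold," ",hot
```
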
-# Data
+## Infrastructure
+
+Tests were conducted on version 9.4 of Red Hat Enterprise Linux using kdb+ 4.1 (version 2025.01.17). Compression performance depends on the **compression library versions**, which are listed below:
+
+* `zlib`: 1.2.11
+* `lz4`: 1.9.3
+* `snappy`: 1.1.8
+* `zstd`: 1.5.1
+
+Key specifications for the two systems:
+
+1. Local block storage and Intel Xeon 6 efficient CPU
+    * **Storage**: Intel SSD D7-P5510 (3.84 TB), with interface PCIe 4.0 x4, NVMe
+    * **CPU**: Intel(R) Xeon(R) [6780E](https://www.intel.com/content/www/us/en/products/sku/240362/intel-xeon-6780e-processor-108m-cache-2-20-ghz/specifications.html) (**E**fficient series)
+        * Sockets: 2
+        * Cores per socket: 144
+        * Thread(s) per core: 1
+        * NUMA nodes: 2
+    * filesystem: ext4
+    * memory: 502 GiB, DDR5 6400 MT/s, 8 channels
+1. NFS storage and Intel Xeon 6 performance CPU
+    * **Storage**: NFS (version 4.2), mounted in sync mode, with read and write chunk sizes (`wsize` and `rsize`) of 1 MB. NFS cache was not set up, i.e. the `-o fsc` mount parameter was not set.
+    * Some network parameters:
+        * MTU: 1500
+        * TCP read/write buffer size (`/proc/sys/net/core/rmem_default`, `/proc/sys/net/core/wmem_default`): 212992
+    * **CPU**: Intel(R) Xeon(R) [6747P](https://www.intel.com/content/www/us/en/products/sku/241825/intel-xeon-6747p-processor-288m-cache-2-70-ghz/specifications.html) (**P**erformance series)
+        * Sockets: 2
+        * Cores per socket: 48
+        * Thread(s) per core: 2
+        * NUMA nodes: 4
+    * memory: 502 GiB, DDR5 6400 MT/s, 8 channels
+
+The tests ran on a single NUMA node, using local node memory only. This is achieved by launching the kdb+ processes with `numactl -N 0 -m 0`.
+
+
+## Data

We used [publicly available](https://ftp.nyse.com/Historical%20Data%20Samples/DAILY%20TAQ/) NYSE TAQ data for this analysis. Tables `quote` and `trade` were generated using the script [taq.k](https://github.com/KxSystems/kdb-taq). Table `quote` had 1.78 billion rows and consumed 180 GB of disk space uncompressed. Table `trade` was smaller: it contained 76 million rows and required 5.7 GB of space. All tables were parted by the instrument ID (column `Symbol`). The data corresponds to a single day in 2022.
