- Do you want to request a feature or report a bug?
Feature.
- What is the current behavior?
Binary data stored in AFS is compressed and decompressed automatically by several components.
The Cassandra-based implementation:
- automatically gzips chunks of data on write
- automatically gunzips chunks of data on read
The remote implementation:
- on write, automatically gzips data on the client side
- on write, automatically gunzips data on the server side
- on read, automatically gzips data on the server side
- on read, automatically gunzips data on the client side
When the data being read or written is already compressed, these steps are unnecessary and can hurt performance (and possibly memory usage).
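To make the cost concrete, here is a rough sketch of the passes described above when a remote client talks to a server backed by the Cassandra implementation. It is only a simulation: `GzipPassDemo` and its methods are hypothetical names, not AFS classes, and the chunking done by the Cassandra implementation is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical simulation, not AFS code: counts gzip/gunzip passes per blob.
final class GzipPassDemo {
    static int passes = 0;

    private GzipPassDemo() {}

    static byte[] gzip(byte[] input) {
        passes++;
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(sink)) {
            out.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    static byte[] gunzip(byte[] input) {
        passes++;
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(input))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Write: client gzips, server gunzips, Cassandra layer gzips again.
    static byte[] writePath(byte[] payload) {
        byte[] onWire = gzip(payload);
        byte[] onServer = gunzip(onWire);
        return gzip(onServer);
    }

    // Read: Cassandra layer gunzips, server gzips, client gunzips.
    static byte[] readPath(byte[] stored) {
        byte[] onServer = gunzip(stored);
        byte[] onWire = gzip(onServer);
        return gunzip(onWire);
    }
}
```

In this model, one write followed by one read of the same blob goes through six gzip or gunzip passes, four of them on the server side.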
- What is the expected behavior?
It should be possible to configure those components so that they do not perform compression, which could improve performance (to be measured).
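One possible shape for such an option is sketched below, purely as an illustration. `CompressionMode` and `BlobStreams` are hypothetical names that do not exist in AFS; the only real API used is `java.util.zip`. With `NONE`, the stream is passed through untouched, so already compressed data is stored as-is.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical option, not part of AFS.
enum CompressionMode { GZIP, NONE }

// Hypothetical helper, not part of AFS: wraps streams according to the mode.
final class BlobStreams {
    private BlobStreams() {}

    // Wraps the raw stream in gzip only when compression is enabled.
    static OutputStream forWrite(OutputStream raw, CompressionMode mode) throws IOException {
        return mode == CompressionMode.GZIP ? new GZIPOutputStream(raw) : raw;
    }

    // Mirror of forWrite: gunzips only when the data was written with GZIP.
    static InputStream forRead(InputStream raw, CompressionMode mode) throws IOException {
        return mode == CompressionMode.GZIP ? new GZIPInputStream(raw) : raw;
    }

    // In-memory convenience wrapper around forWrite.
    static byte[] write(byte[] data, CompressionMode mode) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = forWrite(sink, mode)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // In-memory convenience wrapper around forRead.
    static byte[] read(byte[] stored, CompressionMode mode) {
        try (InputStream in = forRead(new ByteArrayInputStream(stored), mode)) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The reader and the writer have to agree on the mode, so in practice the setting would probably need to be shared between the client, the server and the storage layer, or recorded alongside the data.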
- What is the motivation / use case for changing the behavior?
Performance optimization in a typical setup with a client connected to an AFS server, which itself relies on a Cassandra implementation of AFS.
In this kind of setup, data blobs are unnecessarily compressed and decompressed on the server side whenever they are written or read.
Some benchmarking with JMH shows that compressing a large XIIDM case (100 MB) takes around 2 s on my laptop CPU.
With around 50 cases received per hour, that is about 50 × 2 s = 100 s, i.e. 1-2 minutes of CPU time consumed for this every hour.