Description
Issue description
We are not sure whether this counts as an issue, but any help would be appreciated.
We decided to upgrade the ClickHouse client version from v1.5.3 to v2.2.0.
We were hoping it would improve write (insertion) speed.
We cannot use the latest CH client version (2.6) because of a library incompatibility.
The build fails with:
github.com/Unity-Technologies/mz-user-value/vendor/github.com/segmentio/asm/bswap.swap64: relocation target github.com/segmentio/asm/cpu.X86 not defined
The command '/bin/sh -c go install -v ./...' returned a non-zero code: 2
After upgrading, performance degraded: the batch commit latency doubled.
See the code example below for how we measured latency.
Memory usage also increased.
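To put a number on the memory difference between the two client versions, one option is to compare the live heap before and after a batch is filled. The following is only a minimal sketch using the standard `runtime` package; `heapAlloc` is a hypothetical helper, and the buffered `rows` slice is a stand-in for a real batch, not the client's actual buffer.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapAlloc (hypothetical helper) forces a GC and returns the bytes of
// live heap objects, so that two readings can be compared.
func heapAlloc() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

func main() {
	before := heapAlloc()

	// Stand-in for filling a batch: in the real service this would be
	// the loop calling c.batch.Append(...).
	rows := make([][]any, 0, 100_000)
	for i := 0; i < 100_000; i++ {
		rows = append(rows, []any{uint32(i), "geo", 0.0})
	}

	after := heapAlloc()
	delta := int64(after) - int64(before)
	fmt.Printf("heap grew by %d bytes for %d buffered rows\n", delta, len(rows))

	// Keep rows alive until after the second reading.
	runtime.KeepAlive(rows)
}
```

Running the same measurement around a full batch with v1.5.3 and with v2.2.0 would show whether the extra memory is held by the batch itself or elsewhere in the process.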
Some details about the use case.
The service runs in GCP as 6 pods on n1-standard-64 machines.
The service consumes data from Kafka and inserts it into CH.
The CH cluster runs on 4 nodes (2 shards and 2 replicas) of n2-standard-16 machines.
The insertion rate is 400k-500k records/sec (up to 600k at peak).
We use batch insertion with a batch size of 2M records.
Example code
Client initialization:
conn, err := clickhouse.Open(&clickhouse.Options{
	Addr: c.ClickhouseHosts,
	Auth: clickhouse.Auth{
		Database: c.ClickhouseDatabase,
		Username: cr.username,
		Password: cr.password,
	},
	DialTimeout:     10 * time.Second,
	MaxOpenConns:    30,
	MaxIdleConns:    5,
	ConnMaxLifetime: time.Hour,
	Compression: &clickhouse.Compression{
		Method: clickhouse.CompressionLZ4,
	},
	Settings: clickhouse.Settings{
		"log_queries": 0,
	},
})
Insert SQL statement:
insertEventQuery = `INSERT INTO event (
ts,
gamer_id,
game_id,
geo,
spend,
revenue,
requests,
auction_id,
event_type,
ett
)`
The tables:
CREATE TABLE user_value_r.event
(
`ts` DateTime,
`gamer_id` FixedString(24),
`game_id` UInt32,
`geo` LowCardinality(String),
`spend` Float64,
`revenue` Float64,
`requests` UInt64,
`is_blocked` UInt8,
`auction_id` String,
`event_type` LowCardinality(String),
`ett` Array(UInt32)
)
ENGINE = Distributed('arc', 'user_value_r', 'event_local', reinterpretAsUInt64(gamer_id))
CREATE TABLE user_value_r.event_local
(
`ts` DateTime,
`gamer_id` FixedString(24),
`game_id` UInt32,
`geo` LowCardinality(String),
`spend` Float64,
`revenue` Float64,
`requests` UInt64,
`is_blocked` UInt8,
`auction_id` String,
`event_type` LowCardinality(String),
`ett` Array(UInt32)
)
ENGINE = Null
Prepare batch
func (c *ClickhouseWriter) prepareQuery() error {
	var err error
	c.batchStart = time.Now()
	ctx := context.Background()
	c.batch, err = c.db.PrepareBatch(ctx, insertEventQuery)
	if err != nil {
		// No batch was created, so there is nothing to send or clean up.
		c.batch = nil
		return errors.Wrapf(err, "clickhousewriter: prepare batch")
	}
	return nil
}
Insert:
...
err = c.batch.Append(
	msg.Event.Timestamp,
	msg.Event.GamerID,
	msg.Event.GameID,
	msg.Event.Country,
	0.0,
	0.0,
	uint64(1),
	msg.Event.AuctionID,
	"a_"+msg.Event.RequestType,
	msg.Event.ExperimentTrackingToken,
)
...
Commit (send) measurement code
...
if c.batch != nil {
	start := time.Now()
	err1 := c.batch.Send()
	c.stats.TimingSince(eventCommitLatency, start)
	if err1 != nil {
		err = errors.Wrapf(err1, "clickhousewriter: tx commit")
	}
}
...
Where the batch is defined as:
import (
	...
	"github.com/ClickHouse/clickhouse-go/v2/lib/driver"
	...
)
...
batch driver.Batch
...
Error log
N/A
Configuration
OS:
Linux mz-user-value-canary-75675fcd7-gk9lm 5.4.170+ #1 SMP Sat Mar 5 10:08:44 PST 2022 x86_64 GNU/Linux
cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
Interface: native
Driver version: 2.2.0
Go version: go1.18.9 linux/amd64
ClickHouse Server version: 22.3.8.39