
Commit ac830b4

tiproxy: add performance report of traffic capture (pingcap#19335)
1 parent 9b42249 commit ac830b4

2 files changed: +75 -2 lines changed
tiproxy/tiproxy-performance-test.md

Lines changed: 33 additions & 0 deletions
@@ -14,6 +14,7 @@ The results are as follows:
 - The row number of the query result set has a significant impact on the QPS of TiProxy, and the impact is the same as that of HAProxy.
 - The performance of TiProxy increases almost linearly with the number of vCPUs. Therefore, increasing the number of vCPUs can effectively improve the QPS upper limit.
 - The number of long connections and the frequency of creating short connections have minimal impact on the QPS of TiProxy.
+- The higher the CPU usage of TiProxy, the greater the impact of enabling [traffic capture](/tiproxy/tiproxy-traffic-replay.md) on QPS. When the CPU usage of TiProxy is about 70%, enabling traffic capture leads to an approximately 3% decrease in average QPS and a 7% decrease in minimum QPS. The latter decrease is caused by periodic QPS drops during traffic file compression.
 
 ## Test environment
 
@@ -312,3 +313,35 @@ sysbench oltp_point_select \
 | 100 | 95597 | 0.52 | 0.65 | 330% | 1800% |
 | 200 | 94692 | 0.53 | 0.67 | 330% | 1800% |
 | 300 | 94102 | 0.53 | 0.68 | 330% | 1900% |
+
+## Traffic capture test
+
+### Test plan
+
+This test aims to evaluate the performance impact of [traffic capture](/tiproxy/tiproxy-traffic-replay.md) on TiProxy. It uses TiProxy v1.3.0 and runs `sysbench` at different concurrency levels, with traffic capture either enabled or disabled before each run, and compares the QPS and TiProxy CPU usage. Because traffic file compression causes periodic QPS fluctuations, the test compares both the average and minimum QPS.
+
+Use the following command to perform the test:
+
+```bash
+sysbench oltp_read_write \
+    --threads=$threads \
+    --time=1200 \
+    --report-interval=5 \
+    --rand-type=uniform \
+    --db-driver=mysql \
+    --mysql-db=sbtest \
+    --mysql-host=$host \
+    --mysql-port=$port \
+    run --tables=32 --table-size=1000000
+```
+
+### Test results
+
+| Connection count | Traffic capture | Avg QPS | Min QPS | Avg latency (ms) | P95 latency (ms) | TiProxy CPU usage |
+| - |-----| --- | --- |-----------|-------------|-----------------|
+| 20 | Disabled | 27653 | 26999 | 14.46 | 16.12 | 140% |
+| 20 | Enabled | 27519 | 26922 | 14.53 | 16.41 | 170% |
+| 50 | Disabled | 58014 | 56416 | 17.23 | 20.74 | 270% |
+| 50 | Enabled | 56211 | 52236 | 17.79 | 21.89 | 280% |
+| 100 | Disabled | 85107 | 84369 | 23.48 | 30.26 | 370% |
+| 100 | Enabled | 79819 | 69503 | 25.04 | 31.94 | 380% |
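
As a quick sanity check on the summary figures, the relative QPS decrease can be recomputed from the results table above. The following is a minimal sketch, not part of the commit: the row values are copied from the table, while the names `RESULTS`, `percent_drop`, and `capture_impact` are hypothetical helpers introduced only for illustration.

```python
# Rows copied from the "Test results" table in the diff above:
# (connection count, traffic capture enabled?, avg QPS, min QPS)
RESULTS = [
    (20, False, 27653, 26999),
    (20, True, 27519, 26922),
    (50, False, 58014, 56416),
    (50, True, 56211, 52236),
    (100, False, 85107, 84369),
    (100, True, 79819, 69503),
]


def percent_drop(baseline: float, measured: float) -> float:
    """Return the relative decrease from baseline to measured, in percent."""
    return (baseline - measured) / baseline * 100


def capture_impact(results):
    """Map each connection count to (avg QPS drop %, min QPS drop %)."""
    disabled = {c: (avg, low) for c, on, avg, low in results if not on}
    enabled = {c: (avg, low) for c, on, avg, low in results if on}
    return {
        c: (
            percent_drop(disabled[c][0], enabled[c][0]),
            percent_drop(disabled[c][1], enabled[c][1]),
        )
        for c in sorted(disabled)
    }


if __name__ == "__main__":
    for conns, (avg_drop, min_drop) in capture_impact(RESULTS).items():
        print(f"{conns:>3} connections: avg QPS -{avg_drop:.1f}%, min QPS -{min_drop:.1f}%")
```

Under these numbers, the 50-connection rows give roughly a 3% drop in average QPS and a 7% drop in minimum QPS, which matches the figures quoted in the summary bullet and the capture note. The 100-connection rows show a noticeably larger drop, consistent with the statement that the impact grows with TiProxy CPU usage.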

tiproxy/tiproxy-traffic-replay.md

Lines changed: 42 additions & 2 deletions
@@ -43,6 +43,8 @@ Traffic replay is not suitable for the following scenarios:
 > - TiProxy captures traffic on all connections, including existing and newly created ones.
 > - In TiProxy primary-secondary mode, connect to the primary TiProxy instance.
 > - If TiProxy is configured with a virtual IP, it is recommended to connect to the virtual IP address.
+> - The higher the CPU usage of TiProxy, the greater the impact of traffic capture on QPS. To reduce the impact on the production cluster, it is recommended to reserve at least 30% of CPU capacity. At that level, traffic capture results in an approximately 3% decrease in average QPS. For detailed performance data, see [Traffic capture test](/tiproxy/tiproxy-performance-test.md#traffic-capture-test).
+> - TiProxy does not automatically delete previous capture files when capturing traffic again. You need to manually delete them.
 
 For example, the following command connects to the TiProxy instance at `10.0.1.10:3080`, captures traffic for one hour, and saves it to the `/tmp/traffic` directory on the TiProxy instance:
 
@@ -76,7 +78,7 @@ Traffic replay is not suitable for the following scenarios:
 
 5. View the replay report.
 
-After replay completion, the report is stored in the `tiproxy_traffic_report` database on the test cluster. This database contains two tables: `fail` and `other_errors`.
+After replay completion, the report is stored in the `tiproxy_traffic_replay` database on the test cluster. This database contains two tables: `fail` and `other_errors`.
 
 The `fail` table stores failed SQL statements, with the following fields:
 
@@ -89,16 +91,50 @@ Traffic replay is not suitable for the following scenarios:
 - `sample_replay_time`: the time when the SQL statement failed during replay. You can use this to view error information in the TiDB log file.
 - `count`: the number of times the SQL statement failed.
 
+The following is an example output of the `fail` table:
+
+```sql
+SELECT * FROM tiproxy_traffic_replay.fail LIMIT 1\G
+```
+
+```
+*************************** 1. row ***************************
+cmd_type: StmtExecute
+digest: 89c5c505772b8b7e8d5d1eb49f4d47ed914daa2663ed24a85f762daa3cdff43c
+sample_stmt: INSERT INTO new_order (no_o_id, no_d_id, no_w_id) VALUES (?, ?, ?) params=[3077 6 1]
+sample_err_msg: ERROR 1062 (23000): Duplicate entry '1-6-3077' for key 'new_order.PRIMARY'
+sample_conn_id: 1356
+sample_capture_time: 2024-10-17 12:59:15
+sample_replay_time: 2024-10-17 13:05:05
+count: 4
+```
+
 The `other_errors` table stores unexpected errors, such as network errors or database connection errors, with the following fields:
 
 - `err_type`: the type of error, presented as a brief error message. For example, `i/o timeout`.
 - `sample_err_msg`: the complete error message when the error first occurred.
 - `sample_replay_time`: the time when the error occurred during replay. You can use this to view error information in the TiDB log file.
 - `count`: the number of occurrences for this error.
 
+The following is an example output of the `other_errors` table:
+
+```sql
+SELECT * FROM tiproxy_traffic_replay.other_errors LIMIT 1\G
+```
+
+```
+*************************** 1. row ***************************
+err_type: failed to read the connection: EOF
+sample_err_msg: this is an error from the backend connection: failed to read the connection: EOF
+sample_replay_time: 2024-10-17 12:57:39
+count: 1
+```
+
 > **Note:**
 >
-> The table schema of `tiproxy_traffic_report` might change in future versions. It is not recommended to directly read data from `tiproxy_traffic_report` in your application or tool development.
+> - The table schema of `tiproxy_traffic_replay` might change in future versions. It is not recommended to directly read data from `tiproxy_traffic_replay` in your application or tool development.
+> - Replay does not guarantee that the transaction execution order between connections exactly matches the capture sequence. This might lead to incorrect error reports.
+> - TiProxy does not automatically delete the previous replay report when replaying traffic. You need to manually delete it.
 
 ## Test throughput
 
@@ -151,3 +187,7 @@ For more information, see [`tiproxyctl traffic cancel`](/tiproxy/tiproxy-command
 - TiProxy traffic replay does not support filtering SQL types, and DML and DDL statements are replayed. Therefore, you need to restore the cluster data to its pre-replay state before replaying again.
 - TiProxy traffic replay does not support testing [Resource Control](/tidb-resource-control.md) and [privilege management](/privilege-management.md) because TiProxy uses the same username to replay traffic.
 - TiProxy does not support replaying [`LOAD DATA`](/sql-statements/sql-statement-load-data.md) statements.
+
+## More resources
+
+For more information about the traffic replay of TiProxy, see the [design document](https://github.com/pingcap/tiproxy/blob/main/docs/design/2024-08-27-traffic-replay.md).
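
When reviewing a replay report by hand, it can help to turn the vertical (`\G`) output shown for the `fail` table in this diff into structured records. The following is a minimal sketch under stated assumptions: `SAMPLE` is the example row copied from the diff, and `parse_vertical` is a hypothetical helper that is not part of TiProxy or any client tool. It assumes one `key: value` pair per line and does not handle values that span multiple lines. Because the note in the diff warns that the report schema might change, treat this as an ad hoc inspection aid rather than something to build tooling on.

```python
import re

# Example row copied from the `fail` table output shown in the diff.
SAMPLE = """\
*************************** 1. row ***************************
cmd_type: StmtExecute
digest: 89c5c505772b8b7e8d5d1eb49f4d47ed914daa2663ed24a85f762daa3cdff43c
sample_stmt: INSERT INTO new_order (no_o_id, no_d_id, no_w_id) VALUES (?, ?, ?) params=[3077 6 1]
sample_err_msg: ERROR 1062 (23000): Duplicate entry '1-6-3077' for key 'new_order.PRIMARY'
sample_conn_id: 1356
sample_capture_time: 2024-10-17 12:59:15
sample_replay_time: 2024-10-17 13:05:05
count: 4
"""

# Matches row separators such as "***** 1. row *****".
ROW_HEADER = re.compile(r"^\*+ \d+\. row \*+$")


def parse_vertical(text):
    """Parse vertical query output into a list of dicts (one per row).

    Assumption: each field occupies a single "key: value" line.
    All values are kept as strings.
    """
    rows = []
    for line in text.splitlines():
        if ROW_HEADER.match(line.strip()):
            rows.append({})  # start a new record at each row separator
        elif rows and ": " in line:
            # Split on the first ": " only, so values may contain colons.
            key, value = line.split(": ", 1)
            rows[-1][key.strip()] = value
    return rows


if __name__ == "__main__":
    for row in parse_vertical(SAMPLE):
        print(row["cmd_type"], row["count"], row["sample_err_msg"])
```

Splitting on the first `": "` keeps error messages such as `ERROR 1062 (23000): Duplicate entry ...` intact, and stripping the key tolerates the right-aligned field names that some clients print.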
