ci: Add cluster test for vector search #61009

Merged: 55 commits, Jun 4, 2025
Commits
eae926f
add cluster test for vector search
EricZequan May 8, 2025
5210f2f
update
EricZequan May 8, 2025
53ac07e
update
EricZequan May 8, 2025
b225211
update
EricZequan May 12, 2025
6a5f858
update
EricZequan May 12, 2025
6a929b5
update
EricZequan May 15, 2025
fd1f3ae
update
EricZequan May 15, 2025
3d1748e
update
EricZequan May 15, 2025
d076707
update
EricZequan May 15, 2025
3f03c5b
update
EricZequan May 15, 2025
f1dacce
update
EricZequan May 15, 2025
105648f
update
EricZequan May 16, 2025
dc59010
update
EricZequan May 16, 2025
c2823f6
update
EricZequan May 16, 2025
66880d2
update
EricZequan May 16, 2025
448b3a5
update
EricZequan May 16, 2025
4195ade
update
EricZequan May 16, 2025
e3eeb0f
update
EricZequan May 16, 2025
e8bd420
update
EricZequan May 16, 2025
129983f
update
EricZequan May 19, 2025
cdf57bc
update
EricZequan May 19, 2025
46151fb
update
EricZequan May 19, 2025
ce19b1e
update
EricZequan May 19, 2025
16f74a8
Merge remote-tracking branch 'upstream/master' into zequan/pick-clust…
EricZequan May 19, 2025
0db71cf
update
EricZequan May 19, 2025
3d85eb8
update
EricZequan May 19, 2025
2432d90
update
EricZequan May 19, 2025
cce01ba
update
EricZequan May 19, 2025
4ed6918
add config.toml to keep test stability
EricZequan May 20, 2025
d597ad7
update name
EricZequan May 20, 2025
e513906
update run_python_tester.sh
EricZequan May 20, 2025
43c8a0f
use pip3
EricZequan May 20, 2025
0afa477
modify pip3 to pip
EricZequan May 21, 2025
490fae1
update python_tester.sh
EricZequan May 21, 2025
d3b264a
modify vector_recall.py to use select *
EricZequan May 21, 2025
48df224
modify vector_recall.py to use select *
EricZequan May 21, 2025
8452efa
update python_tester.sh
EricZequan May 21, 2025
d3fda54
add stop_tiup() time to wait tiup quit
EricZequan May 21, 2025
060db56
update .sh to quit tiup when finish test
EricZequan May 21, 2025
d9ce5f5
chmod +x to upgrade_test.sh
EricZequan May 22, 2025
ddc45b0
update
EricZequan May 22, 2025
f2c94b0
update upgrade test
EricZequan May 22, 2025
94bda90
update upgrade test
EricZequan May 22, 2025
82b67dd
reduce test time
EricZequan May 23, 2025
7a5d606
update
EricZequan May 23, 2025
ffc0fd9
update
EricZequan May 23, 2025
01e001c
update
EricZequan May 23, 2025
7f12407
update
EricZequan May 23, 2025
c621338
update
EricZequan May 23, 2025
43d80dd
update
EricZequan May 26, 2025
7457663
update start_tidb
EricZequan May 28, 2025
28b0b4f
update start_tidb
EricZequan May 28, 2025
7c8e522
Merge remote-tracking branch 'upstream/master' into zequan/pick-clust…
EricZequan May 28, 2025
276e18c
update
EricZequan May 28, 2025
00354bc
update readme
EricZequan May 28, 2025
3 changes: 3 additions & 0 deletions tests/clusterintegrationtest/.gitignore
@@ -0,0 +1,3 @@
/tidb-server
/gobin/
/.venv/
100 changes: 100 additions & 0 deletions tests/clusterintegrationtest/README.md
@@ -0,0 +1,100 @@
# Cluster Integration Test

Before running the tests, please install tiup and build the tidb binary:
```shell
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
source .bash_profile
tiup --version

# cd tidb
make
```

## Guide: Run tests

```shell
# cd clusterintegrationtest
./run_mysql_tester.sh # mysql-tester test

# vector python testers
python3 -m pip install uv
uv init --python python3.9
uv venv
uv pip install -r requirements.txt
# prepare datasets
cd datasets
wget https://ann-benchmarks.com/fashion-mnist-784-euclidean.hdf5
wget https://ann-benchmarks.com/mnist-784-euclidean.hdf5
cd ..
./run_python_testers.sh

./run_upgrade_test.sh # upgrade cluster test

Review comment (Member):
I think it's better to run the upgrade test concurrently with the non-upgrade tests, just like in TiDB-CSE, in order to reduce total test time.

Reply (Contributor Author):
https://github.com/tidbcloud/tidb-cse/blob/release-7.5-keyspace/tests/clusterintegrationtest/docker_scripts/run3_upgradeTest.sh
Are they not run at the same time on CSE? Can I run it on the old version first, then start the new version and run it again? 🤔

Reply (Member):
run1, run2 and run3 are 3 different tasks in CSE, so they run concurrently, in order to reduce total test time.

Reply (Contributor Author):
Maybe we need to provide a separate environment for each of the three tests: we start the clusters through tiup, and multiple clusters will use different ports, so a test could end up talking to the wrong cluster.

```
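
The port concern raised in the review thread above (several tiup clusters on one host need distinct ports) might be handled with the `--port-offset` flag of `tiup playground`, which shifts every component's default port by a fixed amount. The following is a minimal sketch, assuming the installed TiUP version supports that flag; the helper names and the slot numbering are hypothetical and not part of this PR. The command is printed rather than executed, so the sketch can be inspected without a cluster:

```shell
# Hypothetical helper: map a test slot (0, 1, 2, ...) to a port offset.
port_offset_for_slot() {
    echo $(( $1 * 10000 ))
}

# TiDB listens on 4000 by default; with an offset it becomes 4000 + offset.
tidb_port_for_slot() {
    echo $(( 4000 + $(port_offset_for_slot "$1") ))
}

# Print the playground command for a slot instead of running it.
# Assumption: `--port-offset` is available in the installed tiup playground.
playground_cmd_for_slot() {
    local slot=$1
    echo "tiup playground nightly --db=1 --kv=1 --tiflash=1" \
        "--port-offset=$(port_offset_for_slot "$slot")" \
        "--db.binpath=../../bin/tidb-server --db.config=./config.toml"
}

playground_cmd_for_slot 1
```

Each test script would then have to connect to the port returned by `tidb_port_for_slot` instead of the hard-coded 4000.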

## Guide: Update `r`

After changing `t` or changing optimizer plans, `r` needs to be updated.

1. Start an empty cluster and expose TiDB on :4000

```shell
# cd clusterintegrationtest
./cluster.sh
```

Note: You may need to wait about 30s for TiFlash to be ready.

2. Run the following commands

```shell
# cd clusterintegrationtest
GOBIN=$(realpath .)/gobin go install github.com/pingcap/mysql-tester/src@314107b26aa8fce86beb0dd48e75827fb269b365
./gobin/src -retry-connection-count 5 -record
```

## Guide: Develop python_testers

1. Prepare local environment

```shell
# cd clusterintegrationtest
python3 -m pip install uv
uv init --python python3.9
uv venv
uv pip install -r requirements.txt
```

2. Download datasets

```shell
# cd clusterintegrationtest
cd datasets
wget https://ann-benchmarks.com/fashion-mnist-784-euclidean.hdf5
wget https://ann-benchmarks.com/mnist-784-euclidean.hdf5
cd ..
```

3. Start a cluster and expose TiDB on :4000

```shell
# cd clusterintegrationtest
./cluster.sh
```

Note: You may need to wait about 30s for TiFlash to be ready.

4. Run, edit and debug tests

```shell
# cd clusterintegrationtest
uv run python_testers/vector_recall.py
```
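
The dataset step above can be made safe to re-run. The following is a minimal sketch, assuming an optional `ASSETS_DIR` cache directory like the one used in `_include.sh`; the function name `fetch_dataset` is hypothetical, and unlike the script in this PR it copies from the cache instead of moving the file:

```shell
# Hypothetical helper: make sure dataset file $1 exists in the current
# directory, preferring a cached copy in ASSETS_DIR over a download.
fetch_dataset() {
    local name=$1
    local assets_dir=${ASSETS_DIR:-}   # tolerate an unset ASSETS_DIR under `set -u`

    if [ -f "./${name}" ]; then
        echo "${name} already exists, skip"
        return 0
    fi

    if [ -n "$assets_dir" ] && [ -f "${assets_dir}/${name}" ]; then
        echo "Copying ${name} from ASSETS_DIR..."
        cp "${assets_dir}/${name}" .
        return 0
    fi

    echo "Downloading ${name}..."
    # Download to a temporary name so an interrupted transfer is never
    # mistaken for a complete dataset on the next run.
    wget -O "${name}.part" "https://ann-benchmarks.com/${name}"
    mv "${name}.part" "${name}"
}
```

It would be called from inside `datasets/`, once per file, for example `fetch_dataset mnist-784-euclidean.hdf5`.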

## Note

If your contribution involves both tidb and tiflash and affects the test cases in this directory, please submit the tiflash change first and wait 2 hours after it is merged before running this test.

1. In tidb, we download the binary of the tiflash master branch as a component for cluster testing. Please make sure that your tiflash code is submitted to the master branch.
2. In tiflash, we likewise download the binary of the tidb master branch as a component for cluster testing. Please make sure that your tidb code is submitted to the master branch.

If you have other questions about this test, please contact @EricZequan.
165 changes: 165 additions & 0 deletions tests/clusterintegrationtest/_include.sh
@@ -0,0 +1,165 @@
#!/bin/bash
#
# Copyright 2025 PingCAP, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

set -euo pipefail

function start_tidb() {
export VERSION_SOURCE="nightly"

if [ ! -f "../../bin/tidb-server" ]; then
cd ../../ || exit 1
echo "building tidb-server..."
make
echo "build successfully"
cd - || exit 1
fi

echo "Starting TiUP Playground in the background..."
if [ -f "../../bin/tikv-server" ] && [ -f "../../bin/pd-server" ] && [ -f "../../bin/tiflash" ]; then
tiup playground nightly --mode=tidb \
--db.binpath=../../bin/tidb-server \
--db.config=./config.toml \
--kv.binpath=../../bin/tikv-server \
--pd.binpath=../../bin/pd-server \
--tiflash.binpath=../../bin/tiflash &
else
tiup playground nightly --db=1 --kv=1 --tiflash=1 --db.binpath=../../bin/tidb-server --db.config=./config.toml &
fi
}

function check_and_prepare_datasets() {
if [ -f "./fashion-mnist-784-euclidean.hdf5" ] && [ -f "./mnist-784-euclidean.hdf5" ]; then
echo "Datasets already exist, skip"
return
fi

if [ -d "${ASSETS_DIR:-}" ]; then
if [ -f "${ASSETS_DIR}/fashion-mnist-784-euclidean.hdf5" ]; then
echo "Moving fashion-mnist dataset from ASSETS_DIR..."
mv "${ASSETS_DIR}/fashion-mnist-784-euclidean.hdf5" .
else
echo "Downloading fashion-mnist dataset..."
wget https://ann-benchmarks.com/fashion-mnist-784-euclidean.hdf5
fi

if [ -f "${ASSETS_DIR}/mnist-784-euclidean.hdf5" ]; then
echo "Moving mnist dataset from ASSETS_DIR..."
mv "${ASSETS_DIR}/mnist-784-euclidean.hdf5" .
else
echo "Downloading mnist dataset..."
wget https://ann-benchmarks.com/mnist-784-euclidean.hdf5
fi
else
echo "Downloading fashion-mnist dataset..."
wget https://ann-benchmarks.com/fashion-mnist-784-euclidean.hdf5

echo "Downloading mnist dataset..."
wget https://ann-benchmarks.com/mnist-784-euclidean.hdf5

fi
}

function start_tidb_fixed_version() {
export VERSION_SOURCE="v8.5.1"

echo "Starting TiUP Playground in the background..."
tiup playground v8.5.1 --db=1 --kv=1 --tiflash=1 --db.config=./config.toml &
}

function build_mysql_tester() {
echo "+ Installing mysql-tester"
GOBIN=$PWD go install github.com/pingcap/mysql-tester/src@0d83955ea569706e5296cd3e2f54efb7f1206d0b
mv src mysql-tester
}

function wait_for_tidb() {
echo
echo "+ Waiting TiDB start up"

for i in {1..30}; do
if mysql -e 'show databases' -u root -h 127.0.0.1 --port 4000; then
echo " - TiDB startup successfully"
return
fi
sleep 3
done
echo "* Failed to start TiDB cluster within 90s"
exit 1
}

function wait_for_tiflash() {
echo
echo "+ Waiting TiFlash start up (30s)"
sleep 30
}

function stop_tiup() {
echo "+ Stopping TiUP"
# `|| true` keeps `set -e` from aborting when no tiup-playground process exists
TIUP_PID=$(pgrep -f "tiup-playground" || true)
if [ -n "$TIUP_PID" ]; then
echo " - Sending SIGTERM to PID=$TIUP_PID"
kill $TIUP_PID
fi

for i in {1..60}; do
if ! pgrep -f "tiup-playground" > /dev/null; then
echo " - TiUP stopped successfully"
return
fi
sleep 1
done

echo "* Fail to stop TiUP in 60s"
exit 1
}

function print_versions() {
# Print versions
if [ "$VERSION_SOURCE" = "nightly" ]; then
echo "+ TiDB Version"
../../bin/tidb-server -V
echo
if [ -f "../../bin/tikv-server" ] && [ -f "../../bin/pd-server" ] && [ -f "../../bin/tiflash" ]; then
echo "+ TiKV Version"
../../bin/tikv-server --version
echo
echo "+ TiFlash Version"
../../bin/tiflash --version
echo
else
echo "+ TiKV Version"
tiup tikv:nightly --version
echo
echo "+ TiFlash Version"
tiup tiflash:nightly --version
echo
fi
else
echo "+ TiDB Version"
tiup tidb:v8.5.1 -V
echo
echo "+ TiKV Version"
tiup tikv:v8.5.1 --version
echo

Review comment on lines +152 to +156 (Contributor):
Why should it be hard-coded?

Reply (Contributor Author):
In the upgrade test, we use the v8.5.1 cluster. This cluster has been cached, so we can print its version directly.

echo "+ TiFlash Version"
tiup tiflash:v8.5.1 --version
echo
fi

echo "+ TiUP Version"
~/.tiup/bin/tiup playground -v

}
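
The polling loops in `wait_for_tidb` and `stop_tiup` share one pattern: try a command a fixed number of times with a pause in between. The following is a minimal sketch of a generic helper, assuming nothing beyond a POSIX-style shell with `local`; the name `retry` is hypothetical and not part of this PR. The time budget in the failure message is derived from the two parameters, so the message cannot drift from the loop bounds:

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
    local attempts=$1 interval=$2 i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    echo "* Gave up after ${attempts} attempts (about $((attempts * interval))s)" >&2
    return 1
}
```

With it, the TiDB wait could be written as `retry 30 3 mysql -e 'show databases' -u root -h 127.0.0.1 --port 4000`.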
25 changes: 25 additions & 0 deletions tests/clusterintegrationtest/cluster.sh
@@ -0,0 +1,25 @@
#!/bin/bash
#
# Copyright 2025 PingCAP, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

cd ../../ || exit 1
echo "building tidb-server..."
make || exit 1
echo "build successfully"

cd - || exit 1

echo "Starting TiUP Playground in the background..."

Review comment (Contributor):
The nightly tiup package lags by hours. How can this be tested with freshly built tikv and tiflash binaries?

And how can it be tested with next-gen tidb binaries?

Reply (Contributor Author):
As long as it is nightly, it is fine: we are mainly testing the impact of tidb commits on vector search, so only tidb needs to be kept up to date.
In addition, how to use next-gen tidb is being discussed with @purelind.

tiup playground nightly --db=1 --kv=1 --tiflash=1 --db.binpath=../../bin/tidb-server --db.config=./config.toml
32 changes: 32 additions & 0 deletions tests/clusterintegrationtest/config.toml
@@ -0,0 +1,32 @@
# Copyright 2025 PingCAP, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

lease = "0"
host = "127.0.0.1"
new_collations_enabled_on_first_bootstrap = true
enable-table-lock = true

[status]
status-host = "127.0.0.1"

[performance]
stats-lease = "0"

[tikv-client.async-commit]
safe-window = 0
allowed-clock-drift = 0

[experimental]
enable-new-charset = true
allow-expression-index = true
13 changes: 13 additions & 0 deletions tests/clusterintegrationtest/datasets/.gitkeep
@@ -0,0 +1,13 @@
# Copyright 2025 PingCAP, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.