Commit 591ab22

khorolets and kobayurii authored
refactor(database, state-indexer): State schema improvements for reads and updates (#410)
* refactor(database,state-indexer): Introduce compact schema for state_changes that is more efficient to read from
* chore(database, logic-state-indexer): Split database/postgres/state_indexer into files. Remove redundant block_hash from handle_state_changes method in logic-state-indexer
* chore(database, state-indexer): Add additional metrics to monitor how long state indexer writes take and how many partitions are touched
* refactor(database, state-indexer): Add PostgreSQL function to match partition number. Switch CTE to unnest for updates
* refactor(database, state-indexer): Replace numeric(20,0) for block_heights with bigint (i64) to speed up inserts and updates
* refactor(database, rpc-server): Update read queries related to states to use i64 instead of BigDecimal
* add migration scripts
* add indexes for new tables
* start state indexer from interruption block
* paginated state optimization
* fix paginated state
* fix query
* fix page_token

---------

Co-authored-by: Yurii Koba <[email protected]>
1 parent 723c345 commit 591ab22

24 files changed, +1996 -702 lines changed

Cargo.lock

Lines changed: 2 additions & 0 deletions

database/Cargo.toml

Lines changed: 7 additions & 0 deletions
@@ -28,6 +28,13 @@ sqlx = { version = "0.8.2", features = [
 num-bigint = "0.3.3"
 num-traits = "0.2.19"
 scylla = { version = "0.15.1", features = ["ssl", "full-serialization"] }
+tokio = { version = "1.36.0", features = [
+    "sync",
+    "time",
+    "macros",
+    "rt-multi-thread",
+] }
+tracing = "0.1.34"
 
 configuration.workspace = true
 readnode-primitives.workspace = true
Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
+> **Note:** If you are starting the project from scratch (with a new, empty database), you do not need to run the migration scripts in this directory. These scripts are only necessary when migrating data from an existing database or upgrading shards. For new deployments, follow the standard database initialization procedures instead.
+
+# Database Migration Scripts
+
+This directory contains scripts for migrating database shards and related data.
+
+## Shard Migration Script
+
+The `shard_migration.sh` script is the main migration orchestrator that runs multiple migration scripts in parallel.
+
+### Usage
+
+The script accepts the following command-line arguments:
+
+- `--db_name`: Database name
+- `--db_user`: Database username
+- `--db_password`: Database password
+- `--host`: Database host
+- `--port`: Database port
+
+### Examples
+
+#### Basic Example:
+```bash
+./shard_migration.sh --db_name my_database --db_user postgres --db_password mypassword --host localhost --port 5432
+```
+
+#### Local PostgreSQL Database:
+```bash
+./shard_migration.sh \
+  --db_name read_rpc_db \
+  --db_user postgres \
+  --db_password secretpassword \
+  --host localhost \
+  --port 5432
+```
+
+#### Remote Database:
+```bash
+./shard_migration.sh \
+  --db_name production_db \
+  --db_user readrpc_user \
+  --db_password prod_password123 \
+  --host db.example.com \
+  --port 5432
+```
+
+#### Using Environment Variables:
+```bash
+# Set environment variables first
+export DB_NAME="my_database"
+export DB_USER="postgres"
+export PGPASSWORD="mypassword"
+export DB_HOST="localhost"
+export DB_PORT="5432"
+
+# Then run the script (it will use the environment variables)
+./shard_migration.sh
+```
+
+### Prerequisites
+
+1. **Make the script executable:**
+   ```bash
+   chmod +x shard_migration.sh
+   ```
+
+2. **Make all migration scripts executable:**
+   ```bash
+   chmod +x migrate_*.sh
+   ```
+
+3. **Ensure all required migration scripts exist:**
+   - `migrate_access_keys.sh`
+   - `migrate_accounts.sh`
+   - `migrate_contracts.sh`
+   - `migrate_state_changes.sh`
+
+### How it Works
+
+The `shard_migration.sh` script:
+
+1. Parses command-line arguments and sets environment variables
+2. Creates a fresh log file named `migration_${DB_NAME}.log`, removing any previous one
+3. Runs four migration scripts in parallel using the `&` operator
+4. Waits for all migrations to complete using the `wait` command
+5. Logs start and completion times
+
+### Output
+
+- Migration progress and results are logged to `migration_${DB_NAME}.log`
+- Console output shows start and completion timestamps
+- Each individual migration script may produce its own output
+
+### Notes
+
+- All five values are required, either as arguments or as the corresponding environment variables; otherwise the script exits with an error
+- The script runs migrations in parallel to improve performance
+- Make sure you have proper database permissions before running the migration
+- Review the individual migration scripts to understand what data will be migrated
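The Usage list in the README pairs every flag with a value. Below is a minimal sketch of a parser that enforces this; `parse_db_args` is a hypothetical name and is not part of this commit. It is motivated by a property of bash: `shift 2` leaves the positional parameters untouched when only one remains, so a `while [[ $# -gt 0 ]]` loop that relies on it can spin forever on a trailing flag such as `--port` with no value.

```bash
#!/usr/bin/env bash
# Sketch only: parse "--flag value" pairs and reject a flag with no value.
# parse_db_args is a hypothetical helper, not part of the commit.
parse_db_args() {
    while [ "$#" -gt 0 ]; do
        # Reject anything that is not one of the five documented flags.
        case "$1" in
            --db_name|--db_user|--db_password|--host|--port) ;;
            *) echo "Unknown option: $1" >&2; return 1 ;;
        esac
        # A known flag must be followed by a value.
        if [ "$#" -lt 2 ]; then
            echo "Missing value for $1" >&2
            return 1
        fi
        case "$1" in
            --db_name)     export DB_NAME="$2" ;;
            --db_user)     export DB_USER="$2" ;;
            --db_password) export PGPASSWORD="$2" ;;
            --host)        export DB_HOST="$2" ;;
            --port)        export DB_PORT="$2" ;;
        esac
        shift 2
    done
}
```

A caller would use it as `parse_db_args "$@" || exit 1` and then run the same required-variable check the README describes.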
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+#!/bin/bash
+
+# Function to migrate a single partition
+migrate_partition() {
+    local partition=$1
+    # shellcheck disable=SC2155
+    local start_time=$(date +"%T")
+
+    echo "[INFO] Starting migration for partition state_changes_access_key_$partition at $start_time"
+    echo "[INFO] Starting migration for partition state_changes_access_key_$partition at $start_time" >> "$LOG_FILE"
+
+    psql -U "$DB_USER" -d "$DB_NAME" -h "$DB_HOST" -p "$DB_PORT" -c "
+    WITH ordered_data AS (
+        SELECT
+            account_id,
+            data_key,
+            data_value,
+            block_height AS block_height_from,
+            LAG(block_height) OVER (PARTITION BY account_id, data_key ORDER BY block_height DESC) AS block_height_to
+        FROM state_changes_access_key_$partition
+    )
+    INSERT INTO state_changes_access_key_compact_$partition (account_id, data_key, data_value, block_height_from, block_height_to)
+    SELECT
+        account_id,
+        data_key,
+        data_value,
+        block_height_from::bigint,
+        block_height_to::bigint
+    FROM ordered_data
+    WHERE data_value IS NOT NULL
+    ON CONFLICT (account_id, data_key, block_height_from) DO NOTHING;
+    " 2>&1 | tee -a "$LOG_FILE"
+
+    # shellcheck disable=SC2155
+    local end_time=$(date +"%T")
+    echo "[INFO] Finished migration for partition state_changes_access_key_$partition at $end_time"
+    echo "[INFO] Finished migration for partition state_changes_access_key_$partition at $end_time" >> "$LOG_FILE"
+}
+
+# Run migrations in parallel for partitions 0 to 99
+for i in $(seq 0 99); do
+    migrate_partition "$i" &
+done
+
+# Wait for all background jobs to finish
+wait
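The core of the script above is the window expression `LAG(block_height) OVER (PARTITION BY account_id, data_key ORDER BY block_height DESC)`. Because rows are ordered by descending height, the "previous" row is the next later change, so `block_height_to` becomes the height at which the value was superseded, and the newest row gets NULL. The sketch below emulates that logic offline with `sort` and `awk` on made-up rows, so the result can be inspected without a database. The function name `compact_ranges`, the whitespace-separated input format and the sample rows are all assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: offline emulation of the LAG(...) ORDER BY block_height DESC logic.
# Input lines: account_id data_key data_value block_height (whitespace-separated).
# A data_value of the literal text NULL stands in for a SQL NULL.
compact_ranges() {
    # Group by (account_id, data_key), newest block first within each group.
    LC_ALL=C sort -k1,1 -k2,2 -k4,4nr |
    awk '{
        group = $1 FS $2
        # Within a group, the previously seen row is the next later change.
        to = (group == prev_group) ? prev_height : "NULL"
        prev_group = group
        prev_height = $4
        # Mirror WHERE data_value IS NOT NULL: NULL rows only close a range.
        if ($3 != "NULL") print $1, $2, $3, $4, to
    }'
}

printf '%s\n' \
    'alice.near key1 v1 100' \
    'alice.near key1 v2 150' \
    'alice.near key1 NULL 200' \
    'bob.near key1 v9 120' | compact_ranges
```

For these sample rows the function prints `alice.near key1 v2 150 200`, then `alice.near key1 v1 100 150`, then `bob.near key1 v9 120 NULL`. The row at height 200 has no value, so it is not copied, but it still supplies the closing height for `v2`.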
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+#!/bin/bash
+
+# Function to migrate a single partition
+migrate_partition() {
+    local partition=$1
+    # shellcheck disable=SC2155
+    local start_time=$(date +"%T")
+
+    echo "[INFO] Starting migration for partition state_changes_account_$partition at $start_time"
+    echo "[INFO] Starting migration for partition state_changes_account_$partition at $start_time" >> "$LOG_FILE"
+
+    psql -U "$DB_USER" -d "$DB_NAME" -h "$DB_HOST" -p "$DB_PORT" -c "
+    WITH ordered_data AS (
+        SELECT
+            account_id,
+            data_value,
+            block_height AS block_height_from,
+            LAG(block_height) OVER (PARTITION BY account_id ORDER BY block_height DESC) AS block_height_to
+        FROM state_changes_account_$partition
+    )
+    INSERT INTO state_changes_account_compact_$partition (account_id, data_value, block_height_from, block_height_to)
+    SELECT
+        account_id,
+        data_value,
+        block_height_from::bigint,
+        block_height_to::bigint
+    FROM ordered_data
+    WHERE data_value IS NOT NULL
+    ON CONFLICT (account_id, block_height_from) DO NOTHING;
+    " 2>&1 | tee -a "$LOG_FILE"
+
+    # shellcheck disable=SC2155
+    local end_time=$(date +"%T")
+    echo "[INFO] Finished migration for partition state_changes_account_$partition at $end_time"
+    echo "[INFO] Finished migration for partition state_changes_account_$partition at $end_time" >> "$LOG_FILE"
+}
+
+# Run migrations in parallel for partitions 0 to 99
+for i in $(seq 0 99); do
+    migrate_partition "$i" &
+done
+
+# Wait for all background jobs to finish
+wait
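The loop above starts all 100 partitions at once, and the README says four such scripts run in parallel, so a full run may open up to 400 `psql` connections at the same time. PostgreSQL's `max_connections` setting is typically 100 by default, so some of those connections could be refused. One possible mitigation is to cap concurrency by running partitions in batches, as sketched below. `run_partitions_in_batches` and the batch size are hypothetical and not part of this commit.

```bash
#!/usr/bin/env bash
# Sketch only: run "$job <partition>" for partitions 0..count-1,
# with at most "batch" jobs in flight at a time.
# run_partitions_in_batches is a hypothetical helper, not part of the commit.
run_partitions_in_batches() {
    local job="$1" count="$2" batch="$3"
    local i=0
    while [ "$i" -lt "$count" ]; do
        "$job" "$i" &
        i=$((i + 1))
        # After each full batch, let it drain before starting more jobs.
        if [ $((i % batch)) -eq 0 ]; then
            wait
        fi
    done
    # Wait for the final, possibly partial, batch.
    wait
}

# Possible use in a partition script (batch size 10 is an arbitrary choice):
#   run_partitions_in_batches migrate_partition 100 10
```

Batching is the simplest approach but leaves slots idle while the slowest job in a batch finishes; a job pool would keep all slots busy at the cost of more code.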
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+#!/bin/bash
+
+# Function to migrate a single partition
+migrate_partition() {
+    local partition=$1
+    # shellcheck disable=SC2155
+    local start_time=$(date +"%T")
+
+    echo "[INFO] Starting migration for partition state_changes_contract_$partition at $start_time"
+    echo "[INFO] Starting migration for partition state_changes_contract_$partition at $start_time" >> "$LOG_FILE"
+
+    psql -U "$DB_USER" -d "$DB_NAME" -h "$DB_HOST" -p "$DB_PORT" -c "
+    WITH ordered_data AS (
+        SELECT
+            account_id,
+            data_value,
+            block_height AS block_height_from,
+            LAG(block_height) OVER (PARTITION BY account_id ORDER BY block_height DESC) AS block_height_to
+        FROM state_changes_contract_$partition
+    )
+    INSERT INTO state_changes_contract_compact_$partition (account_id, data_value, block_height_from, block_height_to)
+    SELECT
+        account_id,
+        data_value,
+        block_height_from::bigint,
+        block_height_to::bigint
+    FROM ordered_data
+    WHERE data_value IS NOT NULL
+    ON CONFLICT (account_id, block_height_from) DO NOTHING;
+    " 2>&1 | tee -a "$LOG_FILE"
+
+    # shellcheck disable=SC2155
+    local end_time=$(date +"%T")
+    echo "[INFO] Finished migration for partition state_changes_contract_$partition at $end_time"
+    echo "[INFO] Finished migration for partition state_changes_contract_$partition at $end_time" >> "$LOG_FILE"
+}
+
+# Run migrations in parallel for partitions 0 to 99
+for i in $(seq 0 99); do
+    migrate_partition "$i" &
+done
+
+# Wait for all background jobs to finish
+wait
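In the partition scripts, `psql ... 2>&1 | tee -a "$LOG_FILE"` logs the output, but without `pipefail` the pipeline's exit status is that of `tee`, so whatever status `psql` returns is discarded. The sketch below shows one way to keep both the log and the status. `run_logged` is a hypothetical helper, not part of this commit, and it buffers the command's output until the command finishes, which is a trade-off compared with streaming through `tee`.

```bash
#!/usr/bin/env bash
# Sketch only: run a command, append its combined output to a log file
# (and echo it to stdout), and return the command's own exit status.
# run_logged is a hypothetical helper, not part of the commit.
run_logged() {
    local log="$1"
    shift
    local out status=0
    # Capture stdout and stderr together; remember a failing status.
    out=$("$@" 2>&1) || status=$?
    if [ -n "$out" ]; then
        printf '%s\n' "$out" | tee -a "$log"
    fi
    return "$status"
}

# Possible use inside migrate_partition (the SQL text is elided here):
#   run_logged "$LOG_FILE" psql -U "$DB_USER" -d "$DB_NAME" -h "$DB_HOST" -p "$DB_PORT" -c "$sql" \
#       || echo "[ERROR] partition $partition failed" >> "$LOG_FILE"
```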
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+#!/bin/bash
+
+# Function to migrate a single partition
+migrate_partition() {
+    local partition=$1
+    # shellcheck disable=SC2155
+    local start_time=$(date +"%T")
+
+    echo "[INFO] Starting migration for partition state_changes_data_$partition at $start_time"
+    echo "[INFO] Starting migration for partition state_changes_data_$partition at $start_time" >> "$LOG_FILE"
+
+    psql -U "$DB_USER" -d "$DB_NAME" -h "$DB_HOST" -p "$DB_PORT" -c "
+    WITH ordered_data AS (
+        SELECT
+            account_id,
+            data_key,
+            data_value,
+            block_height AS block_height_from,
+            LAG(block_height) OVER (PARTITION BY account_id, data_key ORDER BY block_height DESC) AS block_height_to
+        FROM state_changes_data_$partition
+    )
+    INSERT INTO state_changes_data_compact_$partition (account_id, data_key, data_value, block_height_from, block_height_to)
+    SELECT
+        account_id,
+        data_key,
+        data_value,
+        block_height_from::bigint,
+        block_height_to::bigint
+    FROM ordered_data
+    WHERE data_value IS NOT NULL
+    ON CONFLICT (account_id, data_key, block_height_from) DO NOTHING;
+    " 2>&1 | tee -a "$LOG_FILE"
+
+    # shellcheck disable=SC2155
+    local end_time=$(date +"%T")
+    echo "[INFO] Finished migration for partition state_changes_data_$partition at $end_time"
+    echo "[INFO] Finished migration for partition state_changes_data_$partition at $end_time" >> "$LOG_FILE"
+}
+
+# Run migrations in parallel for partitions 0 to 99
+for i in $(seq 0 99); do
+    migrate_partition "$i" &
+done
+
+# Wait for all background jobs to finish
+wait
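The four partition scripts differ only in the table prefix and in whether `data_key` is part of the key. A minimal sketch of generating the statement from those two parameters is shown below, assuming the table and column naming seen in the scripts above. `build_compact_sql` is a hypothetical helper, not part of this commit.

```bash
#!/usr/bin/env bash
# Sketch only: print the migration SQL for one partition.
#   $1 = table prefix, e.g. state_changes_data
#   $2 = partition number
#   $3 = "yes" if the table is keyed by (account_id, data_key), anything else otherwise
# build_compact_sql is a hypothetical helper, not part of the commit.
build_compact_sql() {
    local table="$1" partition="$2" with_key="$3"
    local key_col=""
    if [ "$with_key" = "yes" ]; then
        key_col=", data_key"
    fi
    cat <<SQL
WITH ordered_data AS (
    SELECT
        account_id${key_col},
        data_value,
        block_height AS block_height_from,
        LAG(block_height) OVER (PARTITION BY account_id${key_col} ORDER BY block_height DESC) AS block_height_to
    FROM ${table}_${partition}
)
INSERT INTO ${table}_compact_${partition} (account_id${key_col}, data_value, block_height_from, block_height_to)
SELECT
    account_id${key_col},
    data_value,
    block_height_from::bigint,
    block_height_to::bigint
FROM ordered_data
WHERE data_value IS NOT NULL
ON CONFLICT (account_id${key_col}, block_height_from) DO NOTHING;
SQL
}

# Possible use: psql ... -c "$(build_compact_sql state_changes_data "$partition" yes)"
```

Keeping the statement in one place would mean a fix to the window logic or the conflict target is made once instead of four times.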
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+#!/bin/bash
+
+# Parse arguments or use environment variables
+while [[ $# -gt 0 ]]; do
+    key="$1"
+    case $key in
+        --db_name)
+            export DB_NAME="$2"
+            shift 2
+            ;;
+        --db_user)
+            export DB_USER="$2"
+            shift 2
+            ;;
+        --db_password)
+            export PGPASSWORD="$2"
+            shift 2
+            ;;
+        --host)
+            export DB_HOST="$2"
+            shift 2
+            ;;
+        --port)
+            export DB_PORT="$2"
+            shift 2
+            ;;
+        *)
+            echo "Unknown option: $1"
+            exit 1
+            ;;
+    esac
+done
+
+# Set defaults from environment if not set by args
+: "${DB_NAME:=${DB_NAME}}"
+: "${DB_USER:=${DB_USER}}"
+: "${PGPASSWORD:=${PGPASSWORD}}"
+: "${DB_HOST:=${DB_HOST}}"
+: "${DB_PORT:=${DB_PORT}}"
+
+# Check required variables
+if [[ -z "$DB_NAME" || -z "$DB_USER" || -z "$PGPASSWORD" || -z "$DB_HOST" || -z "$DB_PORT" ]]; then
+    echo "All arguments are required: --db_name, --db_user, --db_password, --host, --port (or set corresponding env vars)"
+    exit 1
+fi
+
+
+# Set log file
+export LOG_FILE="migration_${DB_NAME}.log"
+# Remove old log file if it exists
+rm -f "$LOG_FILE"
+touch "$LOG_FILE"
+
+echo "Starting migration at $(date)" | tee -a "$LOG_FILE"
+
+./migrate_access_keys.sh &
+./migrate_accounts.sh &
+./migrate_contracts.sh &
+./migrate_state_changes.sh &
+
+wait
+
+echo "Migration completed at $(date)" | tee -a "$LOG_FILE"
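The orchestrator ends with a bare `wait`, which in bash waits for every background job and then returns 0 regardless of how those jobs exited, so "Migration completed" is printed even if a script failed. A minimal sketch of collecting per-job statuses by waiting on each PID follows. `run_parallel_checked` is a hypothetical helper, not part of this commit, and it only helps if each migration script itself exits non-zero on failure, which the scripts in this commit do not do.

```bash
#!/usr/bin/env bash
# Sketch only: start every given command in the background, then wait on
# each PID individually so failures are counted instead of discarded.
# run_parallel_checked is a hypothetical helper, not part of the commit.
run_parallel_checked() {
    local pids="" pid cmd failed=0
    for cmd in "$@"; do
        "$cmd" &
        pids="$pids $!"
    done
    for pid in $pids; do
        # "wait PID" returns that job's exit status.
        if ! wait "$pid"; then
            failed=$((failed + 1))
        fi
    done
    echo "$failed job(s) failed"
    # Succeed only when no job failed.
    [ "$failed" -eq 0 ]
}

# Possible use in place of the four "&" lines and the bare wait:
#   run_parallel_checked ./migrate_access_keys.sh ./migrate_accounts.sh \
#       ./migrate_contracts.sh ./migrate_state_changes.sh || exit 1
```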
