Pipeline scaling: In progress #1051


Draft: wants to merge 48 commits into base master
48 commits
a25ecb8  upload_download.sh (jakubadamek, Mar 20, 2024)
e54784e  upload_download.sh (jakubadamek, Mar 20, 2024)
bb89ed7  chmod utils (jakubadamek, Apr 4, 2024)
97ebb66  Runs fine with 79 patients (jakubadamek, Apr 4, 2024)
a751fa9  patient counts (jakubadamek, Apr 4, 2024)
cccfb35  venv (jakubadamek, Apr 4, 2024)
a5d39e6  Setting up Alloy DB (jakubadamek, Apr 8, 2024)
0188ba7  JDBC mode (jakubadamek, Apr 11, 2024)
e1dc44c  HAPI side fixes (jakubadamek, Apr 11, 2024)
84c67cc  Write temporary JDBC config for JDBC mode (jakubadamek, Apr 12, 2024)
125ce82  Add belgium zone (jakubadamek, Apr 13, 2024)
1fbe479  JDBC mode with CloudSQL Postgres (jakubadamek, Apr 15, 2024)
503eef7  config.sh (jakubadamek, Apr 16, 2024)
f38b213  Row count of Parquet tables (jakubadamek, Apr 16, 2024)
e69f304  Logging of output writing (jakubadamek, Apr 19, 2024)
db9fca4  unique output dir (jakubadamek, Apr 21, 2024)
9afa23c  Only pip install if necessary (jakubadamek, May 7, 2024)
7feecc7  Num workers (jakubadamek, May 7, 2024)
1e1a985  Use local dir if not dataflow (jakubadamek, May 7, 2024)
afabd4e  New HAPI VM. Update DB username. (jakubadamek, May 8, 2024)
292c674  Params for Flink parallelism (jakubadamek, May 9, 2024)
72567be  Param for Flink parallelism (jakubadamek, May 10, 2024)
0cb1837  Merge branch 'google:master' into master (jakubadamek, Jun 5, 2024)
071a0b8  Human time format (jakubadamek, Jun 5, 2024)
71eebf7  Move 'sleep 1' and kill more processes (jakubadamek, Jun 5, 2024)
ec260b8  Run multiple benchmarks (jakubadamek, Jun 5, 2024)
415ef1b  Log even if the process fails (jakubadamek, Jun 5, 2024)
fe275a5  Finalize the "run multiple" workflow (jakubadamek, Jun 5, 2024)
cc5b321  Unify the two config.sh files (jakubadamek, Jun 5, 2024)
bd43397  Check row count into log (jakubadamek, Jun 5, 2024)
a545e0f  Set job name to avoid Dataflow JDBC mode error (jakubadamek, Jun 6, 2024)
5af28f2  Move FHIR server to config; change job name (jakubadamek, Jun 7, 2024)
0cf2cf6  Move VM_INSTANCE to config.sh (jakubadamek, Jun 20, 2024)
8cf78fc  Add param for batchSize (jakubadamek, Jun 23, 2024)
65d003d  Implement multiple FHIR_SERVER_URL (jakubadamek, Jun 23, 2024)
48e55a0  Move runner to the inner most loop (jakubadamek, Jun 23, 2024)
6882a6c  Fix multiple typo (jakubadamek, Jun 23, 2024)
1e1fcd0  Add multiple batch sizes (jakubadamek, Jun 24, 2024)
212820d  ENABLE_SETUP_GOOGLE3 (jakubadamek, Jun 25, 2024)
3cf7880  Turn off fullsearch index (jakubadamek, Jul 1, 2024)
1065799  Fix typos (jakubadamek, Jul 1, 2024)
94087c5  Measure row counts (jakubadamek, Jul 1, 2024)
4db1ff4  Round duration to nearest second (jakubadamek, Jul 1, 2024)
510e4e5  Create separate nohup files based on date (jakubadamek, Jul 2, 2024)
5b3f133  Move logs to /tmp and the final results to ~ (jakubadamek, Jul 5, 2024)
98a01c3  Re-throw FHIR client exceptions #1112 (jakubadamek, Jul 6, 2024)
fba3c9a  Make Flink and Dataflow runners multiple (jakubadamek, Jul 6, 2024)
c07c905  mkdir tmp (jakubadamek, Jul 6, 2024)
1 change: 1 addition & 0 deletions performance-tests/scaling/.gitignore
@@ -0,0 +1 @@
/config.sh
31 changes: 31 additions & 0 deletions performance-tests/scaling/config.sh.tmpl
@@ -0,0 +1,31 @@
# Available patient counts: 79, 768, 7885, 79370, 791562
export MULTIPLE_PATIENTS="79 768 7885"
export MULTIPLE_JDBC_MODE="true false"
# DirectRunner, FlinkRunner or DataflowRunner
export MULTIPLE_FHIR_ETL_RUNNER="FlinkRunner DataflowRunner"
export MULTIPLE_FHIR_SERVER_URL="http://localhost:8080/fhir"
# The number of resources to fetch in each API call.
export MULTIPLE_BATCH_SIZE="100 300 1000 3000 10000"
# FlinkRunner parallelization
export MULTIPLE_FLINK_PARALLEL=150
# DataflowRunner workers
export MULTIPLE_DATAFLOW_WORKERS=65

#export PATIENTS=7885
#export JDBC_MODE=false
#export FHIR_ETL_RUNNER=DataflowRunner

# alloy or postgres
export DB_TYPE=postgres
export RUNNING_ON_HAPI_VM=false
export FHIR_UPLOADER_CORES=8
export ENABLE_UPLOAD=false
export ENABLE_DOWNLOAD=true
export ENABLE_SETUP_GOOGLE3=true

# pipeline-scaling-1, pipeline-scaling-2, pipeline-scaling-belgium
export POSTGRES_DB_INSTANCE="pipeline-scaling-2"
export PROJECT_ID="fhir-analytics-test"

# Comment this out when using run_multiple_benchmarks.sh (it sets FHIR_SERVER_URL per run).
export FHIR_SERVER_URL="http://localhost:8080/fhir"
export VM_INSTANCE="Name of your FHIR VM instance"
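For a single run, the singular variables (see the commented-out block above) replace the MULTIPLE_* lists. A hypothetical filled-in config.sh, with illustrative values drawn from the options in this template:

```bash
# Hypothetical single-run config.sh (values are illustrative, not recommendations).
export PATIENTS=7885
export JDBC_MODE=false
export FHIR_ETL_RUNNER=DataflowRunner
export FHIR_SERVER_URL="http://localhost:8080/fhir"
export BATCH_SIZE=1000
export FLINK_PARALLEL=150
export DATAFLOW_WORKERS=65

export DB_TYPE=postgres
export RUNNING_ON_HAPI_VM=false
export FHIR_UPLOADER_CORES=8
export ENABLE_UPLOAD=false
export ENABLE_DOWNLOAD=true
export ENABLE_SETUP_GOOGLE3=true
export POSTGRES_DB_INSTANCE="pipeline-scaling-2"
export PROJECT_ID="fhir-analytics-test"
export VM_INSTANCE="my-hapi-vm"   # hypothetical instance name
```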
18 changes: 18 additions & 0 deletions performance-tests/scaling/first_time_setup_google3.sh
@@ -0,0 +1,18 @@
set -e # Fail on errors.
set -x # Show each command.

source "./variables.sh"

GITS_DIR=~/gits
cd $GITS_DIR
[[ -d "fhir-data-pipes" ]] || git clone https://github.com/google/fhir-data-pipes.git
cd fhir-data-pipes

chmod -R 755 ./utils
sudo apt-get -y install maven
sudo apt-get -y install npm
mvn clean install -P dataflow-runner,cloudsql-postgres

sudo apt-get -y install postgresql-client
wget https://storage.googleapis.com/alloydb-auth-proxy/v1.7.1/alloydb-auth-proxy.linux.amd64 -O ~/Downloads/alloydb-auth-proxy
chmod +x ~/Downloads/alloydb-auth-proxy
7 changes: 7 additions & 0 deletions performance-tests/scaling/hapi_port_forward.sh
@@ -0,0 +1,7 @@
source ./variables.sh
set -o nounset
killall /usr/bin/ssh || true

gcloud compute ssh $VM_INSTANCE --zone $VM_ZONE --project $PROJECT_ID -- -o ProxyCommand='corp-ssh-helper %h %p' \
-NL 8080:localhost:8080 \
-NL 5432:localhost:5432 &
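One way to confirm the tunnel is up (a sketch; assumes the HAPI server is already running on the VM and that the VM exposes a local Postgres on 5432, as the tunnel flags suggest):

```bash
# Give the SSH tunnel a moment to establish, then probe both forwarded ports.
sleep 5
curl -sf http://localhost:8080/fhir/metadata > /dev/null && echo "FHIR port 8080 OK"
pg_isready -h localhost -p 5432 && echo "Postgres port 5432 OK"
```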
25 changes: 25 additions & 0 deletions performance-tests/scaling/k6/fhir_requests.js
@@ -0,0 +1,25 @@
// docker run --net=host --rm -i grafana/k6 run - <fhir_requests.js

import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 1000,
  duration: '60s',
};

export default function () {
  const prefix = "http://localhost:8080/fhir";
  const res = http.get(`${prefix}/Patient?_count=100`);
  check(res, { 'status was 200': (r) => r.status == 200 });
  const patientIds = res.json().entry.map(entry => entry.resource.id);
  // Pick a random patient from the search results and fetch its related resources.
  const randomIndex = Math.floor(Math.random() * patientIds.length);
  const selectedId = patientIds[randomIndex];
  http.get(prefix + "/Patient/" + selectedId);
  http.get(prefix + "/Encounter?patient=" + selectedId);
  http.get(prefix + "/Observation?patient=" + selectedId);
  http.get(prefix + "/MedicationRequest?patient=" + selectedId + "&status=active");

  sleep(1);
}
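The options block pins the load at 1000 virtual users for 60 seconds. k6 also accepts these as standard CLI flags, which makes it easy to sweep load levels the same way the shell scripts sweep batch sizes, without editing the script:

```bash
# Run the same script at a lighter load, reusing the Docker invocation
# from the header comment; --vus and --duration override the options block.
docker run --net=host --rm -i grafana/k6 run --vus 100 --duration 30s - <fhir_requests.js
```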
61 changes: 61 additions & 0 deletions performance-tests/scaling/run_multiple_benchmarks.sh
@@ -0,0 +1,61 @@
source config.sh

if [[ -n "$PATIENTS" ]]; then
echo "ERROR: Comment out PATIENTS in config.sh if running multiple"
exit 1
fi
if [[ -n "$JDBC_MODE" ]]; then
echo "ERROR: Comment out JDBC_MODE in config.sh if running multiple"
exit 1
fi
if [[ -n "$FHIR_ETL_RUNNER" ]]; then
echo "ERROR: Comment out FHIR_ETL_RUNNER in config.sh if running multiple"
exit 1
fi
if [[ -n "$FHIR_SERVER_URL" ]]; then
echo "ERROR: Comment out FHIR_SERVER_URL in config.sh if running multiple"
exit 1
fi
if [[ -n "$BATCH_SIZE" ]]; then
echo "ERROR: Comment out BATCH_SIZE in config.sh if running multiple"
exit 1
fi
if [[ -n "$FLINK_PARALLEL" ]]; then
echo "ERROR: Comment out FLINK_PARALLEL in config.sh if running multiple"
exit 1
fi
if [[ -n "$DATAFLOW_WORKERS" ]]; then
echo "ERROR: Comment out DATAFLOW_WORKERS in config.sh if running multiple"
exit 1
fi

set -e # Fail on errors.
set -x # Show each command.
set -o nounset

for dataflow in $MULTIPLE_DATAFLOW_WORKERS; do
  for flink in $MULTIPLE_FLINK_PARALLEL; do
    for batch in $MULTIPLE_BATCH_SIZE; do
      for p in $MULTIPLE_PATIENTS; do
        for j in $MULTIPLE_JDBC_MODE; do
          for server in $MULTIPLE_FHIR_SERVER_URL; do
            for f in $MULTIPLE_FHIR_ETL_RUNNER; do
              export PATIENTS=$p
              export JDBC_MODE=$j
              export FHIR_ETL_RUNNER=$f
              export FHIR_SERVER_URL=$server
              export BATCH_SIZE=$batch
              export FLINK_PARALLEL=$flink
              export DATAFLOW_WORKERS=$dataflow
              if [ "$ENABLE_SETUP_GOOGLE3" = true ]; then
                ./setup_google3.sh
                sleep 15
              fi
              ./upload_download.sh
            done
          done
        done
      done
    done
  done
done
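Given the "Create separate nohup files based on date" convention in the commits above, a hypothetical way to launch the full matrix in the background with a dated log (file name is illustrative):

```bash
# Sketch: run the benchmark matrix detached and keep a per-date log in /tmp.
nohup ./run_multiple_benchmarks.sh \
  > "/tmp/nohup-benchmarks-$(date +%Y-%m-%d).out" 2>&1 &
tail -F "/tmp/nohup-benchmarks-$(date +%Y-%m-%d).out"
```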
58 changes: 58 additions & 0 deletions performance-tests/scaling/setup_google3.sh
@@ -0,0 +1,58 @@
source ./variables.sh

set -e # Fail on errors.
set -x # Show each command.
set -o nounset

# Kill the current HAPI server so the database can be deleted and recreated.
"${RUN_ON_HAPI_STANZA[@]}" "sudo killall /usr/bin/java || true"
"${RUN_ON_HAPI_STANZA[@]}" "sudo killall python || true"
"${RUN_ON_HAPI_STANZA[@]}" "sudo killall python3 || true"
"${RUN_ON_HAPI_STANZA[@]}" "sudo killall sh || true"

case "$DB_TYPE" in
"alloy")
ALLOY_INSTANCE="projects/fhir-analytics-test/locations/us-central1/clusters/pipeline-scaling-alloydb-1/instances/pipeline-scaling-alloydb-largest"
sudo killall alloydb-auth-proxy || true
nohup ~/Downloads/alloydb-auth-proxy $ALLOY_INSTANCE &
sleep 1
if [[ "$ENABLE_UPLOAD" = true ]]; then
for cmd in "DROP DATABASE IF EXISTS" "CREATE DATABASE"; do
PGPASSWORD="$DB_PASSWORD" psql -h 127.0.0.1 -p 5432 -U "$DB_USERNAME" -c "$cmd $DB_PATIENTS"
done
else
# Check DB connection.
PGPASSWORD="$DB_PASSWORD" psql -h 127.0.0.1 -p 5432 -U "$DB_USERNAME" -c "SELECT 1"
fi
DB_CONNECTION="jdbc:postgresql:///${DB_PATIENTS}?127.0.0.1:5432"
;;
"postgres")
if [[ "$ENABLE_UPLOAD" = true ]]; then
gcloud sql databases delete "$DB_PATIENTS" --instance="$POSTGRES_DB_INSTANCE" --quiet || true
gcloud sql databases create "$DB_PATIENTS" --instance="$POSTGRES_DB_INSTANCE"
fi
DB_CONNECTION="jdbc:postgresql:///${DB_PATIENTS}?cloudSqlInstance=${PROJECT_ID}:${SQL_ZONE}:${POSTGRES_DB_INSTANCE}&socketFactory=com.google.cloud.sql.postgres.SocketFactory"
;;
*)
echo "Invalid DB_TYPE $DB_TYPE"
;;
esac

# shellcheck disable=SC2088
APPLICATION_YAML="~/gits/hapi-fhir-jpaserver-starter/src/main/resources/application.yaml"

# Update the DB connection config.
"${RUN_ON_HAPI_STANZA[@]}" "sed -i '/.*url: jdbc:postgresql:.*/c\\ url: ${DB_CONNECTION}' $APPLICATION_YAML"
"${RUN_ON_HAPI_STANZA[@]}" "sed -i '/ username: .*/c\\ username: ${DB_USERNAME}' $APPLICATION_YAML"
# Turn off search index because we don't use it and it might conflict between load-balanced HAPI servers.
# Reference: https://hapifhir.io/hapi-fhir/docs/server_jpa/elastic.html
"${RUN_ON_HAPI_STANZA[@]}" "sed -i '/ hibernate.search.enabled: true/c\\ hibernate.search.enabled: false' $APPLICATION_YAML"
# Start the HAPI server.
# shellcheck disable=SC2088
nohup "${RUN_ON_HAPI_STANZA[@]}" "~/gits/fhir-data-pipes/performance-tests/scaling/start_hapi_server.sh" >> "$TMP_DIR/nohup-hapi-$(date +%Y-%m-%d).out" 2>&1 &

if [ "$RUNNING_ON_HAPI_VM" = false ]; then
(sleep 15; "$DIR_WITH_THIS_SCRIPT/hapi_port_forward.sh") &
fi

# To follow the HAPI log: tail -F "$TMP_DIR/nohup-hapi-$(date +%Y-%m-%d).out"
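Several commits above ("Measure row counts", "Check row count into log") suggest verifying what actually landed in the database. One way to spot-check through the forwarded Postgres port, assuming HAPI's standard JPA schema (the `hfj_resource` table with its `res_type` column; this sketch is not part of the PR's scripts):

```bash
# Sketch: count uploaded resources per type over the local proxy/tunnel.
PGPASSWORD="$DB_PASSWORD" psql -h 127.0.0.1 -p 5432 -U "$DB_USERNAME" -d "$DB_PATIENTS" \
  -c "SELECT res_type, count(*) FROM hfj_resource GROUP BY res_type ORDER BY 2 DESC"
```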
6 changes: 6 additions & 0 deletions performance-tests/scaling/ssh_to_hapi.sh
@@ -0,0 +1,6 @@
source ./variables.sh

set -e # Fail on errors.
set -x # Show each command.

gcloud compute ssh $VM_INSTANCE --zone $VM_ZONE --project $PROJECT_ID -- -o ProxyCommand='corp-ssh-helper %h %p'
6 changes: 6 additions & 0 deletions performance-tests/scaling/start_hapi_server.sh
@@ -0,0 +1,6 @@
set -e # Fail on errors.
set -x # Show each command.

cd ~/gits/hapi-fhir-jpaserver-starter
export PATH=$PATH:~/Downloads/apache-maven-3.9.6/bin
mvn spring-boot:run -Pboot,cloudsql-postgres
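A cold `mvn spring-boot:run` can take minutes, while setup_google3.sh sleeps only 15 seconds before port-forwarding. One option is to poll the FHIR metadata endpoint until the server answers (a sketch, not part of this PR):

```bash
# Sketch: block until HAPI responds, checking every 10s with a 10-minute cap.
for i in $(seq 1 60); do
  curl -sf http://localhost:8080/fhir/metadata > /dev/null && break
  sleep 10
done
```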
@@ -0,0 +1 @@
{"Modules":[{"Key":"","Source":"","Dir":"."}]}
3 changes: 3 additions & 0 deletions performance-tests/scaling/terraform/.terraform/plugin_path
@@ -0,0 +1,3 @@
[
"/google/data/ro/teams/terraform/terraform_mpm/terraform_mpm.mpm/versions/1-d91dad6b_7ec29629_1bd5e9f0_59275854_e6928195"
]
@@ -0,0 +1,6 @@
{
"registry.terraform.io/hashicorp/google": {
"hash": "h1:VDpoLyNgMRdzdXjflbsmdjnYkDn8Q0Z5pWupwAgD+ZI=",
"version": "3.46.0"
}
}
Empty file.
98 changes: 98 additions & 0 deletions performance-tests/scaling/terraform/posgres.tf
@@ -0,0 +1,98 @@
module "pg" {
source = "terraform-google-modules/sql-db/google//modules/postgresql"
version = "~> 20.0"

name = var.pg_ha_name
random_instance_name = true
project_id = var.project_id
database_version = "POSTGRES_9_6"
region = "us-central1"

// Master configurations
tier = "db-custom-1-3840"
zone = "us-central1-c"
availability_type = "REGIONAL"
maintenance_window_day = 7
maintenance_window_hour = 12
maintenance_window_update_track = "stable"

deletion_protection = false

database_flags = [{ name = "autovacuum", value = "off" }]

user_labels = {
foo = "bar"
}

ip_configuration = {
ipv4_enabled = true
require_ssl = true
private_network = null
allocated_ip_range = null
authorized_networks = [
{
name = "${var.project_id}-cidr"
value = var.pg_ha_external_ip_range
},
]
}

backup_configuration = {
enabled = true
start_time = "20:55"
location = null
point_in_time_recovery_enabled = false
transaction_log_retention_days = null
retained_backups = 365
retention_unit = "COUNT"
}

// Read replica configurations
read_replica_name_suffix = "-test-ha"
read_replicas = [
{
name = "0"
zone = "us-central1-a"
availability_type = "REGIONAL"
tier = "db-custom-1-3840"
ip_configuration = local.read_replica_ip_configuration
database_flags = [{ name = "autovacuum", value = "off" }]
disk_autoresize = null
disk_autoresize_limit = null
disk_size = null
disk_type = "PD_HDD"
user_labels = { bar = "baz" }
encryption_key_name = null
},
]

db_name = var.pg_ha_name
db_charset = "UTF8"
db_collation = "en_US.UTF8"

additional_databases = [
{
name = "${var.pg_ha_name}-additional"
charset = "UTF8"
collation = "en_US.UTF8"
},
]

user_name = "tftest"
user_password = "foobar"

additional_users = [
{
name = "tftest2"
password = "abcdefg"
host = "localhost"
random_password = false
},
{
name = "tftest3"
password = "abcdefg"
host = "localhost"
random_password = false
},
]
}
4 changes: 4 additions & 0 deletions performance-tests/scaling/terraform/variables.tf
@@ -0,0 +1,4 @@
variable "project_id" {
description = "ID of the GCP project"
type = string
}
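Note that posgres.tf references `var.pg_ha_name`, `var.pg_ha_external_ip_range`, and `local.read_replica_ip_configuration`, none of which are declared in this diff, so `terraform plan` would fail as-is. A sketch of the missing declarations (shapes and defaults are assumptions inferred from the module usage above):

```hcl
variable "pg_ha_name" {
  description = "Name of the HA Postgres instance (referenced by posgres.tf)"
  type        = string
}

variable "pg_ha_external_ip_range" {
  description = "CIDR range allowed to reach the instance, e.g. \"192.10.10.10/32\""
  type        = string
}

locals {
  # Assumed shape: reuse the primary's ip_configuration for the read replicas.
  read_replica_ip_configuration = {
    ipv4_enabled       = true
    require_ssl        = true
    private_network    = null
    allocated_ip_range = null
    authorized_networks = [
      {
        name  = "${var.project_id}-cidr"
        value = var.pg_ha_external_ip_range
      },
    ]
  }
}
```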