End-to-end SCT tests that provision a Cassandra source, a Scylla target, and an Amazon EMR cluster, then drive the scylla-migrator job between them. Two variants exist:
- Small (`test_migration_cs_to_scylla_small`, ~30 min): 100K rows of a simple schema; can be used as PR-gating regression coverage.
- Scale (`test_migration_cs_to_scylla_scale`, hours): 500GB latte preload (4 × 125 M rows); can be used for weekly throughput + validator coverage.

The test cases are under test-cases/spark-migrator/:

    # small (PR-gating)
    hydra run-test spark_migrator_test.SparkMigratorTest.test_migration_cs_to_scylla_small \
      --backend aws --config test-cases/spark-migrator/cs-to-scylla-small.yaml
    # scale (nightly)
    hydra run-test spark_migrator_test.SparkMigratorTest.test_migration_cs_to_scylla_scale \
      --backend aws --config test-cases/spark-migrator/cs-to-scylla-500gb.yaml

Both yamls set db_type: mixed_cassandra so SCT provisions BOTH a Scylla target
cluster (self.db_cluster) AND a Cassandra source cluster (self.cs_db_cluster)
in a single run, plus an EMR cluster (self.emr_cluster) sized for the migration job.
The 500GB preload takes a long time. To skip preload and rerun the migration + validation
against an already-provisioned cluster, set SCT_REUSE_CLUSTER to the test_id of the
previous run:

    SCT_REUSE_CLUSTER=<previous-test-id> hydra run-test ...

When SCT_REUSE_CLUSTER is set, test_migration_cs_to_scylla_scale skips the
prepare_write_cmd chunks entirely. The schema apply loop is idempotent
(CREATE TABLE|TYPE|INDEX statements are rewritten to ... IF NOT EXISTS on the fly)
so re-running against pre-existing target tables is safe.
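
The `IF NOT EXISTS` rewrite described above can be sketched roughly as follows. This is a minimal illustration, not the code used by the test: the helper name `make_idempotent` and the regular expression are assumptions, and variants such as `CREATE CUSTOM INDEX` are not handled.

```python
import re

# Matches a leading CREATE TABLE / TYPE / INDEX that is not already followed
# by IF NOT EXISTS. Hypothetical pattern, written for illustration only.
_CREATE_RE = re.compile(
    r"^(\s*CREATE\s+(?:TABLE|TYPE|INDEX))(?!\s*IF\s+NOT\s+EXISTS\b)\s+",
    re.IGNORECASE,
)


def make_idempotent(statement: str) -> str:
    """Insert IF NOT EXISTS after CREATE TABLE|TYPE|INDEX when it is missing."""
    return _CREATE_RE.sub(r"\1 IF NOT EXISTS ", statement, count=1)


if __name__ == "__main__":
    for stmt in (
        "CREATE TABLE ks.t (id int PRIMARY KEY)",
        "CREATE TABLE IF NOT EXISTS ks.t (id int PRIMARY KEY)",
        "CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy'}",
    ):
        print(make_idempotent(stmt))
```

Statements that already carry `IF NOT EXISTS`, and statements of other kinds, pass through unchanged, so applying the rewrite twice gives the same result as applying it once.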

- Small test: post-migration sanity = `_count_rows_parallel` (token-range sharded `count(*)`) + 10-row sample compare against source. The validator EMR step is off by default (`migrator_run_validator: false` in the test config).
- Scale test: `count(*)` + sampling does not apply: at 500 M rows, sharded `count(*)` hits Scylla's server-side `read_request_timeout` regardless of client-side parallelism. Correctness is delegated to the validator EMR step (`migrator_run_validator: true` by default in the 500GB test config), which scans both source and target end-to-end and reports per-row mismatches. Tune `validator_step_timeout_minutes` (default 240) when source/target sizes change.

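
For readers unfamiliar with token-range sharded counting, the idea can be sketched as below. The real `_count_rows_parallel` implementation is not shown in this document, so everything here is an assumption: the function names are hypothetical, the Murmur3 partitioner (token range -2^63 to 2^63 - 1) is assumed, and the sketch only builds the per-shard queries rather than executing them.

```python
# Murmur3Partitioner token bounds (assumed partitioner).
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_ring(shards: int) -> list[tuple[int, int]]:
    """Split the token ring into `shards` contiguous, inclusive (start, end) ranges."""
    if shards < 1:
        raise ValueError("shards must be >= 1")
    total = MAX_TOKEN - MIN_TOKEN + 1  # 2**64 tokens in the ring
    bounds = [MIN_TOKEN + (total * i) // shards for i in range(shards + 1)]
    # bounds[-1] equals MAX_TOKEN + 1, so each range ends one token before the next starts.
    return [(bounds[i], bounds[i + 1] - 1) for i in range(shards)]


def build_count_queries(keyspace: str, table: str, partition_key: str, shards: int) -> list[str]:
    """Build one count(*) query per token sub-range; results would be summed by the caller."""
    template = (
        "SELECT count(*) FROM {ks}.{tbl} "
        "WHERE token({pk}) >= {lo} AND token({pk}) <= {hi}"
    )
    return [
        template.format(ks=keyspace, tbl=table, pk=partition_key, lo=lo, hi=hi)
        for lo, hi in split_token_ring(shards)
    ]


if __name__ == "__main__":
    for query in build_count_queries("ks", "tbl", "id", 4):
        print(query)
```

Each query scans only a slice of the ring, which keeps individual requests small on the 100K-row table. As noted above, this stops being enough at 500 M rows, which is why the scale test relies on the validator instead.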
When migrator_run_validator: true (default in the 500GB yaml), a second EMR step runs
scylla-migrator's com.scylladb.migrator.Validator main class against the same
source/target tables after the migration completes.
The EMR step state is the verdict. Per upstream Validator.scala main():
- No comparison failures found -> `log.info("No comparison failures found - enjoy your day!")` is emitted and the step state is COMPLETED.
- One or more comparison failures found -> `log.error("Found N comparison failure(s) in sample: <breakdown>")`, plus a dump of the `RowComparisonFailure` entries, is emitted and the step state is FAILED.

| EMR step state | SCT verdict |
|---|---|
| COMPLETED | pass - test continues |
| FAILED | fail - validator step stdout (including the upstream Found N summary line and per-row RowComparisonFailure records) is downloaded from S3 and dumped to sct.log for investigation, then AssertionError is raised |
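
The mapping in the table can be sketched as a small helper. This is an illustration under stated assumptions, not the SCT implementation: the function names are hypothetical, the step is assumed to have already reached a terminal state, and the summary-line pattern is taken from the upstream log message quoted above.

```python
import re
from typing import Optional

# Pattern for the upstream summary line "Found N comparison failure(s) in sample: ...".
_SUMMARY_RE = re.compile(r"Found (\d+) comparison failure\(s\)")


def failure_count(stdout: str) -> Optional[int]:
    """Return N from the validator summary line, or None when the line is absent."""
    match = _SUMMARY_RE.search(stdout)
    return int(match.group(1)) if match else None


def assert_validator_passed(step_state: str, stdout: str = "") -> None:
    """Raise AssertionError unless the validator EMR step finished as COMPLETED."""
    if step_state == "COMPLETED":
        return
    count = failure_count(stdout)
    detail = (
        f"{count} comparison failure(s) reported"
        if count is not None
        else "no summary line found in step stdout"
    )
    raise AssertionError(f"validator EMR step ended in state {step_state}: {detail}")
```

Any terminal state other than COMPLETED is treated as a failure in this sketch; whether the real test distinguishes FAILED from states such as CANCELLED is not stated here.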
emr_spark_migrator_release should be pinned to a known-compatible tag (the 500GB yaml currently
pins v2.0.1); bump deliberately and re-verify the upstream output contract when moving to a
new migrator release.
- `post_behavior_db_nodes` controls the tear-down policy for BOTH the Scylla target AND the Cassandra source; `mixed_cassandra` does NOT have a separate `post_behavior_cs_db_nodes` parameter (sdcm/tester.py:3835). Setting `keep-on-failure` therefore retains a 500GB Cassandra source on failure; destroy it manually if the cluster is no longer needed.
- `post_behavior_emr_cluster: destroy` is set by default so EMR clusters do not leak; expected hourly cost for the scale yaml is dominated by the EMR core nodes (2 × `m5.2xlarge`, ~$0.20/h on-demand at time of writing).
- Region coherence is asserted at the start of each test: the EMR cluster must be provisioned in the same region as the SCT clusters, otherwise cross-region data transfer turns the migration into a slow + expensive run. Fix the region in the yaml if the assertion fires.

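
A region-coherence check of the kind described in the last bullet might look like the sketch below. The function name, its arguments, and the error text are all hypothetical; how the test actually obtains the region of each cluster is not covered here.

```python
def assert_region_coherence(emr_region: str, cluster_regions: dict[str, str]) -> None:
    """Fail fast when any SCT cluster lives in a different region than the EMR cluster.

    `cluster_regions` maps a cluster label (for example "db_cluster") to its region name.
    """
    mismatched = {
        name: region for name, region in cluster_regions.items() if region != emr_region
    }
    if mismatched:
        raise AssertionError(
            f"EMR cluster is in {emr_region} but these clusters are elsewhere: {mismatched}"
        )


if __name__ == "__main__":
    assert_region_coherence(
        "eu-west-1",
        {"db_cluster": "eu-west-1", "cs_db_cluster": "eu-west-1"},
    )
    print("regions are coherent")
```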
scylla-migrator v2.0.x does not auto-create CQL tables on the target. Both tests therefore pre-create the target schema on Scylla before submitting the migrator step:
- The small test uses a hardcoded simple schema via `_prepare_target_schema`.
- The scale test calls `BaseCassandraCluster.dump_schema` on the Cassandra source and applies the resulting CQL on the Scylla target. The dump strips Cassandra-only table options (`compression.crc_check_chance`, `speculative_retry`, etc.) so that every statement the apply loop sends is accepted by Scylla's CQL grammar.

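
To make the option stripping concrete, here is a rough regex-based sketch. It is not the logic inside `BaseCassandraCluster.dump_schema`: the function name and both option lists are assumptions, the lists cover only the two options named above, and an option that appears directly after `WITH` (rather than after `AND`) is not handled.

```python
import re

# Illustrative lists only; the real dump may strip more options.
DROPPED_OPTIONS = ("speculative_retry",)
DROPPED_COMPRESSION_KEYS = ("crc_check_chance",)


def strip_cassandra_only_options(cql: str) -> str:
    """Remove selected table options and compression-map keys from a CREATE TABLE statement."""
    for option in DROPPED_OPTIONS:
        # Drop "AND <option> = '<value>'" clauses.
        pattern = rf"\s+AND\s+{re.escape(option)}\s*=\s*'[^']*'"
        cql = re.sub(pattern, "", cql, flags=re.IGNORECASE)
    for key in DROPPED_COMPRESSION_KEYS:
        quoted = re.escape(key)
        # Drop "'<key>': '<value>'" map entries, whether the comma precedes or follows them.
        cql = re.sub(rf",\s*'{quoted}'\s*:\s*'[^']*'", "", cql)
        cql = re.sub(rf"'{quoted}'\s*:\s*'[^']*'\s*,\s*", "", cql)
    return cql


if __name__ == "__main__":
    dumped = (
        "CREATE TABLE ks.t (id int PRIMARY KEY) WITH bloom_filter_fp_chance = 0.01 "
        "AND compression = {'class': 'LZ4Compressor', 'crc_check_chance': '1.0'} "
        "AND speculative_retry = '99p';"
    )
    print(strip_cassandra_only_options(dumped))
```

A regex pass like this is fragile against unusual formatting; a production version would more likely work on parsed table metadata than on raw CQL text.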
test_migration_from_external_source migrates from a pre-existing Cassandra cluster discovered
by EC2 tags (migrator_source_test_id). When the source cluster lives in a different VPC than
the EMR cluster, security groups must allow inbound CQL (9042) from the EMR worker subnet -
otherwise the migrator step hangs on connection setup. test_migration_cs_to_scylla_* tests
provision both clusters in the same VPC so this concern only applies to the external-source variant.
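
Before running the external-source variant, a quick reachability probe from a host in the EMR worker subnet can rule out the security-group problem described above. The sketch below is a generic TCP check with a hypothetical function name; it only confirms that port 9042 accepts connections, not that CQL authentication or the migrator itself will succeed.

```python
import socket


def cql_port_reachable(host: str, port: int = 9042, timeout: float = 5.0) -> bool:
    """Return True when a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

A `False` result from the EMR side, combined with `True` from inside the source VPC, points at security groups or routing rather than at Cassandra.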