3 files changed: +12 -2 lines changed
migrator/src/main/scala/com/scylladb/migrator
@@ -3,6 +3,16 @@ package com.scylladb.migrator.alternator
 import org.apache.spark.util.AccumulatorV2
 import java.util.concurrent.atomic.AtomicReference

+/**
+ * Accumulator for tracking processed Parquet file paths during migration.
+ *
+ * This accumulator collects the set of Parquet file paths that have been processed
+ * as part of a migration job. It is useful for monitoring progress, avoiding duplicate
+ * processing, and debugging migration workflows. The accumulator is thread-safe and
+ * can be used in distributed Spark jobs.
+ *
+ * @param initialValue The initial set of processed file paths (usually empty).
+ */
 class StringSetAccumulator(initialValue: Set[String] = Set.empty)
     extends AccumulatorV2[String, Set[String]] {
@@ -13,7 +13,7 @@ import scala.collection.concurrent.TrieMap
  * partitions and files. When all partitions belonging to a file have been successfully
  * completed, it marks the file as processed via the ParquetSavepointsManager.
  *
- * @param partitionToFile Mapping from Spark partition ID to source file paths
+ * @param partitionToFiles Mapping from Spark partition ID to source file paths
  * @param fileToPartitions Mapping from file path to the set of partition IDs reading from it
  * @param savepointsManager Manager to notify when files are completed
  */
@@ -12,7 +12,7 @@ case class PartitionMetadata(
 /**
  * This reader uses Spark's internal partition information to build mappings
  * between partition IDs and file paths. This allows us to track when all
- * partitions of a file have been processed, enabling file-level savepointse.
+ * partitions of a file have been processed, enabling file-level savepoints.
  */
 object PartitionMetadataReader {
   private val logger = LogManager.getLogger("com.scylladb.migrator.readers.PartitionMetadataReader")