Description
Recently, Scylla merged scylladb/scylladb#21002.
We should use it for sstable deduplication instead of the currently used generation ID approach, as it has the following benefits:
- it is resilient to sstable migration: the sstable identifier stays the same after migration (not the case for generation IDs)
- it is safer than deduplicating sstables with integer-based generation IDs by their name/size/.crc32
The second argument is self-explanatory.
As for the first one, we would need to create a design doc specifying how deduplication/upload would handle the case where an sstable is already present in the backup location, but with a different ID and under a different node path.
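To make the open question concrete, here is a minimal sketch of what identifier-based deduplication could look like. It is an illustration for discussion only, not the existing or the proposed implementation. The `SSTable` fields and the `plan_upload` helper are hypothetical names, and the sketch assumes that "different ID" above means a different generation ID while the stable sstable identifier stays the same.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    node_id: str      # node whose backup path holds the file (hypothetical field)
    generation: str   # generation ID, may change after migration
    sstable_id: str   # stable sstable identifier (hypothetical field name)


def plan_upload(local, remote):
    """Split local sstables into (to_upload, already_present).

    Matching uses only the stable identifier, so an sstable that was
    backed up under another node path or generation ID is still found.
    """
    remote_by_id = {s.sstable_id: s for s in remote}
    to_upload, already_present = [], []
    for s in local:
        existing = remote_by_id.get(s.sstable_id)
        if existing is None:
            to_upload.append(s)
        else:
            # Same data, possibly under a different node path: the design
            # doc would have to say whether to reference, copy, or re-upload.
            already_present.append((s, existing))
    return to_upload, already_present


# Example: an sstable migrated from node-a to node-b with a new generation ID.
remote = [SSTable("node-a", "3", "id-1")]
local = [SSTable("node-b", "7", "id-1"), SSTable("node-b", "8", "id-2")]
to_upload, already_present = plan_upload(local, remote)
```

In this example only the sstable with identifier `id-2` needs uploading. What to do with the `already_present` pairs, which live under a different node path, is exactly what the design doc would need to decide.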