-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial commit for gcsync command line tool #796
base: main
Are you sure you want to change the base?
Conversation
|
||
public static final String INSTRUCTION_FILE_SUFFIX = "instruction"; | ||
|
||
public static final String TMP_FILE_SUFFIX = "updated"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd consider adding an enum instead of the *_FILE_SUFFIX
constants.
|
||
public static final String FILES_TO_RSYNC_FILE_NAME = "filesToRsync.txt"; | ||
|
||
public static final String JAR_FILE_NAME = "gcsync-all.jar"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this and other constants which are only used in one file, I'd move it to that file.
private GcsyncClient clientUnderTest; | ||
|
||
// For test simplicity, define a smaller threshold for "rsync" logic | ||
private static final long RSYNC_SIZE_THRESHOLD = 1024; // e.g. 1KB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the size threshold could be changed to be a parameter of GsyncClient
instead of a constant?
This way we can initialize the test version with a different value rather than using different versions of the constant.
copyStart = -1; | ||
copyLength = 0; | ||
} | ||
this.sourceStream = in.openStream(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it work to move this openStream
call outside the constructor?
} | ||
|
||
private void flushCopy() throws IOException { | ||
if (copyStart == -1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the value -1
mean for copyStart
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means copyStart is unset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document this, e.g.
...
// the start index for copy() if >= 0, or -1 if it is unset
@CheckForSigned
private int copyStart = -1
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Publishing comments.
@@ -76,7 +76,7 @@ public void generate(Consumer<? super Instruction> out, ByteSource in, | |||
if (cc != null) { | |||
HashCode strongHashCode = rollingChecksum.getStrongHashCode(); | |||
for (Checksum c : cc) { | |||
if (c.getStrongChecksum() == strongHashCode.asLong()) { | |||
if (c.getStrongChecksum().equals(strongHashCode.toString())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compute toString() once outside the loop.
Or compare the asBytes() values using Arrays.equals()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
copyLength = 0; | ||
} | ||
|
||
private void copy(long copyLength) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This, I think, is ByteStreams.copy()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, instead of setting limit here, I find it easier to just implement it which is pretty straightforward.
} | ||
} | ||
} | ||
|
||
private void skip(long length) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assumption here: The fastest way to skip is to read/skip bytes or call skip().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably yes in the case of GCS. Guava's implementation of SlicedByteSource use skip() too.
dbsync/common/src/main/java/com/google/edwmigration/dbsync/common/InstructionReceiver.java
Outdated
Show resolved
Hide resolved
@@ -13,7 +13,7 @@ message Checksum { | |||
uint64 blockOffset = 1; | |||
uint32 blockLength = 2; | |||
int32 weakChecksum = 3; | |||
int64 strongChecksum = 4; | |||
string strongChecksum = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be safer as bytes - although bytes are really expensive in protobuf, so it might as well be string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can leave it as string.
writer.write(file.getFileName().toString()); | ||
writer.newLine(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Files.writeLines()
CharSink.writeLines is probably easier.
Justification for attempting to avoid a temp file is every so often some customer has java.io.tmpdir misconfigured and this breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
for (Path file : filesToRsync) { | ||
Path tmpCheckSumFile = getTemporaryCheckSumFileName(file); | ||
try (OutputStream instructionSink = gcsStorage.newByteSink( | ||
new URI(tmpBucket).resolve(Util.getInstructionFileName(file.getFileName().toString()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is getTemporaryChecksumFileName here, but getInstructionFileName in util?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
instruction.writeDelimitedTo(instructionSink); | ||
} catch (IOException e) { | ||
throw new RuntimeException(e); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be tempted to modify generate() to permit the throwing of IOException, and use a custom class instead of Consumer.
interface InstructionConsumer<X extends Exception> { public void accept(Instruction i) throws X; }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
checksum.writeDelimitedTo(bufferedOutputStream); | ||
} catch (IOException e) { | ||
throw new RuntimeException("Failed to generate checksum", e); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again the typed-exception-in-interface trick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Implementation of gcsync command
Fix of an issue in the core rsync algorithm
Replace the use of slice()
#. Use full 256 bits checksum
Unit test for gcsyncClient