* Investigate specialized email chunkers
* Optimize protocol buffer messages to isolate parts of emails
* Investigate compression, notably zstd
* Find a (legally available) public archive of emails
* Ensure consistency of hash generation across a variety of nodes
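
For the hash-consistency item, one possible starting point is to canonicalize the bytes before hashing, so that every node hashes the same input. The sketch below is only an illustration under assumptions that are not taken from this project: it guesses that line-ending differences are one source of divergence, it picks SHA-256 arbitrarily, and the names `canonical_bytes` and `chunk_digest` are hypothetical.

```python
import hashlib


def canonical_bytes(raw: bytes) -> bytes:
    """Normalize line endings to CRLF (assumed canonical form, not the project's)."""
    # Collapse CRLF and bare CR to LF first, then expand every LF to CRLF.
    unified = raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")


def chunk_digest(raw: bytes) -> str:
    """Return the hex SHA-256 digest of the canonicalized bytes.

    SHA-256 is an assumption here; any hash works as long as all nodes agree.
    """
    return hashlib.sha256(canonical_bytes(raw)).hexdigest()


if __name__ == "__main__":
    # The same message stored with LF and with CRLF line endings.
    unix_copy = b"Subject: hi\n\nbody\n"
    wire_copy = b"Subject: hi\r\n\r\nbody\r\n"
    print(chunk_digest(unix_copy) == chunk_digest(wire_copy))
```

Whether normalization belongs before chunking or before hashing each chunk is an open design question; the sketch only shows that hashing a canonical form removes one class of node-to-node differences.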