Description
A common issue is that people's Hadoop zips get large because the embedded sources zip (produced by the buildSourceZip task) includes unnecessary binary resources. One of my teammates was accidentally producing a 1 GB sources zip because he had a binary machine learning model stored in it.
Users can easily fix this by running the writeScmPluginJson task and adding an exclude. However, even though this is documented on go/HadoopPlugin and at https://github.com/linkedin/linkedin-gradle-plugin-for-apache-hadoop/wiki, many users don't know the option exists.
What we could do is add a check at the end of the buildSourceZip task on the final size of the zip. If it is bigger than some arbitrary threshold (perhaps 20 MB), display a logger.lifecycle message telling users they can use the writeScmPluginJson task to trim their sources zip.
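
To make the proposal concrete, here is a minimal sketch of the size check in plain Java. It is not the plugin's actual code: the class name `SourceZipSizeCheck`, the method `checkSize`, and the message wording are all hypothetical, and the 20 MB default is just the value suggested above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper; the name and API are illustrative, not part of the plugin.
final class SourceZipSizeCheck {
    // Threshold proposed in this issue (20 MB); an arbitrary default.
    static final long DEFAULT_LIMIT_BYTES = 20L * 1024 * 1024;

    private SourceZipSizeCheck() {}

    // Returns a warning message if the zip exceeds limitBytes, otherwise empty.
    static Optional<String> checkSize(Path zip, long limitBytes) throws IOException {
        long sizeBytes = Files.size(zip);
        if (sizeBytes <= limitBytes) {
            return Optional.empty();
        }
        double sizeMb = sizeBytes / (1024.0 * 1024.0);
        double limitMb = limitBytes / (1024.0 * 1024.0);
        return Optional.of(String.format(Locale.ROOT,
            "Sources zip %s is %.1f MB, which is larger than %.1f MB. "
                + "You can use the writeScmPluginJson task and add excludes "
                + "to trim your sources zip.",
            zip.getFileName(), sizeMb, limitMb));
    }
}
```

Inside the plugin, something like this could presumably run in a `doLast` action on buildSourceZip, with any returned message passed to `logger.lifecycle`. Returning the message instead of logging it directly keeps the check testable without a Gradle project.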