Display warning when the buildSourceZip zip is too large #94

@convexquad

A common issue is that users' Hadoop zips grow large because the embedded sources zip (which is produced by the buildSourceZip task) includes unnecessary binary resources. For example, one of my teammates was accidentally producing a 1 GB sources zip because a binary machine learning model ended up in it.

Users can easily fix this problem by using the writeScmPluginJson task and adding an exclude. However, even though this is documented on go/HadoopPlugin and at https://github.com/linkedin/linkedin-gradle-plugin-for-apache-hadoop/wiki, many users don't realize the option exists.

What we could do is add a check at the end of the buildSourceZip task on the final size of the zip. If it is bigger than some arbitrary threshold (perhaps 20 MB is a good size), display a logger.lifecycle message telling users they can use the writeScmPluginJson task to trim their sources zip.
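To make the proposal concrete, here is a minimal sketch of the size check in plain Java. It is not the plugin's actual code: the class name `SourceZipSizeCheck`, the method `warningFor`, and the exact wording of the message are all hypothetical, and the 20 MB default is just the value floated above.

```java
import java.io.File;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper; the names here are illustrative, not part of the plugin.
class SourceZipSizeCheck {
    // Assumed default: the 20 MB threshold suggested in this issue.
    static final long DEFAULT_THRESHOLD_BYTES = 20L * 1024 * 1024;

    // Returns a warning message if the zip is strictly larger than the
    // threshold, or an empty Optional if no warning is needed.
    static Optional<String> warningFor(String zipName, long sizeBytes, long thresholdBytes) {
        if (sizeBytes <= thresholdBytes) {
            return Optional.empty();
        }
        double sizeMb = sizeBytes / (1024.0 * 1024.0);
        double thresholdMb = thresholdBytes / (1024.0 * 1024.0);
        return Optional.of(String.format(Locale.ROOT,
            "The sources zip %s is %.1f MB, which is larger than %.1f MB. "
                + "You can use the writeScmPluginJson task to add an exclude "
                + "and trim your sources zip.",
            zipName, sizeMb, thresholdMb));
    }

    // Convenience overload that reads the size from the file on disk.
    static Optional<String> warningFor(File zip) {
        return warningFor(zip.getName(), zip.length(), DEFAULT_THRESHOLD_BYTES);
    }
}
```

In the plugin, something along these lines could run as the last action of the buildSourceZip task, with the task's output zip passed to `warningFor` and any returned message handed to `logger.lifecycle`. Keeping the comparison in a pure method makes the threshold behavior easy to unit test without building a real zip.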
