Update CI vulnerability workflow to reduce how often the docker image is built #196
while informative, we aren't taking any action based on the results
the underlying repo redirects to 'wardencommunity/warden' and is not on rubydoc.info
lorenyu
left a comment
Thanks! This will be a great improvement. I have some design feedback. I think we should explore making this a composite action instead of a job, and clean up the ordering of the steps to be more streamlined (check for the image first, then restore the buildx layers). I also found a bug where the buildx layers aren't caching properly: you need to modify the release-build make command to support OPTS.
OPTIONAL_BUILD_FLAGS=" \
  --cache-from=type=local,src=/tmp/.buildx-cache \
  --cache-to=type=local,dest=/tmp/.buildx-cache"
This doesn't do anything. OPTIONAL_BUILD_FLAGS is not an option in make release-build, which is why the layers aren't caching properly.
Once this is fixed, you should test by triggering multiple builds with different commit hashes (for example, by modifying the example app or adding a command to the Dockerfile) and showing that subsequent builds are much faster since most of the layers are already cached.
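A minimal sketch of the kind of change being asked for, written as a shell helper rather than the real Makefile target (which isn't shown in this thread). The names `build_release_cmd` and `OPTS`, plus the image name and tag, are assumptions for illustration. It prints the command instead of running it, so the flag forwarding can be checked without Docker:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: forward optional build flags (OPTS) to the image build.
# build_release_cmd, OPTS, and the image name/tag are assumptions, not the
# project's actual Makefile variables.

build_release_cmd() {
  local image_name="$1"
  local image_tag="$2"
  local opts="${OPTS:-}"
  # Dry run: print the command so the flag handling can be inspected.
  echo "docker buildx build ${opts:+$opts }--tag ${image_name}:${image_tag} --load ."
}

CACHE_DIR=/tmp/.buildx-cache
OPTS="--cache-from=type=local,src=${CACHE_DIR} --cache-to=type=local,dest=${CACHE_DIR}"
build_release_cmd app-rails abc123
```

If the flags are set under a name the build never reads (as with OPTIONAL_BUILD_FLAGS above), the printed command would contain no cache flags at all, which matches the symptom described.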
## Ticket
n/a
## Changes
Updates a documentation link that no longer resolves. https://www.rubydoc.info/github/hassox/warden gives a 404. The underlying https://github.com/hassox/warden redirects to https://github.com/wardencommunity/warden. Updated the documentation to use the latter's wiki (https://github.com/wardencommunity/warden/wiki)
## Testing
navapbc/platform-test#196 (shows the CI check that flagged the broken link)
Co-authored-by: Loren Yu <loren@navapbc.com>
@lorenyu: your notes here make sense to me. Thanks for taking the time to walk through this.
and adjust the inputs for the various cache actions: 1) i don't think we want to rely on restore-keys, and 2) we re-use the docker image name as the cache key (maybe?)
why: the dockle scan should suffice for testing the build job that's the subject of this branch
… to building and caching a docker image of the app. (running into issues with the buildx caching)
…t to find the docker image in the cache
…he docker image" technically, it is an option to delay checking out the repo, however we'd need to duplicate the logic from the Makefile around the expected image name. i think not worth it. This reverts commit 95a8a18.
needed in order to configure aws credentials
status: making progress, ran into what i think is a fixable bug and will revisit next week.
next steps:
in the step before calling the build-release-candidate action
the build triggered through the vulnerability workflow is still coming up with a different SHA from the build triggered through build-and-publish.
lorenyu
left a comment
I realize you're still working on this, but I saw you left a comment, so I decided to poke around and left some comments. Feel free to ignore them if they are a distraction, but hopefully they help.
echo "Is image published: $is_image_published"
echo "is_image_published=$is_image_published" >> "$GITHUB_OUTPUT"

- name: Build release
what's the reason for making this a separate job?
name: Check whether the image is already published
runs-on: ubuntu-latest
needs: get-commit-hash
concurrency: build-and-publish-${{ inputs.app_name }}-${{ needs.get-commit-hash.outputs.commit_hash }}
This concurrency setting is important to keep. The concurrency statement later on that uses github.ref won't work, since different github.refs can refer to the same commit hash (e.g. main, origin/main, and HEAD can all be valid refs that point to the same commit). That creates a race condition where multiple jobs try to build the same commit hash without realizing it, because they reference the commit via different refs. That's why we have the separate job beforehand that gets the commit hash.
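To make the point about refs concrete, here is a small sketch that can be run locally. It creates a throwaway repository, points a branch and a tag at the same commit as HEAD, and derives a concurrency group name from the resolved hash. The helper `concurrency_group` and the branch and tag names are hypothetical; only `git rev-parse` does the real work:

```shell
#!/usr/bin/env bash
# Throwaway repository so the example does not touch a real checkout.
repo="$(mktemp -d)"
git -C "$repo" init -q
git -C "$repo" -c user.name="example" -c user.email="example@example.com" \
  commit -q --allow-empty -m "initial commit"

# Two more refs that point at the same commit as HEAD.
git -C "$repo" branch feature-branch
git -C "$repo" tag v0.0.1

# Hypothetical helper: build the group name from the commit hash, not the ref.
concurrency_group() {
  local app_name="$1"
  local ref="$2"
  local commit_hash
  commit_hash="$(git -C "$repo" rev-parse "${ref}^{commit}")"
  echo "build-and-publish-${app_name}-${commit_hash}"
}

group_from_head="$(concurrency_group app-rails HEAD)"
group_from_branch="$(concurrency_group app-rails feature-branch)"
group_from_tag="$(concurrency_group app-rails v0.0.1)"

echo "$group_from_head"
echo "$group_from_branch"
echo "$group_from_tag"
```

All three lines print the same group name, so jobs started from any of these refs would share one concurrency group. A group keyed on the ref string itself would produce three different names for the same commit.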
- name: Restore cached Docker image
  uses: actions/cache/restore@v4
  with:
    path: /tmp/docker-image.tar
    key: ${{ steps.build-release-candidate.outputs.image_cache_key }}
    fail-on-cache-miss: true

- name: Load cached Docker image
  run: |
    docker load < /tmp/docker-image.tar
can we put these steps into actions/build-release-candidate?
uses: ./.github/workflows/vulnerability-scans.yml
with:
  app_name: "app-rails"
  ref: ${{ github.ref }}

with:
  path: /tmp/docker-image.tar
  key: ${{ steps.create-image-identifier.outputs.image }}
  lookup-only: true
what's the reason for only doing a lookup? i think we'd want to download the image if it's already built
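A rough local simulation of the behavior being suggested: on a cache hit, fetch the image itself rather than only checking that it exists, and build only on a miss. A temp directory stands in for the GitHub Actions cache, and `fake_build` stands in for building and saving the Docker image. Both, along with the function names and cache key, are assumptions for illustration and not part of the PR:

```shell
#!/usr/bin/env bash
# Local simulation of "restore the image if cached, otherwise build and save it".
# CACHE_ROOT stands in for the Actions cache; fake_build stands in for
# building and saving the image. Neither is the real workflow code.

CACHE_ROOT="$(mktemp -d)"
BUILD_COUNT_FILE="${CACHE_ROOT}/build-count"
echo 0 > "$BUILD_COUNT_FILE"

fake_build() {
  local image_tar="$1"
  echo "image contents" > "$image_tar"
  echo $(( $(cat "$BUILD_COUNT_FILE") + 1 )) > "$BUILD_COUNT_FILE"
}

restore_or_build() {
  local cache_key="$1"
  local image_tar="$2"
  local cached="${CACHE_ROOT}/${cache_key}.tar"
  if [ -f "$cached" ]; then
    # Cache hit: copy the artifact, not just the fact that it exists.
    cp "$cached" "$image_tar"
    echo "cache-hit=true"
  else
    # Cache miss: build, then store the result under the same key.
    fake_build "$image_tar"
    cp "$image_tar" "$cached"
    echo "cache-hit=false"
  fi
}

WORK_DIR="$(mktemp -d)"
restore_or_build "app-rails-abc123" "${WORK_DIR}/first.tar"
restore_or_build "app-rails-abc123" "${WORK_DIR}/second.tar"
echo "builds run: $(cat "$BUILD_COUNT_FILE")"
```

The second call finds the entry and copies it, so only one build runs for that key. With a lookup-only check, the second caller would learn the image exists but would still have no tarball to load.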
- name: Cache Docker image
  if: steps.check-image-already-exists.outputs.cache-hit != 'true'
  uses: actions/cache/save@v4
  with:
    path: /tmp/docker-image.tar
    key: ${{ steps.create-image-identifier.outputs.image }}
this step shouldn't be needed. cache should save automatically when the job completes
  key: ${{ steps.create-image-identifier.outputs.image }}
  lookup-only: true

- name: Build and tag Docker image for scanning
Not necessarily in scope for this PR if it is getting complex, but an older version of this PR had code that also cached the intermediate docker layers in /tmp/.buildx-cache. That could dramatically speed up builds even on a cache miss, since some of the intermediate layers would already be cached.
Yes, you're right. I removed that from my branch because I was overwhelmed by all the other issues I was running into. At this point - I want to get the PR closed! - I think I won’t attempt to get that feature working. Should I propose it as a new issue? or add it as a comment to navapbc/template-infra#206 ?
Sure go ahead and create a new issue and link it here
@lorenyu: I appreciate these notes - thank you. For what you’ve flagged, most of those were knowledge gaps on my part, or me thinking too narrowly around the vulnerability scans workflow. I’ll be following your notes to get this PR cleaned up.
@lisac sounds good! happy to help if you run into any issues
instead of having each of the caller workflows follow up with those steps
why: we're not gaining efficiency by running the lookup-only mode as its own step
…alling workflow see the image?
…e code and workflow
… remove the sleep
…the commit hash see #196 (comment) multiple github.refs may evaluate to the same commit hash
…rability-scans.yml
Closing without merging. While I like this PR's change to the vulnerability workflow stylistically, it doesn't meaningfully improve the developer experience. Consider that the jobs in the vulnerability workflow run in parallel, thus whether each job builds the docker image before executing a particular vulnerability scan (version in …
Let's first implement navapbc/template-infra#936, which is the more impactful aspect from Daphne's implementation. That will benefit the vulnerability workflow, and as a follow-on we could reconsider having the vulnerability workflow have a single job specific to building the image.
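One way to read the parallelism argument, as a back-of-the-envelope calculation. All durations below are made-up placeholders, not measurements from this repo:

```shell
#!/usr/bin/env bash
# Hypothetical durations in seconds; none of these numbers come from the PR.
build_seconds=300
cache_restore_seconds=30

# Jobs run in parallel, so the workflow finishes when the slowest scan does.
slowest_scan=0
for s in 60 120 90; do
  if [ "$s" -gt "$slowest_scan" ]; then slowest_scan="$s"; fi
done

# Each scan job builds its own image, all at the same time.
wall_clock_build_per_job=$(( build_seconds + slowest_scan ))
# One job builds the image, then the scan jobs restore it and run in parallel.
wall_clock_build_once=$(( build_seconds + cache_restore_seconds + slowest_scan ))

echo "build in every job: ${wall_clock_build_per_job}s"
echo "build once, then restore: ${wall_clock_build_once}s"
```

Under these assumed numbers the two layouts finish within seconds of each other, because the time a developer waits is set by one build plus the slowest scan in either case.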
Ticket
n/a. Implements a proposed improvement documented in workflow vulnerability-scans.yml, so that we avoid multiple jobs within the workflow building the app docker image.
Changes
Updates the vulnerability-scans.yml workflow so that a docker image of the app is built once and cached for use by the jobs for each of the vulnerability scans configured in that workflow, rather than having each of those jobs build the app docker image.
This build and cache logic is in a new composite action: actions/build-release-candidate.
This composite action is additionally applied to the build-and-publish.yml workflow. Whereas before this workflow would run make release-build, it now calls the new composite action.
This PR is inspired by an implementation authored by @daphnegold. navapbc/template-infra#936 describes another element of their implementation that might be of interest for template-infra.
Context for reviewers
I am very regretful about not starting with a tech spec for this work!
Lessons were learned.
Testing
[TODO]
Preview environment for app
♻️ Environment destroyed ♻️
Preview environment for app-rails
♻️ Environment destroyed ♻️