✨ github/gitlab: discover CloudFormation, Dockerfile, Bicep, Helm, and Kustomize IaC #8314
Open
tas50 wants to merge 1 commit into
…d Kustomize IaC

Extend the GitHub and GitLab providers' infrastructure-as-code discovery beyond Terraform and Kubernetes manifests to also detect CloudFormation templates, Dockerfiles, Bicep files, Helm charts, and Kustomize configurations in repositories.

Discovery (github + gitlab):
- New targets: cloudformation, dockerfiles, bicep, helm, kustomize. All are wired into org()/repo(), the "all" expansion, and the CLI config.
- Both providers classify the filename/extension-based types (k8s, dockerfiles, bicep, helm, kustomize) from a single recursive git-tree walk per repo rather than one search call per type. On GitHub the tree endpoint is on the generous core rate limit, so `--discover all` no longer risks the Code Search limit (30 req/min) on multi-repo orgs; the walk also catches *.Dockerfile names the old filename: prefix search missed.
- CloudFormation is detected via a content heuristic (the AWSTemplateFormatVersion marker) since templates share .yaml/.json with many other files: one Code Search (GitHub) / project blob search (GitLab), one child asset per template. GitLab skips it gracefully on instances without advanced search.
- Dockerfiles: one child asset per Dockerfile / Dockerfile.* / *.Dockerfile.
- Bicep: one child asset per repo (the connection walks the checkout).
- Helm: one asset per chart directory (Chart.yaml).
- Kustomize: one asset per kustomization directory (kustomization.yaml/.yml / Kustomization), so base and overlays each become their own asset.
- Deduped the credential-cloning into a gitCredentials helper.

Git-clone support (downstream providers):
- None of bicep, cloudformation, helm, kustomize, or the os dockerfile connection could clone before. Each now clones on http-url, resolves a repo-relative path/dir within the checkout where applicable, and cleans up its temp dir via Close(). A deferred cleanup guard is armed right after the clone and disarmed once the connection owns the closer, so every error path removes the checkout.

Repo-based naming & stable identity:
- k8s, bicep, cloudformation, helm, and kustomize now name discovered assets from the git repo (org/repo[/path]) like Terraform/Dockerfile, instead of the temporary clone directory.
- Platform IDs and asset.Id are derived from the repo (and template/dir path) rather than a hash of the temp clone path, which previously changed on every scan. The bicep connection no longer overwrites cc.Options["path"] with the non-deterministic clone directory.
- Asset names dropped the verbose "Static Analysis" verbiage (e.g. "CloudFormation template tas50/iac_tests/...", "Helm Chart tas50/...").

Note: Chart.yaml and kustomization.yaml match their own discovery cases ahead of the k8s YAML branch, so a repo whose only YAML files are Helm/Kustomize entry points no longer also registers as a k8s manifest asset. This is intentional (those files aren't k8s manifests) and now behaves identically on both providers.

Verified end-to-end against tas50/iac_tests: discovery, naming, and content queries resolve for all IaC types.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
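The filename-based classification and the CloudFormation marker check described in the commit message can be sketched roughly as follows. This is an illustration only, not the PR's code: `classify`, `isHidden`, and `looksLikeCloudFormation` are hypothetical names, the `*mql.yaml` exclusion and hidden-path rule are taken from the PR description, and the real CloudFormation detection goes through the providers' search APIs rather than a local string check.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isHidden reports whether any path segment starts with a dot (e.g. ".github/").
// Hypothetical helper mirroring the "skip hidden paths" rule.
func isHidden(p string) bool {
	for _, seg := range strings.Split(p, "/") {
		if strings.HasPrefix(seg, ".") {
			return true
		}
	}
	return false
}

// classify maps a repo-relative path to a discovery target, or "" for no match.
// Order matters: Helm and Kustomize entry points are checked before the
// generic YAML case, so they do not also count as k8s manifests.
func classify(p string) string {
	base := path.Base(p)
	lower := strings.ToLower(base)
	switch {
	case base == "Chart.yaml":
		return "helm"
	case base == "kustomization.yaml", base == "kustomization.yml", base == "Kustomization":
		return "kustomize"
	case base == "Dockerfile", strings.HasPrefix(base, "Dockerfile."), strings.HasSuffix(lower, ".dockerfile"):
		return "dockerfiles"
	case strings.HasSuffix(lower, ".bicep"):
		return "bicep"
	case strings.HasSuffix(lower, "mql.yaml"), strings.HasSuffix(lower, "mql.yml"):
		return "" // excluded from k8s manifest discovery
	case strings.HasSuffix(lower, ".yaml"), strings.HasSuffix(lower, ".yml"):
		return "k8s-manifests"
	}
	return ""
}

// looksLikeCloudFormation applies the content heuristic locally: a template
// is assumed to carry the AWSTemplateFormatVersion marker.
func looksLikeCloudFormation(content string) bool {
	return strings.Contains(content, "AWSTemplateFormatVersion")
}

func main() {
	// One pass over a (made-up) recursive tree listing classifies every type.
	tree := []string{
		".github/workflows/ci.yml",
		"helm/web/Chart.yaml",
		"kustomize/base/kustomization.yaml",
		"docker/Dockerfile.secure",
		"infra/main.bicep",
		"k8s/deploy.yaml",
	}
	for _, p := range tree {
		if isHidden(p) {
			continue
		}
		if target := classify(p); target != "" {
			fmt.Printf("%-14s %s\n", target, p)
		}
	}
	fmt.Println(looksLikeCloudFormation("AWSTemplateFormatVersion: '2010-09-09'"))
}
```

Because a single walk feeds every filename-based target, the number of API calls in this sketch does not grow with the number of targets requested.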
Test Results: 9 864 tests, 9 858 ✅, 3m 6s ⏱️. Results for commit c548377.
What
Extends the github provider's infrastructure-as-code discovery beyond Terraform and Kubernetes manifests to also detect CloudFormation templates, Dockerfiles, Bicep files, Helm charts, and Kustomize configurations in repositories. Each match becomes a child asset that is cloned from git and scanned by the relevant provider.
Files discovered
Discovery uses the GitHub code-search API (paginated, 100/page) and skips hidden paths (anything with a `.`-prefixed segment, e.g. `.github/`).

| Discovery target | Files matched |
| --- | --- |
| `terraform` | `HCL` language |
| `k8s-manifests` | `*.yaml` / `*.yml` (excl. `*mql.yaml` / `*mql.yml`) |
| `cloudformation` | `.yaml` / `.yml` / `.json` / `.template` containing the `AWSTemplateFormatVersion` marker (avoids false positives against k8s/other YAML) |
| `dockerfiles` | `Dockerfile`, `Dockerfile.*` (e.g. `Dockerfile.prod`), `*.Dockerfile`, `*.dockerfile` |
| `bicep` | `*.bicep` |
| `helm` | `Chart.yaml` |
| `kustomize` | `kustomization.yaml` / `kustomization.yml` / `Kustomization` |

Like the existing Terraform/k8s detectors, the new ones run only on explicit `--discover all` or `--discover <type>` (not `--discover auto`), so there's no surprise cloning.

Git-clone support added to downstream providers
None of bicep, cloudformation, helm, kustomize, or the os dockerfile connection could clone from a git URL before; the dockerfile connection only read `ssh-url` for naming. Each now:
- clones on `http-url` (shallow, via the shared `plugin.NewGitClone`),
- resolves a repo-relative path/dir within the checkout where applicable,
- cleans up its temp dir via `Close()`, including on every post-clone error path (deferred cleanup, disarmed once the connection takes ownership).

Trade-off: CloudFormation, Dockerfile, Helm, and Kustomize emit one asset per matched file/directory, so a repo with N of them performs N shallow clones. This keeps each connection's existing single-target model and avoids a shared clone cache; the trade-off is documented in the discoverers.
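The armed/disarmed cleanup pattern mentioned above can be sketched as below. This is a minimal illustration under stated assumptions, not the PR's implementation: `conn`, `cloneRepo`, `newConn`, and `dirExists` are hypothetical names, and `cloneRepo` only creates a temp directory in place of a real shallow clone.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errSetupFailed simulates a post-clone failure (hypothetical).
var errSetupFailed = errors.New("setup failed")

// conn is a stand-in for a provider connection that owns a checkout.
type conn struct {
	dir    string
	closer func()
}

// Close removes the checkout once the connection is done with it.
func (c *conn) Close() {
	if c.closer != nil {
		c.closer()
	}
}

// cloneRepo stands in for the shallow git clone: it only creates a temp
// directory and returns a function that deletes it.
func cloneRepo() (string, func(), error) {
	dir, err := os.MkdirTemp("", "clone-sketch-")
	if err != nil {
		return "", nil, err
	}
	return dir, func() { _ = os.RemoveAll(dir) }, nil
}

// newConn clones, then runs setup against the checkout. The deferred guard is
// armed right after the clone and disarmed only once the connection owns the
// closer, so every error return in between removes the checkout.
func newConn(setup func(dir string) error) (*conn, error) {
	dir, cleanup, err := cloneRepo()
	if err != nil {
		return nil, err
	}
	armed := true
	defer func() {
		if armed {
			cleanup()
		}
	}()

	if err := setup(dir); err != nil {
		return nil, err // guard still armed: checkout is removed
	}

	c := &conn{dir: dir, closer: cleanup}
	armed = false // ownership transferred to the connection
	return c, nil
}

// dirExists reports whether the path is still present on disk.
func dirExists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

func main() {
	var failedDir string
	_, err := newConn(func(dir string) error {
		failedDir = dir
		return errSetupFailed
	})
	fmt.Println("error:", err, "checkout left behind:", dirExists(failedDir))

	c, err := newConn(func(string) error { return nil })
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("checkout present before Close:", dirExists(c.dir))
	c.Close()
	fmt.Println("checkout present after Close:", dirExists(c.dir))
}
```

The single boolean keeps the cleanup in one place, so adding another early return after the clone cannot leak a temp directory.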
Note on overlap: Helm `templates/*.yaml` and `kustomization.yaml` also match the existing k8s `*.yaml` detector, so a chart/kustomize repo can surface under both `--discover k8s-manifests` and `--discover helm`/`kustomize`. That's intentional (each is a distinct analysis lens), but worth being aware of when running `--discover all`.

Repo-based naming & stable identity
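k8s, bicep, cloudformation, helm, and kustomize now name discovered assets from the git repo (`org/repo[/path]`) like Terraform/Dockerfile, instead of the temporary clone directory. Fixes `K8s Manifest directory mql-git-clone3841…` → `K8s Manifest tas50/iac_tests`.

Verification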
Verified end-to-end against `tas50/iac_tests`. Discovery, naming, and content queries all resolve:

| Asset | Platform | Notes |
| --- | --- | --- |
| K8s Manifest tas50/iac_tests | `k8s-manifest` | |
| CloudFormation template …/cloudformation/s3_bucket.yaml | `cloudformation` | |
| Dockerfile …/docker/Dockerfile.secure | `dockerfile` | |
| Dockerfile …/docker/Dockerfile.insecure | `dockerfile` | |
| Bicep file tas50/iac_tests | `bicep` | `.bicep` files, 1 resource each |
| Helm Chart …/helm/web | `helm` | web 0.1.0 parsed |
| Kustomize file …/kustomize/base | `kustomize` | |
| Kustomize file …/kustomize/overlays/prod | `kustomize` | |

Files changed
- `providers/github/{config,connection,resources}`: five discovery targets + detectors + paginated `searchCode` helper + `gitCredentials` dedupe
- `providers/bicep/{provider,connection}`, `providers/cloudformation/{provider,connection}`, `providers/helm/{provider,connection}`, `providers/kustomize/{provider,connection}`: git clone, `Close()`, repo-based name/platform ID
- `providers/os/connection/docker/docker_file_connection.go`: git clone, `Close()`, repo-relative platform ID
- `providers/k8s/connection/manifest/connection.go`: repo-based asset name
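
To illustrate why the repo-based identity is stable where a hash of the clone path is not, here is a small sketch. The ID format, the `example.invalid` prefix, and the function names `unstableID` and `repoID` are all assumptions made for illustration; the providers' actual platform ID scheme is not shown in this PR text.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
)

// unstableID hashes the temporary clone directory. Since that directory gets
// a fresh random name on each clone, the ID differs between scans.
func unstableID(cloneDir string) string {
	sum := sha256.Sum256([]byte(cloneDir))
	return hex.EncodeToString(sum[:8])
}

// repoID derives the ID from org/repo[/sub], which stays the same across
// scans. The "//example.invalid/" prefix is a placeholder, not a real scheme.
func repoID(kind, org, repo, sub string) string {
	return "//example.invalid/" + kind + "/" + path.Join(org, repo, sub)
}

func main() {
	// Two scans of the same repo land in different temp directories.
	fmt.Println(unstableID("/tmp/clone-1111"), unstableID("/tmp/clone-2222"))

	// The repo-derived ID only depends on the repo and the in-repo path.
	fmt.Println(repoID("helm", "tas50", "iac_tests", "helm/web"))
	fmt.Println(repoID("bicep", "tas50", "iac_tests", ""))
}
```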