Add Go instrumentation watcher for the ecosystem registry #665
mikeblum wants to merge 19 commits into
…ging, and build tooling
… into Library records
…gle watcher entrypoint
Walk the go-contrib subtrees that instrument a developer's code: the instrumentation wrappers (gin, grpc, http…) and bridges (zap, logrus…). The other components (exporters, propagators, samplers, detectors, processors) configure the SDK pipeline rather than instrument a target library, so they are out of scope for the instrumentation inventory.
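The scoping rule in this commit message can be sketched as a simple path-prefix check. This is a minimal illustration, not the watcher's actual code: the `inScope` helper and the `instrumentation/` and `bridges/` directory names are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// inScopeRoots lists the subtrees treated as instrumenting a developer's code.
// The directory names are assumptions for illustration only.
var inScopeRoots = []string{"instrumentation/", "bridges/"}

// inScope reports whether a repo-relative, slash-separated module path falls
// under one of the in-scope subtrees.
func inScope(relPath string) bool {
	for _, root := range inScopeRoots {
		if strings.HasPrefix(relPath, root) {
			return true
		}
	}
	return false
}

func main() {
	paths := []string{
		"instrumentation/net/http/otelhttp",
		"bridges/otelzap",
		"exporters/autoexport",
		"propagators/b3",
	}
	for _, p := range paths {
		fmt.Printf("%-36s in scope: %v\n", p, inScope(p))
	}
}
```

Keeping the allow-list explicit means a new top-level directory in the upstream repo stays out of the inventory until someone adds it deliberately.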
…ntory and registry
❌ Deploy Preview for otel-ecosystem-explorer failed.
…ate inventory YAML
Spans with no attributes (e.g. kind: internal in packages with no matching semconv) were emitted alongside spans that had attributes, producing inconsistent inventory entries. Filter them at extraction time. Adds TestExtractSpansFiltersEmptyAttributes (analyzer) and TestInventoryYAMLValidation (inventory) to catch regressions.
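A minimal sketch of the filtering step described above, assuming a simplified record shape. The `Span` type and `filterEmptySpans` function are hypothetical stand-ins and do not reflect the analyzer's real types.

```go
package main

import "fmt"

// Span is a hypothetical stand-in for the watcher's extracted span record.
type Span struct {
	Kind       string
	Attributes []string
}

// filterEmptySpans returns only the spans that carry at least one attribute,
// preserving input order.
func filterEmptySpans(spans []Span) []Span {
	kept := make([]Span, 0, len(spans))
	for _, s := range spans {
		if len(s.Attributes) == 0 {
			continue // nothing to record for this span
		}
		kept = append(kept, s)
	}
	return kept
}

func main() {
	spans := []Span{
		{Kind: "internal"},
		{Kind: "server", Attributes: []string{"http.request.method"}},
	}
	for _, s := range filterEmptySpans(spans) {
		fmt.Println(s.Kind, s.Attributes)
	}
}
```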
📦 Rendered output preview
The inventory files themselves are not in this PR's diff; they're produced by the nightly CI automation ( Generated from a local
Libraries (19)
Scope is components that instrument a developer's code: instrumentation wrappers + log bridges. Pipeline-config components (exporters, propagators, samplers, detectors, processors) are intentionally excluded; see PR description.
Example record —
…eneration
Adds a repeatable pipeline for posting watcher output previews to PRs:
- cmd/snapshot/ reads the generated inventory, formats a sorted library table (bridges then wrappers, each alphabetical by display_name) and selects the richest telemetry example + first bridge as featured records
- scripts/upsert-pr-comment.sh finds and patches the sentinel comment (<!-- otel-go-snapshot -->) or creates a new one if absent
- make snapshot: go run ./cmd/snapshot/ | ./scripts/upsert-pr-comment.sh
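The table ordering described above (bridges first, then wrappers, each alphabetical by display_name) could look roughly like the following. The `Library` struct, its `Kind` values, and `sortLibraries` are assumptions for illustration; the actual cmd/snapshot/ code may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// Library is a hypothetical, trimmed-down inventory record.
type Library struct {
	DisplayName string
	Kind        string // assumed values: "bridge" or "wrapper"
}

// kindRank orders bridges ahead of every other kind.
func kindRank(kind string) int {
	if kind == "bridge" {
		return 0
	}
	return 1
}

// sortLibraries sorts in place: bridges first, then wrappers, each group
// alphabetical by DisplayName.
func sortLibraries(libs []Library) {
	sort.SliceStable(libs, func(i, j int) bool {
		ri, rj := kindRank(libs[i].Kind), kindRank(libs[j].Kind)
		if ri != rj {
			return ri < rj
		}
		return libs[i].DisplayName < libs[j].DisplayName
	})
}

func main() {
	libs := []Library{
		{DisplayName: "grpc", Kind: "wrapper"},
		{DisplayName: "zap", Kind: "bridge"},
		{DisplayName: "gin", Kind: "wrapper"},
		{DisplayName: "logrus", Kind: "bridge"},
	}
	sortLibraries(libs)
	for _, l := range libs {
		fmt.Printf("%-8s %s\n", l.Kind, l.DisplayName)
	}
}
```

A fixed sort order keeps the posted preview stable between runs, so the sentinel comment only changes when the inventory does.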
To address the |
jaydeluca left a comment
I started reviewing this, but wanted to pause and ask if it's possible to start even smaller? It's a bit difficult to review 48 files and 8k lines of code.
## Commits

This watcher lives in a shared monorepo with a shared Git log. Follow
[Conventional Commits](https://www.conventionalcommits.org/) and the scope
convention used across this repo: the scope is the component path, abbreviated
to the ecosystem and language.

- **Scope:** `ecosystem-automation/golang`, e.g.
  `feat(ecosystem-automation/golang): resolve go-contrib release tags`.
- Use the standard types (`feat`, `fix`, `refactor`, `chore`, `docs`, `test`,
  `style`, `perf`).
- No AI attribution, `Co-Authored-By`, or tool links in commit messages.
- Keep the log reviewable: one logical change per commit, not a play-by-play of
  the editing session.
Personally, I prefer AI to not make commits at all, so let's remove this. People who use AI to make commits can put this type of thing in a personal agents file.
Struck it. I saw AGENTS.md and CLAUDE.md in the repo and thought it was needed.
## Footguns

- The `insturmentation` typo in the module path and directory name is fixed:
  the module is `.../golang-instrumentation-watcher`. Don't reintroduce it.
- Snapshot writes must clean up the prior snapshot and write a replacement in
  the same run, or the frontend sees a missing snapshot until the next sync.
- Don't hand-edit files under `ecosystem-registry/`; they are pipeline output
  and the immutable historical record.
I'm not sure we need this either.
rm -f coverage.out

.PHONY: test-perf
test-perf: ## ⚡ Run benchmark tests
There's a ton of stuff in this file that I don't think is necessary, like benchmarks and security scans. Can you scope this to only what's needed?
Stripped this down to the basics.
A **Weaver registry** (`registry/signals.yaml` + `attributes.yaml`) is also
generated as an optional dev/validation artifact (`make dev` →
`weaver registry check`), not as the canonical consumer output.
I think we should start small and build up to this. Can we strip this down to just what's needed and included in this first step?
Kicked the Weaver registry work to a follow-up branch.
…on into feat/weaver-registry
Remove all Weaver-specific code (generator, Group/AttributeRef/AttributeDef types, CalculateStats, convertTelemetryToGroups) so this branch is solely about walking go-contrib and emitting the versioned inventory. Weaver registry output lives on feat/weaver-registry, branched from this HEAD.
Struck all the Weaver registry work and pushed it to another branch. What's left is the
95684ed to 0e6e697
I was unsure about needing both go/contrib/vX.Y.Z and go/contrib/vX.Y.Z-SNAPSHOT. I get why, as it's what Maven and Gradle expect, but it feels a bit awkward and redundant here. Is the idea that we symlink instead to appease the crawler?
Summary
A Go instrumentation watcher for opentelemetry-go-contrib: clones the repo,
extracts metadata + telemetry via Go AST, and emits a versioned, content-addressed
inventory at
ecosystem-registry/go/contrib/v{version}/instrumentation.yaml.

Scope is the components that instrument a developer's code: the 14
instrumentation wrappers (gin, grpc, http…) and 5 bridges (zap, logrus…), 19
libraries total. Pipeline-config components (exporters, propagators, samplers,
detectors, processors) are out of scope: they configure the SDK, not a target library.
Core responsibilities
Maps to all five in watchers-registry-consumers.md; see the watcher README's "Core Responsibilities" section for per-function detail:
- `main` SNAPSHOT, per-module versions, `VersionExists` idempotency
- `file_format: 0.1`, metadata from each module's own `go.mod`
- `make sync` per tag, idempotent releases, SNAPSHOT refreshed

Test plan
- `go test ./...` (unit + integration, incl. determinism + uniqueness tests)
- `make sync` → v1.44.0 + SNAPSHOT, 19 libraries, byte-identical across two runs

Notes
- #3 auto-issue creation is the one deferred responsibility (README roadmap).
- `ecosystem-registry/go/` files are produced by nightly CI, so not in this diff.
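The "idempotent releases" behavior could be sketched as below, under stated assumptions. The path layout follows the summary above, but `inventoryPath`, `versionExists`, `writeRelease`, and `demo` are hypothetical helpers written for this example; the PR's own `VersionExists` may have a different signature and semantics.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// inventoryPath builds <root>/go/contrib/v<version>/instrumentation.yaml.
func inventoryPath(root, version string) string {
	return filepath.Join(root, "go", "contrib", "v"+version, "instrumentation.yaml")
}

// versionExists reports whether an inventory file is already on disk.
func versionExists(root, version string) (bool, error) {
	_, err := os.Stat(inventoryPath(root, version))
	if err == nil {
		return true, nil
	}
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	return false, err
}

// writeRelease writes the inventory only when the version is absent and
// reports whether it wrote anything, so repeated syncs are no-ops.
func writeRelease(root, version string, data []byte) (bool, error) {
	exists, err := versionExists(root, version)
	if err != nil || exists {
		return false, err
	}
	path := inventoryPath(root, version)
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return false, err
	}
	if err := os.WriteFile(path, data, 0o644); err != nil {
		return false, err
	}
	return true, nil
}

// demo runs two syncs of the same version against a throwaway directory.
func demo() (first, second bool, err error) {
	root, err := os.MkdirTemp("", "ecosystem-registry")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(root)
	data := []byte("file_format: 0.1\n")
	if first, err = writeRelease(root, "1.44.0", data); err != nil {
		return first, false, err
	}
	second, err = writeRelease(root, "1.44.0", data)
	return first, second, err
}

func main() {
	first, second, err := demo()
	fmt.Println(first, second, err)
}
```

In this sketch a released version is never rewritten; a SNAPSHOT directory would instead be overwritten on every sync.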