Standalone Go library + CLI that annotates browser screenshots with numbered badges (@1, @2...) over interactive elements via CDP. Public open-source repo, drives adoption of the wider VulpineOS ecosystem.
Repo: VulpineOS/vulpine-mark (public, MIT)
- `pkg/vulpinemark/` - library
  - `cdp.go` - minimal gorilla/websocket CDP client, `/json/list` page-target auto-discovery
  - `enumerate.go` - `Runtime.evaluate` JS that finds visible interactive elements + accessible names
  - `screenshot.go` - `Page.captureScreenshot` + `Page.getLayoutMetrics` + DPR probe
  - `annotate.go` - Go `image/draw` + `x/image/font/basicfont` for borders + numbered badges
  - `mark.go` - top-level `Mark.Annotate()` API
  - `actions.go` - click/type/hover helpers driven by labels
  - `cluster.go` - cluster mode: group repeated items under `@N[K]` labels
  - `diff.go` - diff mode: annotate only what changed between two snapshots
  - `palette.go` - palette packs (default / high-contrast / monochrome / colorblind)
  - `svg.go` - SVG overlay output (`AnnotateSVG`)
  - `stable_labels.go` - stable semantic-hash label IDs
  - `heatmap.go` - heatmap mode (`AnnotateHeatmap`, translucent importance fills)
  - `json_output.go` - JSON-only mode (`AnnotateJSON`, no screenshot capture)
  - `filter.go` - custom `ElementFilter` callbacks + `IncludeRoles` / `ExcludeRoles`
- `cmd/vulpine-mark/` - CLI binary
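
As an illustration of the `/json/list` page-target auto-discovery mentioned for `cdp.go`, here is a minimal sketch using only the standard library. It is not the repo's implementation: the names `target`, `pickPageTarget`, and `discover` are hypothetical, and the debugging address `http://localhost:9222` assumes a Chromium-based browser started with `--remote-debugging-port=9222`. The JSON field names (`type`, `url`, `webSocketDebuggerUrl`) are the ones the DevTools HTTP endpoint reports.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
)

// target holds the /json/list fields this sketch cares about.
type target struct {
	Type                 string `json:"type"`
	URL                  string `json:"url"`
	WebSocketDebuggerURL string `json:"webSocketDebuggerUrl"`
}

// pickPageTarget (hypothetical helper) returns the WebSocket URL of the
// first target whose type is "page", skipping workers, iframes, etc.
func pickPageTarget(body []byte) (string, error) {
	var targets []target
	if err := json.Unmarshal(body, &targets); err != nil {
		return "", fmt.Errorf("decode /json/list: %w", err)
	}
	for _, t := range targets {
		if t.Type == "page" && t.WebSocketDebuggerURL != "" {
			return t.WebSocketDebuggerURL, nil
		}
	}
	return "", errors.New("no page target found")
}

// discover (hypothetical helper) fetches /json/list from a debugging
// endpoint such as http://localhost:9222 and picks a page target.
func discover(baseURL string) (string, error) {
	resp, err := http.Get(baseURL + "/json/list")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return pickPageTarget(body)
}

func main() {
	// Offline demo with a made-up payload; call discover(baseURL)
	// against a live browser instead.
	sample := []byte(`[
		{"type":"service_worker","url":"https://example.com/sw.js","webSocketDebuggerUrl":"ws://localhost:9222/devtools/page/A"},
		{"type":"page","url":"https://example.com/","webSocketDebuggerUrl":"ws://localhost:9222/devtools/page/B"}
	]`)
	ws, err := pickPageTarget(sample)
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	fmt.Println("page target:", ws)
}
```

The returned URL is what a WebSocket client would dial to start sending CDP commands such as `Runtime.evaluate` or `Page.captureScreenshot`.
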
go build ./...
go vet ./...
go test ./...

- Push only to `VulpineOS/vulpine-mark`. Never touch `CloverLabsAI` or any other org.
- One-line commit messages, no co-authors, push after every cohesive change.
- This repo is public — no proprietary VulpineOS internals here. Don't reference any private VulpineOS internals or private patches in code or docs.
- The native Juggler implementation lives in VulpineOS itself (private). This repo is the standalone CDP version only.
- Autonomous /loop mode: never ask permission, act and document in commits.
- Linear is the shared execution tracker for the VulpineOS ecosystem. Use the `VulpineOS` workspace, product/type/source labels, and link commits in issue comments when closing work.
- Codex has a persistent local Playwright MCP at `http://localhost:8931/mcp` for browser navigation, snapshots, console/network inspection, and screenshots. It writes artifacts to `~/.codex/mcp-output/playwright` and omits inline image payloads to reduce token usage.
- For visual/browser verification, prefer saved snapshots and screenshot filenames over pasting large page dumps or image data into chat.
- Real-page integration test (fake CDP transport, gated on `-tags integration`)
- Unit tests for `enumerate` JS (selector snapshot + fixture-response decode)
- Full-page mode (scroll + stitch screenshots, label off-viewport elements)
- DPR scaling fix for Retina screenshots (`viewportSize` now returns `visualViewport.scale * devicePixelRatio`)
- Element visibility: occlusion check (`elementFromPoint` at center)
- Click-by-label helper: `mark.Click(ctx, "@3")` dispatches mouse event at element center
- Type-by-label helper: `mark.Type(ctx, "@5", "hello")`
- Scroll-into-view before action (reuses viewport metrics)
- Context-aware action helpers (all methods take `ctx context.Context`)
- Cluster mode: group repeated items under `@N[K]` labels
- Diff mode: annotate only what changed between two snapshots
- Per-label confidence score + low-confidence fade
- Output formats: SVG overlay (`AnnotateSVG`, `--svg`) + JSON-only mode (`AnnotateJSON`, `--json-only`); base64 stdout still TODO
- Palette packs: default / high-contrast / monochrome / colorblind (`SetPalette`, `--palette`)
- Stable semantic-hash labels (`UseStableLabels`)
- Heatmap mode: translucent importance-weighted fills (`AnnotateHeatmap`, `--heatmap`)
- Custom element filter callbacks (`SetElementFilter`, `--include-role`, `--exclude-role`)
- Real-page flow test (always-on, covers annotate + click + type + hover dispatch)
- CLI: `--max-elements`, `--clustered`, `--diff`, `--save-result` (selectors still TODO)
- Doc: example annotated PNG in README
- GitHub Actions CI (build, vet, test on linux/macos)
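
Several items above (label-driven actions, `@N[K]` cluster labels, the DPR scaling fix) come down to a little label parsing and geometry. The sketch below shows one way that could look. It is not the library's code: `rect`, `parseLabel`, `center`, and `toPixels` are hypothetical names. It also assumes element boxes are measured in CSS pixels, that clicks are dispatched in CSS pixels, and that only drawing on the screenshot needs the DPR scale. What `K` denotes in `@N[K]` is not stated here, so the parser simply returns it.

```go
package main

import (
	"fmt"
	"image"
	"math"
	"regexp"
	"strconv"
)

// rect is an element's bounding box in CSS pixels (assumed unit).
type rect struct{ X, Y, W, H float64 }

// labelRE accepts "@N" and "@N[K]" and nothing else.
var labelRE = regexp.MustCompile(`^@(\d+)(?:\[(\d+)\])?$`)

// parseLabel (hypothetical) splits a label into N and K.
// k is 0 when the label has no "[K]" suffix.
func parseLabel(label string) (n, k int, err error) {
	m := labelRE.FindStringSubmatch(label)
	if m == nil {
		return 0, 0, fmt.Errorf("invalid label %q", label)
	}
	if n, err = strconv.Atoi(m[1]); err != nil {
		return 0, 0, err
	}
	if m[2] != "" {
		if k, err = strconv.Atoi(m[2]); err != nil {
			return 0, 0, err
		}
	}
	return n, k, nil
}

// center returns the point a click would target, in CSS pixels.
func (r rect) center() (x, y float64) {
	return r.X + r.W/2, r.Y + r.H/2
}

// toPixels maps the box into screenshot pixels, where scale stands in
// for visualViewport.scale * devicePixelRatio.
func (r rect) toPixels(scale float64) image.Rectangle {
	return image.Rect(
		int(math.Round(r.X*scale)), int(math.Round(r.Y*scale)),
		int(math.Round((r.X+r.W)*scale)), int(math.Round((r.Y+r.H)*scale)),
	)
}

func main() {
	// Made-up element list; label "@1" refers to elements[0].
	elements := []rect{{X: 10, Y: 20, W: 100, H: 40}}

	n, _, err := parseLabel("@1")
	if err != nil || n < 1 || n > len(elements) {
		fmt.Println("unknown label")
		return
	}
	el := elements[n-1]
	x, y := el.center()
	fmt.Println("click at (CSS px):", x, y)
	fmt.Println("badge box at 2x (screenshot px):", el.toPixels(2))
}
```

Keeping the two coordinate spaces separate means a Retina screenshot gets correctly placed badges without shifting where the click lands.
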