See VISION.md for goals, roadmap, and product surface.
In brief: A CLI for managing Grafana and Grafana Cloud. Supports dynamic Grafana API resources via a kubectl-like resources layer, and per-product features via the provider interface. Includes observability-as-code workflows (gcx dev), multi-stack configuration/contexts, and Grafana Assistant integration. Optimized for AI agents and human use.
The core of gcx. Manages Grafana-native resources (dashboards, folders, alert rules, etc.) through Grafana's Kubernetes-compatible /apis endpoint (available in Grafana 12 or later).
User input                          gcx resources push ./dashboards/
    |
    v
Selector (partial)                  "dashboards/" or "dashboards/my-dash"
    |
    v
Discovery Registry                  API call to /apis → available GVKs
    |
    v
Filter (resolved)                   Full GVK: dashboard.grafana.app/v1alpha1
    |
    v
Processors                          Strip server fields (pull) / add namespace (push)
    |
    v
Dynamic Client (k8s.io/client-go)   Create-or-update via /apis endpoint
    |
    v
Grafana K8s API                     /apis/{group}/{version}/namespaces/{ns}/{plural}/{name}
Operations: get, push (create-or-update, idempotent), pull (export to local YAML/JSON), delete, edit (single resource, $EDITOR), validate (local linting via Rego), schemas (discover types), examples (show sample manifests).
Key abstractions (resource-model.md): Resource wraps unstructured.Unstructured — no pre-generated Go types. Selector → Filter two-stage resolution keeps CLI ignorant of API details. Processor pipeline composes transformations at defined pipeline points. Discovery registry resolves plural names and short names to full GVKs at runtime.
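The two-stage Selector → Filter resolution can be illustrated with a minimal sketch. Everything here is hypothetical: the type shapes, the map-backed Registry (the real discovery registry is populated from a call to /apis), and the Kind value "Dashboard" are assumptions made for illustration, not the gcx implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// GVK identifies a resource type, following Kubernetes conventions.
type GVK struct {
	Group, Version, Kind string
}

// Selector is the partial, user-facing form parsed from a CLI argument.
type Selector struct {
	Type string // plural or short name as typed by the user
	Name string // optional resource name
}

// Filter is the resolved form: a full GVK plus an optional name.
type Filter struct {
	GVK  GVK
	Name string
}

// Registry is a hypothetical stand-in for the discovery registry.
type Registry map[string]GVK

// ParseSelector splits "dashboards/my-dash" into type and name.
// A trailing slash ("dashboards/") yields an empty name.
func ParseSelector(arg string) Selector {
	typ, name, _ := strings.Cut(strings.TrimSuffix(arg, "/"), "/")
	return Selector{Type: typ, Name: name}
}

// Resolve turns a Selector into a Filter by looking up the type name.
func (r Registry) Resolve(s Selector) (Filter, error) {
	gvk, ok := r[strings.ToLower(s.Type)]
	if !ok {
		return Filter{}, fmt.Errorf("unknown resource type %q", s.Type)
	}
	return Filter{GVK: gvk, Name: s.Name}, nil
}

func main() {
	dash := GVK{Group: "dashboard.grafana.app", Version: "v1alpha1", Kind: "Dashboard"}
	reg := Registry{"dashboards": dash, "dashboard": dash}
	f, err := reg.Resolve(ParseSelector("dashboards/my-dash"))
	fmt.Println(f, err)
}
```

Because only Resolve touches the registry, the parsing stage needs no knowledge of API groups or versions, which is the separation the text describes.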
Data flows (data-flows.md): Push reads local files, resolves selectors, applies processors, pushes via dynamic client with folder-before-dashboard ordering and bounded concurrency (errgroup, default 10). Pull fetches from API, strips server-managed fields, writes to disk grouped by kind.
Pluggable adapters for Grafana Cloud products. Each provider is a self-contained package under internal/providers/ that contributes CLI commands and optionally bridges into the resources pipeline.
Provider (internal/providers/slo/)
  |
  +-- Commands()              Cobra commands: gcx slo definitions list
  |
  +-- TypedRegistrations()    Adapter registrations for resources pipeline
  |     |
  |     v
  |   adapter.Register()      Makes provider resources accessible via gcx resources get/push/pull
  |
  +-- ConfigKeys()            Declares provider-specific config keys (token, url, ...)
  |
  +-- Validate()              Validates config before API calls
TypedCRUD[T] bridges typed Go domain structs to K8s-style unstructured.Unstructured envelopes. Domain types implement ResourceIdentity (GetResourceName/SetResourceName). TypedObject[T] wraps them with ObjectMeta + TypeMeta for K8s compliance.
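To make the envelope idea concrete, here is a minimal sketch of wrapping a typed struct in a K8s-style map and unwrapping it again. Only the ResourceIdentity method names come from the text; the SLO struct, the toEnvelope/fromEnvelope helpers, the spec/metadata layout, and the Kind value are assumptions, and a plain map[string]any stands in for unstructured.Unstructured.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ResourceIdentity is the method pair named in the text.
type ResourceIdentity interface {
	GetResourceName() string
	SetResourceName(string)
}

// SLO is a hypothetical domain type used only for illustration.
type SLO struct {
	UUID      string  `json:"uuid,omitempty"`
	Objective float64 `json:"objective"`
}

func (s *SLO) GetResourceName() string     { return s.UUID }
func (s *SLO) SetResourceName(name string) { s.UUID = name }

// toEnvelope wraps a domain value with apiVersion, kind, metadata, and spec.
// The field layout is an assumption, not necessarily the one gcx uses.
func toEnvelope[T ResourceIdentity](apiVersion, kind, namespace string, v T) (map[string]any, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	var spec map[string]any
	if err := json.Unmarshal(raw, &spec); err != nil {
		return nil, err
	}
	return map[string]any{
		"apiVersion": apiVersion,
		"kind":       kind,
		"metadata":   map[string]any{"name": v.GetResourceName(), "namespace": namespace},
		"spec":       spec,
	}, nil
}

// fromEnvelope decodes spec into the typed value, then copies metadata.name
// back through SetResourceName. T is expected to be a pointer type.
func fromEnvelope[T ResourceIdentity](obj map[string]any, into T) error {
	raw, err := json.Marshal(obj["spec"])
	if err != nil {
		return err
	}
	if err := json.Unmarshal(raw, into); err != nil {
		return err
	}
	meta, _ := obj["metadata"].(map[string]any)
	name, _ := meta["name"].(string)
	into.SetResourceName(name)
	return nil
}

func main() {
	obj, _ := toEnvelope("slo.ext.grafana.app/v1alpha1", "SLO", "stack-1", &SLO{UUID: "abc", Objective: 0.995})
	var back SLO
	_ = fromEnvelope(obj, &back)
	fmt.Println(obj["kind"], back.UUID, back.Objective)
}
```

The identity interface is what lets a generic adapter move the resource name between the domain struct and metadata.name without knowing which field holds it.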
ConfigLoader (providers.ConfigLoader) handles --config/--context flag binding, YAML + env var precedence, and provider-specific config resolution (GRAFANA_PROVIDER_{NAME}_{KEY}). All providers must use it — no ad-hoc os.Getenv.
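A minimal sketch of the provider key resolution, assuming env vars win over YAML as stated elsewhere in this document. The envKey and resolve functions are hypothetical, and the normalization (upper-casing, "-" mapped to "_") is a guess at how names are turned into the GRAFANA_PROVIDER_{NAME}_{KEY} form. The environment lookup is injected so that it lives in one place, in the spirit of the "no ad-hoc os.Getenv" rule.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envKey builds the GRAFANA_PROVIDER_{NAME}_{KEY} variable name.
// The normalization rules are assumptions made for this sketch.
func envKey(provider, key string) string {
	norm := func(s string) string {
		return strings.ToUpper(strings.ReplaceAll(s, "-", "_"))
	}
	return "GRAFANA_PROVIDER_" + norm(provider) + "_" + norm(key)
}

// resolve returns the env var value when set, otherwise the YAML value.
// lookup is injected (os.LookupEnv in normal use) to keep the logic testable.
func resolve(provider, key string, yaml map[string]string, lookup func(string) (string, bool)) (string, bool) {
	if v, ok := lookup(envKey(provider, key)); ok {
		return v, true
	}
	v, ok := yaml[key]
	return v, ok
}

func main() {
	yaml := map[string]string{"sm-url": "https://from-yaml.example"}
	v, _ := resolve("synth", "sm-url", yaml, os.LookupEnv)
	fmt.Println(envKey("synth", "sm-url"), v)
}
```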
Dual access paths are permanent: provider commands (gcx slo definitions list) give ergonomic domain-specific tables; generic commands (gcx resources get slos.v1alpha1.slo.ext.grafana.app) serve the push/pull pipeline. JSON/YAML output is identical across both paths by construction (both use the same ResourceAdapter).
Deep-dive: patterns.md §11 (Provider Plugin System), §17 (K8s Envelope Wrapping), §18 (Table-Driven TypedCRUD), §19 (Singleton Adapter), §20 (ETag-as-Annotation). Implementation guide: provider-guide.md.
Top-level commands for querying observability datasources: metrics, logs, traces, profiles. These bypass the K8s dynamic client and call datasource HTTP APIs directly.
gcx metrics query -d prom-001 'rate(http_requests_total[5m])' --since 1h
    |
    v
SharedOpts               Shared flags: -d/--datasource, --from, --to, --since, --step
    |
    v
Datasource Resolution    Resolves -d flag to datasource UID (by name, UID, or config default)
    |
    v
Query Client             internal/query/prometheus/ or internal/query/loki/ (direct HTTP)
    |
    v
Codec Pipeline           table (default) | graph (terminal chart) | json | yaml
Standardized verbs: query (execute queries), labels (list label names/values), series/metrics (list series or compute metric queries), metadata (metric metadata). All four signal providers share these verbs with identical flag semantics.
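The datasource resolution step in the flow above (by name, UID, or config default) might look roughly like the following. The Datasource struct and resolveDatasource function are hypothetical, and matching by UID before name is an assumed ordering that the source does not specify.

```go
package main

import (
	"errors"
	"fmt"
)

// Datasource is a hypothetical minimal record; real ones carry more fields.
type Datasource struct {
	UID, Name string
}

// resolveDatasource maps the -d value to a UID. An empty flag falls back to
// the configured default. UID matches are tried before name matches.
func resolveDatasource(flag, configDefault string, known []Datasource) (string, error) {
	ref := flag
	if ref == "" {
		ref = configDefault
	}
	if ref == "" {
		return "", errors.New("no datasource given and no default configured")
	}
	for _, ds := range known {
		if ds.UID == ref {
			return ds.UID, nil
		}
	}
	for _, ds := range known {
		if ds.Name == ref {
			return ds.UID, nil
		}
	}
	return "", fmt.Errorf("datasource %q not found", ref)
}

func main() {
	known := []Datasource{{UID: "prom-001", Name: "Prometheus"}}
	fmt.Println(resolveDatasource("Prometheus", "", known))
}
```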
Adaptive telemetry nests under each signal provider (metrics adaptive, logs adaptive, traces adaptive) with its own CRUD resources (rules, policies, exemptions, segments) and operational views (recommendations, patterns). Uses internal/auth/adaptive/ for shared GCOM-cached Basic auth.
Graph rendering: internal/graph/ converts query responses to terminal charts via ntcharts + lipgloss. Available as -o graph on all query commands and SLO/synth timeline commands.
Observability-as-code workflows for managing Grafana resources as typed Go code via grafana-foundation-sdk. The gcx dev commands produce and validate resources that feed into the standard gcx resources pipeline.
End-to-end workflow: scaffold → import/add → edit Go code → serve/lint → build to manifests → resources push
- scaffold: Generate a new project (Go module + foundation-sdk + folder structure)
- import: Import existing dashboards/alerts from Grafana as Go builder code
- serve: Live-reload dev server (Chi router, reverse proxy, WebSocket reload); edit code, preview in browser
- lint: Lint resources with built-in and custom Rego rules (OPA engine in internal/linter/), including PromQL/LogQL expression validators
- generate: Code generation utilities
The linter engine is also used by gcx resources validate for pre-push validation. See VISION.md § Observability as Code for the full workflow vision.
Onboarding and declarative product configuration. Not a provider — standalone command area.
- setup status: Check connection, auth, and product availability
- setup instrumentation discover: Discover instrumentable workloads via Fleet Management
- setup instrumentation show/apply: View and apply instrumentation configs with optimistic lock comparison
Uses internal/fleet/ (shared fleet base client) and internal/setup/instrumentation/ (manifest types, instrumentation client). The fleet base client is shared between the setup system and the fleet provider.
kubectl-inspired context-based multi-environment configuration.
current-context: prod
contexts:
  prod:
    grafana: { server: https://grafana.example.com, token: gf_... }
    cloud: { token: glsa_..., org: my-org }
    providers:
      slo: { token: glsa_... }
      synth: { sm-url: https://... }

Loading chain: Config file → env var overrides (GRAFANA_SERVER, GRAFANA_TOKEN, GRAFANA_PROVIDER_{NAME}_{KEY}) → CLI flags (--context). Env vars take precedence over YAML. The --context flag selects the active context; when it is absent, current-context is used.
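The loading chain can be sketched as a single function: pick the context (flag first, then current-context), then overlay env vars on the YAML values. The Config and Context structs are hypothetical, trimmed-down shapes of the YAML above, the "dev" context is invented for the example, and load is not the actual gcx loader.

```go
package main

import (
	"fmt"
	"os"
)

// Context holds the per-environment Grafana settings used in this sketch.
type Context struct {
	Server, Token string
}

// Config is a hypothetical, reduced form of the config file.
type Config struct {
	CurrentContext string
	Contexts       map[string]Context
}

// load selects the active context and then lets GRAFANA_SERVER and
// GRAFANA_TOKEN override the values read from YAML.
func load(cfg Config, contextFlag string, lookup func(string) (string, bool)) (Context, error) {
	name := contextFlag
	if name == "" {
		name = cfg.CurrentContext
	}
	ctx, ok := cfg.Contexts[name]
	if !ok {
		return Context{}, fmt.Errorf("context %q not found", name)
	}
	if v, ok := lookup("GRAFANA_SERVER"); ok {
		ctx.Server = v
	}
	if v, ok := lookup("GRAFANA_TOKEN"); ok {
		ctx.Token = v
	}
	return ctx, nil
}

func main() {
	cfg := Config{
		CurrentContext: "prod",
		Contexts: map[string]Context{
			"prod": {Server: "https://grafana.example.com"},
			"dev":  {Server: "http://localhost:3000"},
		},
	}
	ctx, err := load(cfg, "dev", os.LookupEnv)
	fmt.Println(ctx.Server, err)
}
```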
Namespace resolution: org-id (on-prem, maps to K8s namespace) or stack-id (Cloud, discovered via GCOM). Providers use ConfigLoader which resolves these uniformly.
Secret handling: Config keys marked Secret: true in provider ConfigKeys() are redacted in gcx config view. Undeclared keys and unknown providers are redacted by default (secure-by-default).
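The secure-by-default rule is easiest to see as an allowlist: a value is shown only when its key is declared and not marked secret. In this sketch the ConfigKey struct shape is inferred from the "Secret: true" wording, while the redact function and the placeholder string are hypothetical.

```go
package main

import "fmt"

// ConfigKey is an assumed shape for a declared provider config key.
type ConfigKey struct {
	Name   string
	Secret bool
}

// redacted is a hypothetical placeholder; the real output may differ.
const redacted = "**REDACTED**"

// redact returns a copy of values that is safe to display. declared maps a
// provider name to its declared keys. Only declared, non-secret keys pass
// through; undeclared keys and unknown providers are hidden.
func redact(provider string, values map[string]string, declared map[string][]ConfigKey) map[string]string {
	visible := map[string]bool{}
	for _, k := range declared[provider] { // nil slice for unknown providers
		if !k.Secret {
			visible[k.Name] = true
		}
	}
	out := make(map[string]string, len(values))
	for k, v := range values {
		if visible[k] {
			out[k] = v
		} else {
			out[k] = redacted
		}
	}
	return out
}

func main() {
	declared := map[string][]ConfigKey{"slo": {{Name: "token", Secret: true}, {Name: "url"}}}
	values := map[string]string{"token": "glsa_example", "url": "https://slo.example.com", "extra": "x"}
	fmt.Println(redact("slo", values, declared))
}
```

Building the allowlist from non-secret keys, rather than a denylist from secret ones, means a forgotten declaration hides a value instead of leaking it.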
Deep-dive: config-system.md.
Multiple auth mechanisms for different tiers.
| Mechanism | Used for | Implementation |
|---|---|---|
| Service account token | Grafana K8s API (/apis), plugin APIs | Bearer token in rest.Config |
| Cloud Access Policy token | GCOM stack discovery, Cloud product APIs | internal/cloud/ GCOM client |
| OAuth PKCE | Browser-based login (gcx auth login) | internal/auth/ (token refresh transport persists to config) |
| Basic auth | Legacy Grafana instances | Username/password in rest.Config |
| Adaptive auth | Signal provider adaptive telemetry APIs | internal/auth/adaptive/ — GCOM-cached Basic auth shared across signal providers |
Precedence: Token > OAuth > user/password. Explicit flags override env vars, which in turn override the config file. ExternalHTTPClient() must be used for APIs outside the Grafana server (K6 Cloud, OnCall, Synth, Fleet): the k8s transport injects the Grafana bearer token on every request, which conflicts with product-specific auth.
Deep-dive: client-api-layer.md, config-system.md.
| ADR | Title | Status |
|---|---|---|
| 001 | Move query under datasources with per-kind subcommands | accepted |
| 002 | Align resources examples with resources schemas UX | accepted |
| 003 | CloudConfig in Context and GCOM Stack Discovery | accepted |
| 004 | Multi-File Config Layering (System/User/Local) | accepted |
| 005 | Codify CLI Design Principles in CONSTITUTION.md and Design Guide | accepted |
| 006 | Conventional Commits via PR Title Enforcement | accepted |
| 007 | Provider Consolidation Strategy | accepted |
| 008 | TypedResourceAdapter[T] with ResourceIdentity and Provider Command Migration | proposed |
| 009 | Three-Stage Skill Structure with Dual Blackbox Isolation | superseded by [012] |
| 010 | Table-driven TypedCRUD[T] for OnCall Adapter | proposed |
| 011 | Adaptive telemetry provider: CLI UX, adapter scope, verb naming | proposed |
| 012 | Five-phase pipeline redesign for /migrate-provider | accepted |
| 013 | App O11y provider: singleton TypedCRUD, ETag-as-annotation, verb naming | accepted |
| 014 | Declarative Instrumentation Setup under gcx setup | proposed |
| 015 | Faro provider: CLI UX, TypedCRUD adapter, sourcemaps as sub-resource verbs | proposed |
See docs/adrs/ for all ADRs.
Deep-dive docs live in docs/architecture/. Each covers one domain:
| Document | Domain | When to Read |
|---|---|---|
| architecture.md | Full system architecture with diagrams | First-time orientation |
| patterns.md | Recurring patterns catalog | Before implementing new features |
| resource-model.md | Resource, Selector, Filter, Discovery | Modifying resource handling |
| cli-layer.md | Command tree, Options pattern, lifecycle | Adding/modifying CLI commands |
| client-api-layer.md | Dynamic client, auth, error translation | API communication changes |
| config-system.md | Contexts, env vars, TLS, namespace resolution | Config or auth changes |
| data-flows.md | Push/Pull/Serve/Delete pipelines | Modifying resource sync |
| project-structure.md | Build system, CI/CD, dependencies | Build issues, adding deps |
See also: docs/design/ for UX implementation guides, docs/reference/ for provider guides and CLI reference.
- Starting a new feature: Read architecture.md → patterns.md → relevant domain doc
- Fixing a bug: Jump directly to the relevant domain doc
- Adding a CLI command: Read cli-layer.md first, then patterns.md
- Understanding a data flow: Read data-flows.md
- Adding config fields or auth: Read config-system.md
- Modifying resource handling: Read resource-model.md
- API communication or errors: Read client-api-layer.md
- Build issues or dependencies: Read project-structure.md
How does a resource get pushed to Grafana?
- data-flows.md § "PUSH Pipeline" — numbered steps (parse selectors → resolve → read → push → summary)
- resource-model.md — Selector/Filter concepts
- client-api-layer.md — how Create/Update calls work
Adding a new CLI flag to push:
- cli-layer.md § "The Options Pattern"
- Look at push.go as the canonical example
- Add to opts struct → bind in setup() → validate in Validate()
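The opts struct → setup() → Validate() sequence from the steps above can be sketched as follows. gcx builds its commands with Cobra; the standard library flag package is swapped in here only to keep the example dependency-free. The pushOpts fields, the flag names (--max-concurrent, --dry-run), and parsePush are all hypothetical and are not taken from push.go.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// pushOpts is a hypothetical options struct; field names are illustrative.
type pushOpts struct {
	MaxConcurrent int
	DryRun        bool
}

// setup binds flags to the struct fields and declares their defaults.
func (o *pushOpts) setup(fs *flag.FlagSet) {
	fs.IntVar(&o.MaxConcurrent, "max-concurrent", 10, "maximum parallel pushes")
	fs.BoolVar(&o.DryRun, "dry-run", false, "print actions without applying them")
}

// Validate rejects bad values before any I/O happens.
func (o *pushOpts) Validate() error {
	if o.MaxConcurrent < 1 {
		return errors.New("max-concurrent must be at least 1")
	}
	return nil
}

// parsePush wires the steps together: bind, parse, validate.
func parsePush(args []string) (*pushOpts, error) {
	opts := &pushOpts{}
	fs := flag.NewFlagSet("push", flag.ContinueOnError)
	opts.setup(fs)
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if err := opts.Validate(); err != nil {
		return nil, err
	}
	return opts, nil
}

func main() {
	opts, err := parsePush([]string{"--max-concurrent", "4"})
	fmt.Println(opts, err)
}
```

Keeping validation in its own method means it can be exercised in table-driven tests without parsing any flags.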
Adding support for a new resource type:
- resource-model.md § "Discovery System" — types are discovered at runtime, no hardcoding
- patterns.md § "Processor Pipeline" — if custom handling is needed
- data-flows.md — where processors are applied
Adding a new provider:
- provider-guide.md — step-by-step implementation guide
- patterns.md § "Provider Plugin System" — interface, registration, TypedCRUD
- provider-checklist.md — UX compliance checklist
Debugging an authentication issue:
- config-system.md § "Auth Priority": token vs user/password precedence
- client-api-layer.md: how auth wires into rest.Config
- config-system.md: env var override behavior
Enforced — see CONSTITUTION.md § Taste Rules for the authoritative list.
- Options pattern for every command: opts struct → setup(flags) → Validate() → constructor
- Error messages: lowercase, no trailing punctuation
- Table-driven tests: all Go tests follow Go wiki conventions
- errgroup concurrency: bounded parallelism (default 10) for all batch I/O operations
- Commit format: Title (one-liner) / What (description) / Why (rationale)
- VISION.md — goals, product surface, roadmap themes
- CONSTITUTION.md — architecture invariants and dependency rules
- DESIGN.md — CLI UX design, command grammar, output model
- docs/reference/provider-guide.md — how to add a new provider