diff --git a/docs/README.md b/docs/README.md
index d2304e777..4eec7a79f 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -11,17 +11,18 @@ Godocs on exported library package code (such as `resource`, `operator`, `plugin
### Table of Contents
-| Document | Description |
-|-------------------------------------------------------|-------------|
-| [Application Design](./application-design/README.md) | The typical design patterns of an app built with the SDK |
-| [Custom Kinds](./custom-kinds/README.md) | What are kinds, how to write them, and how to use them |
-| [Resource Objects](./resource-objects.md) | Describes the function and usage of the `resource.Object` interface |
-| [Resource Stores](./resource-stores.md) | Describes the various "Store" types in the `resource` package, and why you may want to use one or another |
-| [Operators & Event-Based Design](./operators.md) | A brief primer on what operators/controllers are and working with event-based code |
-| [Code Generation](./code-generation.md) | How to use CUE and the CLI for code generation. |
-| [Local Dev Environment Setup](./local-development.md) | How to use the CLI to set up a local development & testing environment |
-| [Kubernetes Concepts](./kubernetes.md) | A primer on some kubernetes concepts which are relevant to using the SDK backed by a kubernetes API server |
-| [Admission Control](./admission-control.md) | How to set up admission control on your kinds for an API server |
+| Document | Description |
+| ------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------- |
+| [Application Design](./application-design/README.md) | The typical design patterns of an app built with the SDK |
+| [Custom Kinds](./custom-kinds/README.md) | What are kinds, how to write them, and how to use them |
+| [Resource Objects](./resource-objects.md) | Describes the function and usage of the `resource.Object` interface |
+| [Resource Stores](./resource-stores.md) | Describes the various "Store" types in the `resource` package, and why you may want to use one or another |
+| [Operators & Event-Based Design](./operators.md) | A brief primer on what operators/controllers are and working with event-based code |
+| [Code Generation](./code-generation.md) | How to use CUE and the CLI for code generation. |
+| [Local Dev Environment Setup](./local-development.md) | How to use the CLI to set up a local development & testing environment |
+| [Kubernetes Concepts](./kubernetes.md) | A primer on some kubernetes concepts which are relevant to using the SDK backed by a kubernetes API server |
+| [Admission Control](./admission-control.md) | How to set up admission control on your kinds for an API server |
+| [Validation Patterns](./architecture/validation-patterns.md) | Convention for reporting actionable runtime validation errors in resource status |
## Base Concepts of the SDK
diff --git a/docs/admission-control.md b/docs/admission-control.md
index acacf8214..eca563379 100644
--- a/docs/admission-control.md
+++ b/docs/admission-control.md
@@ -4,6 +4,10 @@ While an app can do some level of admission control by restricting access to the
the best practice is to implement admission control at the API server level itself. This is done by exposing webhooks from the
multi-tenant operator to validate and/or mutate requests to the API server.
+> **Note**: Admission control is used for **static validation** (format, structure, type checks, static business rules). For **runtime validation**
+> that depends on external systems or dynamic state, use [`fieldErrors` in status](./architecture/validation-patterns.md) instead.
+> See [Validation Patterns](./architecture/validation-patterns.md) for details on when to use each approach.
+
The `resource` package contains two interfaces used for admission control:
* [ValidatingAdmissionController](https://pkg.go.dev/github.com/grafana/grafana-app-sdk/resource#ValidatingAdmissionController), which is used to _validate_ incoming requests, returning a yes/no on whether the request should be allowed to proceed, and
* [MutatingAdmissionController](https://pkg.go.dev/github.com/grafana/grafana-app-sdk/resource#MutatingAdmissionController), which is used to _alter_ an incoming request, returning an altered object which is translated into a series of patch operations which will be made by the API server before proceeding
@@ -102,4 +106,18 @@ mounted in `/run/secrets/tls`.
For production use, you can either re-use the configs and secrets created by the local environment (they are self-signed, but do not need a real CA as
they are only used for communication between the API server and the webhook server), or generate new ones. Keep in mind that every time you generate
-the local environment, the cert bundle is generated (and is unique each time), so don't rely on it being consistent.
\ No newline at end of file
+the local environment, the cert bundle is generated (and is unique each time), so don't rely on it being consistent.
+
+## Admission Control vs Runtime Validation
+
+Admission control handles **static validation** that can be checked synchronously before a resource is persisted:
+- Format validation (URL format, type checks, required fields)
+- Structure validation (enum values, data types, JSON structure)
+- Static business rules (naming conventions, reserved names)
+
+For **runtime validation** that depends on external systems, internal state, or dynamic conditions, use [`fieldErrors` in status](./architecture/validation-patterns.md):
+- External system checks (repository exists, branch exists, installation ID valid)
+- Internal runtime state (resource conflicts, service availability)
+- Dynamic state that can change over time (credentials expired, external resource deleted)
+
+See [Validation Patterns](./architecture/validation-patterns.md) for the complete convention and implementation guide.
\ No newline at end of file
diff --git a/docs/architecture/reconciliation.md b/docs/architecture/reconciliation.md
index 1db642400..b9959cd9f 100644
--- a/docs/architecture/reconciliation.md
+++ b/docs/architecture/reconciliation.md
@@ -2,6 +2,9 @@
This document provides detailed visual representations of the reconciliation code paths in the grafana-app-sdk, tracing the complete flow from application startup through event processing.
+> **Related**: When implementing reconciliation logic that performs runtime validation (e.g., checking external systems, validating dynamic state),
+> populate `fieldErrors` in the resource status. See [Validation Patterns](./validation-patterns.md) for the recommended convention.
+
## Overview
The reconciliation system in grafana-app-sdk follows a Kubernetes-inspired operator pattern with two primary flows:
diff --git a/docs/architecture/validation-patterns.md b/docs/architecture/validation-patterns.md
new file mode 100644
index 000000000..76440f819
--- /dev/null
+++ b/docs/architecture/validation-patterns.md
@@ -0,0 +1,959 @@
+# Validation Patterns: Field Errors in Status and Runtime Validation
+
+## Overview
+
+This document establishes the convention for reporting actionable validation and runtime errors in Kubernetes-style resource APIs. Instead of exposing separate custom validation endpoints (such as `/test`, `/validate`, or `/check`), resources should report errors that depend on external systems, internal runtime state, or dynamic conditions in their `status` field using `fieldErrors`. Each error tells the user which field value to change (e.g., a different branch, repository, or installation ID) to resolve the issue.
+
+## Convention
+
+**All resources that require validation or runtime error reporting MUST expose errors via `status.fieldErrors` rather than separate custom validation endpoints.**
+
+### Status Structure
+
+Resources should include a `fieldErrors` array in their status:
+
+```go
+import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+
+type ResourceStatus struct {
+	// ... other status fields ...
+
+	// FieldErrors contains validation and runtime errors for specific fields.
+	// Controllers populate these errors during reconciliation.
+	FieldErrors []ErrorDetails `json:"fieldErrors,omitempty"`
+}
+
+type ErrorDetails struct {
+	// Type is a machine-readable description of the cause of the error.
+	// It uses Kubernetes' CauseType values (e.g., FieldValueInvalid, FieldValueRequired).
+	Type metav1.CauseType `json:"type"`
+
+	// Field is the JSON path to the field that caused the error.
+	// Examples: "spec.github.branch", "spec.installationID", "secure.token"
+	Field string `json:"field,omitempty"`
+
+	// Detail provides a human-readable explanation of what went wrong.
+	// This should be actionable - guide users to change the value to something valid.
+	// Examples: "branch not found", "repository not found", "installation ID invalid"
+	Detail string `json:"detail,omitempty"`
+
+	// Origin indicates where the error originated (optional).
+	// Can reference a specific validator, service, or rule.
+	Origin string `json:"origin,omitempty"`
+}
+```
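To make the wire format concrete, here is a minimal sketch of how such a status serializes. It assumes a local `CauseType` string type standing in for `metav1.CauseType` so that it runs with the standard library only; `encodeStatus` is a hypothetical helper, not part of the SDK.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CauseType is a local stand-in for metav1.CauseType (assumption made so the
// sketch has no dependencies outside the standard library).
type CauseType string

const CauseTypeFieldValueInvalid CauseType = "FieldValueInvalid"

// ErrorDetails and ResourceStatus mirror the structure described above.
type ErrorDetails struct {
	Type   CauseType `json:"type"`
	Field  string    `json:"field,omitempty"`
	Detail string    `json:"detail,omitempty"`
	Origin string    `json:"origin,omitempty"`
}

type ResourceStatus struct {
	FieldErrors []ErrorDetails `json:"fieldErrors,omitempty"`
}

// encodeStatus (hypothetical helper) renders a status as a client would see it.
func encodeStatus(s ResourceStatus) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err) // cannot happen for these plain types
	}
	return string(b)
}

func main() {
	failing := ResourceStatus{FieldErrors: []ErrorDetails{{
		Type:   CauseTypeFieldValueInvalid,
		Field:  "spec.github.branch",
		Detail: "branch not found",
	}}}
	fmt.Println(encodeStatus(failing))

	// With omitempty, a status without errors has no fieldErrors key at all.
	fmt.Println(encodeStatus(ResourceStatus{}))
}
```

Because of `omitempty`, clients should treat a missing `fieldErrors` key the same as an empty list.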
+
+## When to Use This Pattern
+
+**Key Principle**: `fieldErrors` in status should be used for validation errors that **depend on external systems** or **dynamic state that could change over time**. Format validation and hard rules should be handled by **admission validators**.
+
+**Actionable Guidance**: Each `fieldErrors` entry should tell the user what to change (e.g., use a different branch, repository, or installation ID) to resolve the error.
+
+### ✅ Use `fieldErrors` in Status When:
+
+**Key Principle**: Use `fieldErrors` for validation errors that depend on **external systems, internal runtime state, or dynamic conditions** that could change over time. These are errors that cannot be determined at admission time and require runtime checks.
+
+1. **External system validation**: Errors from external services or APIs
+ - Example: GitHub API returns "installation not found", Git repository doesn't exist, external service unavailable
+ - These require actual API calls to external systems that may not be available during admission
+ - The external system state can change independently of your resource
+ - **Key indicator**: Requires calling an external API or service
+ - **User action**: Guide users to change the value (e.g., use a different installation ID, repository URL, or branch name)
+
+2. **Internal runtime validation**: Errors that depend on internal system state or runtime conditions
+ - Example: Resource conflicts detected at runtime, internal service unavailable, state-dependent validation failures
+ - These require checking internal system state that may change or may not be available during admission
+ - Internal state can change after resource creation
+ - **Key indicator**: Requires checking internal runtime state or conditions
+ - **User action**: Guide users to change the value or resolve the conflict (e.g., use a different resource name, wait for service availability)
+
+3. **Dynamic state validation**: Errors that depend on current system state that could change
+ - Example: Branch exists now but may be deleted later, connection credentials expired, resource was deleted externally
+ - State can change after resource creation, making admission-time validation insufficient
+ - These are "could stop existing" scenarios
+
+4. **Runtime connectivity/authentication errors**: Errors that occur during actual operations
+ - Example: Authentication failures, network connectivity issues, service unavailable
+ - These cannot be detected until the controller attempts to use the resource
+ - May be transient or permanent depending on external factors
+
+5. **Resource lifecycle errors**: Errors that occur during ongoing resource operations
+ - Example: Sync failures, health check failures, operational errors
+ - These happen continuously during the resource lifecycle, not just at creation time
+
+### ❌ Don't Use `fieldErrors` For:
+
+**Key Principle**: Use admission validators for **static rules** and **format validation** that can be checked without external dependencies.
+
+1. **Format and structure validation**: Use admission webhooks/validators instead
+ - Example: Required fields missing, invalid enum values, type mismatches, malformed URLs, invalid JSON structure
+ - These are hard rules that don't depend on external systems
+ - Can and should be caught before the resource is created/updated
+ - Examples:
+ - `spec.github.branch` is required (required field)
+ - `spec.type` must be one of: "github", "git", "local" (enum validation)
+ - `spec.github.url` must be a valid URL format (format validation)
+ - `spec.sync.intervalSeconds` must be a positive integer (type/range validation)
+
+2. **Syntactic validation**: Use OpenAPI schema validation instead
+ - Example: Invalid JSON structure, wrong data types, missing required properties
+ - These are handled by the API server's schema validation before reaching admission or controllers
+
+3. **Business rule validation**: Use admission webhooks instead
+ - Example: Resource name conflicts, reserved names, naming conventions
+ - These are static rules that don't require external system checks
+
+### Decision Tree
+
+```
+Is the validation error about:
+├─ Format/structure/type? → Use Admission Validator
+├─ Static business rules? → Use Admission Validator
+├─ External system state? → Use fieldErrors in Status ✅
+├─ Internal runtime state? → Use fieldErrors in Status ✅
+├─ Dynamic state that could change? → Use fieldErrors in Status ✅
+└─ Runtime connectivity/auth? → Use fieldErrors in Status ✅
+```
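The decision tree can also be written down as a small lookup. This is only an illustrative sketch: the `Category` and `Mechanism` types and the `MechanismFor` function are hypothetical names introduced here, not SDK APIs.

```go
package main

import "fmt"

// Category is a hypothetical classification of what a validation check depends on.
type Category int

const (
	FormatOrStructure Category = iota
	StaticBusinessRule
	ExternalSystemState
	InternalRuntimeState
	DynamicState
	RuntimeConnectivity
)

// Mechanism names where the resulting error should be reported.
type Mechanism string

const (
	AdmissionValidator Mechanism = "admission validator"
	StatusFieldErrors  Mechanism = "status.fieldErrors"
)

// MechanismFor follows the decision tree: static checks go to admission,
// anything that needs runtime information goes to status.fieldErrors.
// The boolean is false for a category the tree does not cover.
func MechanismFor(c Category) (Mechanism, bool) {
	switch c {
	case FormatOrStructure, StaticBusinessRule:
		return AdmissionValidator, true
	case ExternalSystemState, InternalRuntimeState, DynamicState, RuntimeConnectivity:
		return StatusFieldErrors, true
	default:
		return "", false
	}
}

func main() {
	for _, c := range []Category{FormatOrStructure, ExternalSystemState, RuntimeConnectivity} {
		m, _ := MechanismFor(c)
		fmt.Printf("category %d -> %s\n", c, m)
	}
}
```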
+
+### Examples: Admission Validator vs fieldErrors
+
+**Admission Validator (Static Rules)**:
+```go
+// ❌ Wrong: Don't use fieldErrors for format validation
+if !isValidURLFormat(repo.Spec.GitHub.URL) {
+	// This should be caught in admission, not reported in status
+}
+
+// ✅ Correct: Use admission validator
+// (field is k8s.io/apimachinery/pkg/util/validation/field)
+func Validate(repo *Repository) field.ErrorList {
+	urlPath := field.NewPath("spec", "github", "url")
+	if repo.Spec.GitHub.URL == "" {
+		return field.ErrorList{field.Required(urlPath, "URL is required")}
+	}
+	if !isValidURLFormat(repo.Spec.GitHub.URL) {
+		return field.ErrorList{field.Invalid(urlPath, repo.Spec.GitHub.URL, "invalid URL format")}
+	}
+	return nil
+}
+```
+
+**fieldErrors in Status (Runtime Validation)**:
+```go
+// ✅ Correct: Use fieldErrors for runtime validation (external or internal)
+func (c *Controller) reconcile(ctx context.Context, repo *Repository) error {
+	// Start from an empty, non-nil slice: it is encoded as [] rather than null,
+	// so a clean run clears errors left by a previous reconciliation.
+	fieldErrors := []ErrorDetails{}
+
+	// Runtime validation checks - these require actual operations, not just spec validation
+
+	// Example 1: Check if repository exists (external system check)
+	exists, err := c.gitClient.RepoExists(ctx, repo.Spec.GitHub.URL)
+	if err != nil {
+		fieldErrors = append(fieldErrors, ErrorDetails{
+			Type:   metav1.CauseTypeFieldValueInvalid,
+			Field:  "spec.github.url",
+			Detail: fmt.Sprintf("failed to check if repository exists: %v", err),
+		})
+	} else if !exists {
+		fieldErrors = append(fieldErrors, ErrorDetails{
+			Type:   metav1.CauseTypeFieldValueInvalid,
+			Field:  "spec.github.url",
+			Detail: "repository not found", // Actionable: user should check/update URL
+		})
+	}
+
+	// Example 2: Check if branch exists (external system check)
+	if exists {
+		branchRef := fmt.Sprintf("refs/heads/%s", repo.Spec.GitHub.Branch)
+		_, err := c.gitClient.GetRef(ctx, branchRef)
+		if err != nil {
+			if errors.Is(err, nanogit.ErrObjectNotFound) {
+				fieldErrors = append(fieldErrors, ErrorDetails{
+					Type:   metav1.CauseTypeFieldValueInvalid,
+					Field:  "spec.github.branch",
+					Detail: "branch not found", // Actionable: user should change to a valid branch
+				})
+			} else {
+				fieldErrors = append(fieldErrors, ErrorDetails{
+					Type:   metav1.CauseTypeFieldValueInvalid,
+					Field:  "spec.github.branch",
+					Detail: fmt.Sprintf("failed to check if branch exists: %v", err),
+				})
+			}
+		}
+	}
+
+	// Example 3: Check authentication (runtime connectivity check)
+	authorized, err := c.gitClient.IsAuthorized(ctx)
+	if err != nil || !authorized {
+		detail := "not authorized"
+		if err != nil {
+			detail = fmt.Sprintf("failed to check if authorized: %v", err)
+		}
+		fieldErrors = append(fieldErrors, ErrorDetails{
+			Type:   metav1.CauseTypeFieldValueInvalid,
+			Field:  "secure.token",
+			Detail: detail, // Actionable: user should update credentials
+		})
+	}
+
+	// Example 4: Check for resource name conflicts (internal runtime state check)
+	existing, err := c.resourceLister.Get(repo.Name)
+	if err == nil && existing != nil && existing.UID != repo.UID {
+		fieldErrors = append(fieldErrors, ErrorDetails{
+			Type:   metav1.CauseTypeFieldValueDuplicate,
+			Field:  "metadata.name",
+			Detail: fmt.Sprintf("resource with name '%s' already exists", repo.Name),
+			// Actionable: user should use a different name
+		})
+	}
+
+	// Update fieldErrors in status
+	patchOps := []map[string]interface{}{
+		{
+			"op":    "replace",
+			"path":  "/status/fieldErrors",
+			"value": fieldErrors,
+		},
+	}
+
+	return c.statusPatcher.Patch(ctx, repo, patchOps...)
+}
+```
+
+### Writing Actionable Error Messages
+
+Each `fieldErrors` entry should tell the user which field value to change and, where possible, what a valid alternative looks like (e.g., a different branch, repository, or installation ID).
+
+**Good Examples** (Actionable):
+- `"branch not found"` → User knows to change the branch name to an existing branch
+- `"repository not found"` → User knows to check/update the repository URL
+- `"installation ID invalid"` → User knows to update the installation ID to a valid one
+- `"authentication failed"` → User knows to update credentials
+
+**Bad Examples** (Not Actionable):
+- `"error occurred"` → Too vague, doesn't guide action
+- `"validation failed"` → Doesn't indicate what to change
+- `"invalid value"` → Doesn't specify what's invalid or how to fix it
+
+**Best Practice**: Error messages should:
+1. Clearly identify what's wrong (e.g., "branch not found")
+2. Imply what field needs to be changed (via the `Field` property)
+3. Guide users toward valid alternatives (e.g., "use an existing branch name")
+
+## Examples
+
+### Example 1: Connection Resource
+
+**Scenario**: A GitHub connection has an invalid installation ID.
+
+```yaml
+apiVersion: provisioning.grafana.app/v0alpha1
+kind: Connection
+metadata:
+  name: my-connection
+spec:
+  type: github
+  github:
+    appID: "123456"
+    installationID: "999999999" # Invalid installation ID
+status:
+  observedGeneration: 1
+  state: Disconnected
+  health:
+    healthy: false
+    checked: 1699123456
+  fieldErrors:
+    - type: FieldValueInvalid
+      field: spec.github.installationID
+      detail: "invalid installation ID: 999999999"
+```
+
+**Frontend Usage**:
+```typescript
+// Step 1: Validate before creation using dryRun
+const errors = await validateBeforeCreate(connectionSpec);
+if (errors.length > 0) {
+  errors.forEach(error => {
+    if (error.field === 'spec.github.installationID') {
+      setFieldError('installationID', error.message);
+    }
+  });
+  return; // Don't create if validation fails
+}
+
+// Step 2: Create resource if validation passes
+const conn = await api.createConnection(connectionSpec);
+
+// Step 3: Display any ongoing validation errors from status
+conn.status.fieldErrors?.forEach(error => {
+  if (error.field === 'spec.github.installationID') {
+    setFieldError('installationID', error.detail);
+  }
+});
+```
+
+### Example 2: Repository Resource
+
+**Scenario**: A repository references a non-existent branch.
+
+```yaml
+apiVersion: provisioning.grafana.app/v0alpha1
+kind: Repository
+metadata:
+  name: my-repo
+spec:
+  type: github
+  github:
+    url: "https://github.com/grafana/grafana"
+    branch: "non-existent-branch"
+status:
+  observedGeneration: 1
+  health:
+    healthy: false
+    checked: 1699123456
+  fieldErrors:
+    - type: FieldValueInvalid
+      field: spec.github.branch
+      detail: "branch not found"
+```
+
+**List View Usage**:
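+```tsx
+// In a list of repositories (element names are illustrative)
+repositories.map(repo => (
+  <li key={repo.name}>
+    <span>{repo.name}</span>
+    {/* Show only the first error, e.g. "branch not found" */}
+    {repo.status.fieldErrors?.[0] && (
+      <span className="error">{repo.status.fieldErrors[0].detail}</span>
+    )}
+  </li>
+));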
+```
+
+### Example 3: Multiple Field Errors
+
+**Scenario**: A repository has multiple validation issues.
+
+```yaml
+status:
+  fieldErrors:
+    - type: FieldValueInvalid
+      field: spec.github.url
+      detail: "repository not found" # Actionable: user should check URL or repository name
+    - type: FieldValueInvalid
+      field: secure.token
+      detail: "not authorized" # Actionable: user should update credentials
+```
+
+**User Actions**: Each error guides the user to fix a specific issue:
+1. **Repository URL error**: User should verify the repository exists and update `spec.github.url` if incorrect
+2. **Token error**: User should update `secure.token` with valid credentials
+
+**Frontend Usage**:
+```typescript
+// Display all errors grouped by field
+repo.status.fieldErrors?.forEach(error => {
+  const fieldId = mapFieldPathToFormField(error.field);
+  setFieldError(fieldId, error.detail);
+});
+```
+
+## Controller Implementation
+
+Controllers should populate `fieldErrors` during reconciliation by performing actual runtime validation checks:
+
+```go
+func (c *ResourceController) process(ctx context.Context, resource *Resource) error {
+	// Non-nil empty slice: encodes as [] (not null), so a clean run clears stale errors.
+	fieldErrors := []ErrorDetails{}
+
+	// Perform actual runtime validation checks
+	// These are checks that require operations, not just spec inspection
+
+	// Example: Check external system (GitHub API, Git repository, etc.)
+	if resource.Spec.Type == "github" {
+		// Check if installation exists
+		installation, err := c.githubClient.GetInstallation(ctx, resource.Spec.InstallationID)
+		if err != nil {
+			fieldErrors = append(fieldErrors, ErrorDetails{
+				Type:   metav1.CauseTypeFieldValueInvalid,
+				Field:  "spec.installationID",
+				Detail: fmt.Sprintf("installation ID %s not found", resource.Spec.InstallationID),
+			})
+		}
+
+		// Check if repository exists (only meaningful once the installation resolved)
+		if installation != nil {
+			repoExists, err := c.githubClient.RepoExists(ctx, resource.Spec.RepositoryURL)
+			if err != nil || !repoExists {
+				fieldErrors = append(fieldErrors, ErrorDetails{
+					Type:   metav1.CauseTypeFieldValueInvalid,
+					Field:  "spec.repositoryURL",
+					Detail: "repository not found",
+				})
+			}
+		}
+	}
+
+	// Example: Check internal runtime state (resource conflicts, etc.)
+	conflicts, err := c.checkConflicts(ctx, resource)
+	if err == nil && len(conflicts) > 0 {
+		for _, conflict := range conflicts {
+			fieldErrors = append(fieldErrors, ErrorDetails{
+				Type:   metav1.CauseTypeFieldValueDuplicate,
+				Field:  conflict.Field,
+				Detail: conflict.Message,
+			})
+		}
+	}
+
+	// Build patch operations
+	patchOps := []map[string]interface{}{
+		{
+			"op":    "replace",
+			"path":  "/status/fieldErrors",
+			"value": fieldErrors,
+		},
+	}
+
+	// Apply status patch
+	return c.statusPatcher.Patch(ctx, resource, patchOps...)
+}
+```
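One detail of the patch itself is worth a sketch. In JSON Patch (RFC 6902), `replace` requires the target path to exist, while `add` on an object member creates it or overwrites it; since `fieldErrors` is declared with `omitempty`, the key may be absent on a resource that never had errors. The helper below is hypothetical (`fieldErrorsPatch` and `patchOp` are names introduced here), and whether a given status patcher accepts `add` is an assumption to verify. Note that `add` still requires the parent `/status` object to exist.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ErrorDetails mirrors the status entry described in this document; Type is a
// plain string here so the sketch needs only the standard library.
type ErrorDetails struct {
	Type   string `json:"type"`
	Field  string `json:"field,omitempty"`
	Detail string `json:"detail,omitempty"`
}

// patchOp is one JSON Patch (RFC 6902) operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// fieldErrorsPatch (hypothetical helper) builds a patch document that sets
// status.fieldErrors whether or not the key is currently present.
func fieldErrorsPatch(errs []ErrorDetails) ([]byte, error) {
	if errs == nil {
		errs = []ErrorDetails{} // a nil slice would be encoded as null
	}
	return json.Marshal([]patchOp{{Op: "add", Path: "/status/fieldErrors", Value: errs}})
}

func main() {
	clean, err := fieldErrorsPatch(nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(clean))

	failing, err := fieldErrorsPatch([]ErrorDetails{{
		Type: "FieldValueInvalid", Field: "spec.github.branch", Detail: "branch not found",
	}})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(failing))
}
```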
+
+## Frontend Integration
+
+### Create/Update Flow
+
+**Recommended Pattern: Use `dryRun=true` for pre-creation validation**
+
+1. Validate before creation using `dryRun=true`
+2. Extract errors from HTTP 422 response (`details.causes`)
+3. Display errors inline without creating the resource
+4. If validation passes, create the resource
+5. After creation, read `status.fieldErrors` for ongoing validation (e.g., external system changes)
+
+**Pre-Creation Validation with `dryRun`**:
+```typescript
+// Validate before creation using dryRun (default approach)
+async function validateBeforeCreate(
+  spec: ResourceSpec
+): Promise<Array<{ field: string; message: string; type: string }>> {
+  try {
+    await api.createResource(spec, { dryRun: true });
+    return []; // No errors
+  } catch (error: any) {
+    // Extract field errors from the HTTP 422 response body
+    if (error.status === 422 && error.body?.details?.causes) {
+      return error.body.details.causes.map((cause: any) => ({
+        field: cause.field,
+        message: cause.message,
+        type: cause.type,
+      }));
+    }
+    throw error; // Re-throw unexpected errors
+  }
+}
+
+// Create resource after validation
+async function createResource(spec: ResourceSpec) {
+  // Step 1: Validate first using dryRun
+  const errors = await validateBeforeCreate(spec);
+  if (errors.length > 0) {
+    // Display errors in form without creating resource
+    errors.forEach(error => {
+      setFieldError(mapFieldPathToFormField(error.field), error.message);
+    });
+    return;
+  }
+
+  // Step 2: Create resource if validation passes
+  const resource = await api.createResource(spec);
+
+  // Step 3: Read status.fieldErrors for ongoing validation
+  // (e.g., if external system state changes after creation)
+  resource.status.fieldErrors?.forEach(error => {
+    setFieldError(mapFieldPathToFormField(error.field), error.detail);
+  });
+}
+```
+
+### List View
+
+```typescript
+// GET /resources returns list with status.fieldErrors
+resources.forEach(resource => {
+  const errors = resource.status.fieldErrors || [];
+  if (errors.length > 0) {
+    // Display badge with first error
+    showErrorBadge(resource.name, errors[0].detail);
+  }
+});
+```
+
+## Using `dryRun=true` for Pre-Creation Validation
+
+### Overview
+
+Kubernetes provides a standard dry-run mode for validating requests without persisting changes. This document refers to it as `dryRun=true`; on the wire the query parameter is `dryRun=All`, which is what `kubectl --dry-run=server` sends. Any API client (frontend applications, kubectl, CLI tools, automation scripts) can use it to validate resources before creation, so invalid objects are never created.
+
+### Important: `fieldErrors` Are NOT Populated During `dryRun`
+
+**Critical Understanding**: When using `dryRun=true`, `fieldErrors` in status are **not populated** because:
+
+1. **Resource is not persisted**: `dryRun` skips storage writes, so the resource doesn't exist in the API server
+2. **Controllers don't run**: Controllers only reconcile persisted resources, so no controller reconciliation occurs
+3. **Status is empty/default**: Since no controller has run, the status field is empty or contains default values
+
+### What Happens During `dryRun=true`
+
+**Admission validation errors** are returned as HTTP errors (not `fieldErrors` in status):
+
+```go
+// Admission validator can perform runtime validation during dryRun
+func (v *AdmissionValidator) Validate(ctx context.Context, a admission.Attributes, o admission.ObjectInterfaces) error {
+	// Standard admission validation (format, structure)
+	if err := v.validateStructure(a.GetObject()); err != nil {
+		return err
+	}
+
+	// If dryRun, also run runtime validation (external systems, internal state)
+	if a.IsDryRun() {
+		if err := v.validateRuntime(ctx, a.GetObject()); err != nil {
+			// Returns apierrors.NewInvalid() with field.ErrorList
+			return err
+		}
+	}
+
+	return nil
+}
+
+func (v *AdmissionValidator) validateRuntime(ctx context.Context, obj runtime.Object) error {
+	// Checked assertion: an unchecked one would panic on any other kind
+	repo, ok := obj.(*provisioning.Repository)
+	if !ok {
+		return nil // not a Repository; nothing to check here
+	}
+
+	// Run same validation checks that controller would do
+	// Check external systems
+	if exists, err := v.gitClient.RepoExists(ctx, repo.Spec.GitHub.URL); err != nil || !exists {
+		return apierrors.NewInvalid(
+			provisioning.RepositoryGroupVersionKind.GroupKind(),
+			repo.Name,
+			field.ErrorList{
+				field.Invalid(field.NewPath("spec", "github", "url"), repo.Spec.GitHub.URL, "repository not found"),
+			},
+		)
+	}
+
+	// Check branch exists
+	branchRef := fmt.Sprintf("refs/heads/%s", repo.Spec.GitHub.Branch)
+	if _, err := v.gitClient.GetRef(ctx, branchRef); err != nil {
+		return apierrors.NewInvalid(
+			provisioning.RepositoryGroupVersionKind.GroupKind(),
+			repo.Name,
+			field.ErrorList{
+				field.Invalid(field.NewPath("spec", "github", "branch"), repo.Spec.GitHub.Branch, "branch not found"),
+			},
+		)
+	}
+
+	return nil
+}
+```
+
+**Response format** (HTTP 422 Unprocessable Entity):
+```json
+{
+  "kind": "Status",
+  "apiVersion": "v1",
+  "status": "Failure",
+  "message": "Repository \"my-repo\" is invalid",
+  "reason": "Invalid",
+  "code": 422,
+  "details": {
+    "name": "my-repo",
+    "group": "provisioning.grafana.app",
+    "kind": "Repository",
+    "causes": [
+      {
+        "type": "FieldValueInvalid",
+        "field": "spec.github.branch",
+        "message": "branch not found"
+      }
+    ]
+  }
+}
+```
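A Go client can pull the per-field causes out of this response with the standard library alone. The sketch below is illustrative: `statusResponse`, `statusCause`, and `causesFromStatus` are hypothetical names that decode only the fields shown in the sample above, not the full Kubernetes `Status` type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statusCause holds one entry of details.causes.
type statusCause struct {
	Type    string `json:"type"`
	Field   string `json:"field"`
	Message string `json:"message"`
}

// statusResponse decodes only the parts of the Status body used here.
type statusResponse struct {
	Code    int `json:"code"`
	Details struct {
		Causes []statusCause `json:"causes"`
	} `json:"details"`
}

// causesFromStatus returns the field-level causes of a 422 response body.
// It returns nil causes for any other status code.
func causesFromStatus(body []byte) ([]statusCause, error) {
	var s statusResponse
	if err := json.Unmarshal(body, &s); err != nil {
		return nil, fmt.Errorf("decode status: %w", err)
	}
	if s.Code != 422 {
		return nil, nil
	}
	return s.Details.Causes, nil
}

// sample is the response shown above, trimmed to the decoded fields.
const sample = `{
  "kind": "Status",
  "status": "Failure",
  "reason": "Invalid",
  "code": 422,
  "details": {
    "causes": [
      {"type": "FieldValueInvalid", "field": "spec.github.branch", "message": "branch not found"}
    ]
  }
}`

func main() {
	causes, err := causesFromStatus([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, c := range causes {
		fmt.Printf("%s: %s (%s)\n", c.Field, c.Message, c.Type)
	}
}
```

Note that causes carry the text in `message`, whereas `status.fieldErrors` entries carry it in `detail`, so a client that handles both needs to map one onto the other.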
+
+### Two-Phase Validation Strategy
+
+**Phase 1: `dryRun=true` (Pre-Creation)**
+- Admission webhooks run (including runtime validation if implemented)
+- Errors returned as HTTP 422 with field details in `details.causes`
+- Resource is **not persisted** (no storage write)
+- **No `fieldErrors` in status** (resource doesn't exist, controllers don't run)
+
+**Phase 2: After Creation (Post-Creation)**
+- Resource is persisted
+- Controllers reconcile and populate `status.fieldErrors`
+- Errors available in `status.fieldErrors` array
+- Used for ongoing validation (e.g., external system changes after creation)
+
+### Client Implementation Patterns
+
+**Pre-Creation Validation with `dryRun`** can be used by any API client. This is the **default approach** for runtime validation:
+
+**Frontend (TypeScript)**:
+```typescript
+// Step 1: Validate before creation using dryRun (default approach)
+async function validateBeforeCreate(
+  spec: ResourceSpec
+): Promise<Array<{ field: string; message: string; type: string }>> {
+  try {
+    await api.createResource(spec, { dryRun: true });
+    return []; // No errors
+  } catch (error: any) {
+    // Extract field errors from the HTTP 422 response body
+    if (error.status === 422 && error.body?.details?.causes) {
+      return error.body.details.causes.map((cause: any) => ({
+        field: cause.field,
+        message: cause.message,
+        type: cause.type,
+      }));
+    }
+    throw error; // Re-throw unexpected errors
+  }
+}
+
+// Step 2: Create resource if validation passes
+async function createResource(spec: ResourceSpec) {
+  // Validate first using dryRun
+  const errors = await validateBeforeCreate(spec);
+  if (errors.length > 0) {
+    // Display errors in form without creating resource
+    errors.forEach(error => {
+      setFieldError(mapFieldPathToFormField(error.field), error.message);
+    });
+    return;
+  }
+
+  // Create resource
+  const resource = await api.createResource(spec);
+
+  // Step 3: Read status.fieldErrors for ongoing validation
+  // (e.g., if external system state changes after creation)
+  resource.status.fieldErrors?.forEach(error => {
+    setFieldError(mapFieldPathToFormField(error.field), error.detail);
+  });
+}
+```
+
+**kubectl**:
+```bash
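+# Validate before creating (server-side dry run, so admission webhooks run;
+# --dry-run=client never sends the request to the API server)
+kubectl apply --dry-run=server -f repository.yaml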
+
+# If validation passes, create the resource
+kubectl apply -f repository.yaml
+
+# Check status.fieldErrors after creation
+kubectl get repository my-repo -o jsonpath='{.status.fieldErrors}'
+```
+
+**curl/API Client**:
+```bash
+# Validate before creating (dryRun)
+curl -X POST \
+ "https://api.example.com/apis/provisioning.grafana.app/v0alpha1/repositories?dryRun=All" \
+ -H "Content-Type: application/json" \
+ -d @repository.json
+
+# If validation passes (no 422 error), create the resource
+curl -X POST \
+ "https://api.example.com/apis/provisioning.grafana.app/v0alpha1/repositories" \
+ -H "Content-Type: application/json" \
+ -d @repository.json
+
+# Check status.fieldErrors after creation
+curl "https://api.example.com/apis/provisioning.grafana.app/v0alpha1/repositories/my-repo" \
+ | jq '.status.fieldErrors'
+```
+
+**Why Both Phases?**
+
+1. **`dryRun` (Phase 1)**: Prevents creating invalid resources, provides immediate feedback to any API client (frontend, kubectl, CLI tools, automation scripts)
+2. **Status `fieldErrors` (Phase 2)**: Handles errors that occur after creation (e.g., external system changes, transient failures, ongoing validation)
+
+### Admission Webhook Implementation
+
+To enable runtime validation during `dryRun`, implement validation in your admission webhook:
+
+```go
+func (v *AdmissionValidator) Validate(ctx context.Context, a admission.Attributes, o admission.ObjectInterfaces) error {
+ obj := a.GetObject()
+ if obj == nil {
+ return nil
+ }
+
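+ // Use a checked type assertion so an unexpected object type cannot panic the webhook
+ repo, ok := obj.(*provisioning.Repository)
+ if !ok {
+ return apierrors.NewBadRequest(fmt.Sprintf("expected *provisioning.Repository, got %T", obj))
+ }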
+
+ // Standard admission validation (format, structure, static rules)
+ if err := v.validateStructure(repo); err != nil {
+ return err
+ }
+
+ // If dryRun, also run runtime validation (external systems, internal state)
+ // This allows full validation without persisting the resource
+ if a.IsDryRun() {
+ if err := v.validateRuntime(ctx, repo); err != nil {
+ return err // Return errors immediately - resource won't be created
+ }
+ }
+
+ return nil
+}
+
+func (v *AdmissionValidator) validateRuntime(ctx context.Context, repo *provisioning.Repository) error {
+ var list field.ErrorList
+
+ // Run same validation checks that controller would do
+ // Check external systems
+ exists, err := v.gitClient.RepoExists(ctx, repo.Spec.GitHub.URL)
+ if err != nil {
+ list = append(list, field.Invalid(
+ field.NewPath("spec", "github", "url"),
+ repo.Spec.GitHub.URL,
+ fmt.Sprintf("failed to check if repository exists: %v", err),
+ ))
+ } else if !exists {
+ list = append(list, field.Invalid(
+ field.NewPath("spec", "github", "url"),
+ repo.Spec.GitHub.URL,
+ "repository not found",
+ ))
+ }
+
+ // Check branch exists
+ if exists {
+ branchRef := fmt.Sprintf("refs/heads/%s", repo.Spec.GitHub.Branch)
+ _, err := v.gitClient.GetRef(ctx, branchRef)
+ if err != nil {
+ if errors.Is(err, nanogit.ErrObjectNotFound) {
+ list = append(list, field.Invalid(
+ field.NewPath("spec", "github", "branch"),
+ repo.Spec.GitHub.Branch,
+ "branch not found",
+ ))
+ } else {
+ list = append(list, field.Invalid(
+ field.NewPath("spec", "github", "branch"),
+ repo.Spec.GitHub.Branch,
+ fmt.Sprintf("failed to check if branch exists: %v", err),
+ ))
+ }
+ }
+ }
+
+ if len(list) > 0 {
+ return apierrors.NewInvalid(
+ provisioning.RepositoryGroupVersionKind.GroupKind(),
+ repo.Name,
+ list,
+ )
+ }
+
+ return nil
+}
+```
+
+### Key Takeaways
+
+1. **`dryRun=All` prevents creation**: The resource is not persisted, so invalid objects are never created
+2. **Admission errors during `dryRun`**: Returned as HTTP 422 with field details in `details.causes`
+3. **`fieldErrors` are NOT in status during `dryRun`**: Controllers don't run, so status is empty
+4. **Two-phase validation**: Use `dryRun` for pre-creation validation, `status.fieldErrors` for post-creation validation
+5. **Clients extract errors from HTTP response**: During `dryRun`, errors come from the HTTP error response (HTTP 422), not from `status.fieldErrors`. This works for any API client (frontend, kubectl, curl, automation scripts)
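+
+As a minimal sketch of takeaway 5, a client might pull field-level errors out of a rejected `dryRun` response along these lines. The `KubeStatus` type models only the parts of the Kubernetes `Status` object used here, and `causesToFormErrors` and the sample payload are hypothetical, not part of the SDK.
+
+```typescript
+// Subset of the Kubernetes Status object returned with a failed request.
+interface StatusCause {
+  reason?: string; // machine-readable cause, e.g. "FieldValueInvalid"
+  message?: string; // human-readable description
+  field?: string; // path of the offending field, e.g. "spec.github.branch"
+}
+
+interface KubeStatus {
+  kind: "Status";
+  code: number;
+  details?: { causes?: StatusCause[] };
+}
+
+// Hypothetical helper: turn a 422 Status into a field path -> message map.
+function causesToFormErrors(status: KubeStatus): Record<string, string> {
+  const out: Record<string, string> = {};
+  if (status.code !== 422) {
+    return out; // not a validation failure, nothing to map
+  }
+  for (const cause of status.details?.causes ?? []) {
+    if (cause.field) {
+      out[cause.field] = cause.message ?? cause.reason ?? "invalid value";
+    }
+  }
+  return out;
+}
+
+// Example payload shaped like a dryRun rejection (values are illustrative).
+const rejected: KubeStatus = {
+  kind: "Status",
+  code: 422,
+  details: {
+    causes: [
+      { reason: "FieldValueInvalid", field: "spec.github.branch", message: "branch not found" },
+    ],
+  },
+};
+const formErrors = causesToFormErrors(rejected);
+console.log(formErrors);
+```
+
+Keying the result by field path lets the caller reuse the same path-to-form-field mapping it applies to `status.fieldErrors`.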
+
+### Comparison: `dryRun` vs `status.fieldErrors`
+
+| Aspect | `dryRun=All` | `status.fieldErrors` |
+| ---------------------- | ------------------------------ | ------------------------------ |
+| **When** | Before resource creation | After resource creation |
+| **Resource persisted** | ❌ No | ✅ Yes |
+| **Controllers run** | ❌ No | ✅ Yes |
+| **Error format** | HTTP 422 with `details.causes` | `status.fieldErrors` array |
+| **Use case** | Pre-creation validation | Post-creation validation |
+| **Prevents creation** | ✅ Yes | ❌ No (resource already exists) |
+| **Ongoing validation** | ❌ No | ✅ Yes (continuously updated) |
+
+## Custom Validation Endpoints Antipattern
+
+### Why Custom Validation Endpoints Are Problematic
+
+Creating separate endpoints for validation (such as `/test`, `/validate`, or `/check`) violates several important principles:
+
+#### 1. **Separation of Concerns**
+
+**Problem**: Custom validation endpoints create a separate API surface that duplicates validation logic.
+
+```mermaid
+graph LR
+ A[Frontend] -->|POST /resource/test| B[Validation Endpoint]
+ A -->|POST /resource| C[Create Endpoint]
+ B -->|Validation| D[Validator]
+ C -->|Validation| D
+ style B fill:#ff9999
+ style D fill:#99ff99
+```
+
+**Solution**: Use `dryRun=All` for pre-creation validation, then `status.fieldErrors` for ongoing validation.
+
+```mermaid
+graph LR
+ A[Frontend] -->|POST /resource?dryRun=All| B[Admission Validator]
+ B -->|Runtime Validation| C[External Systems]
+ B -->|HTTP 422 Errors| A
+ A -->|POST /resource| D[Create Endpoint]
+ D -->|Reconcile| E[Controller]
+ E -->|Update| F[Status.fieldErrors]
+ A -->|GET /resource| F
+ style B fill:#99ff99
+ style F fill:#99ff99
+```
+
+#### 2. **Stale Data**
+
+**Problem**: Test results are only as fresh as the last test call.
+
+```typescript
+// ❌ Antipattern: Custom validation endpoint
+const testResult = await api.post('/repositories/test', spec);
+// or: const validateResult = await api.post('/repositories/validate', spec);
+// testResult.errors may be stale if resource state changed
+
+// ✅ Correct: Status always current
+const repo = await api.get('/repositories/my-repo');
+// repo.status.fieldErrors always reflects current state
+```
+
+#### 3. **List Operations Don't Work**
+
+**Problem**: Can't see validation errors when listing resources.
+
+```typescript
+// ❌ Antipattern: Must call validation endpoint for each resource
+const repos = await api.get('/repositories');
+for (const repo of repos) {
+ const testResult = await api.post(`/repositories/${repo.name}/test`, repo.spec);
+ // or: const validateResult = await api.post(`/repositories/${repo.name}/validate`, repo.spec);
+ // Multiple API calls, inefficient
+}
+
+// ✅ Correct: Errors included in list response
+const repos = await api.get('/repositories');
+repos.forEach(repo => {
+ // repo.status.fieldErrors already available
+});
+```
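+
+To make the list case concrete, here is a small sketch that groups errors by resource from a single list response. It assumes list items expose `metadata.name` and an optional `status.fieldErrors` array; `RepositoryItem` and `collectFieldErrors` are hypothetical names used only for illustration.
+
+```typescript
+interface FieldError {
+  type: string;
+  field: string;
+  detail: string;
+}
+
+// Assumed shape of one item in the list response.
+interface RepositoryItem {
+  metadata: { name: string };
+  status?: { fieldErrors?: FieldError[] };
+}
+
+// Return only the resources that currently report field errors, keyed by name.
+function collectFieldErrors(items: RepositoryItem[]): Record<string, FieldError[]> {
+  const byName: Record<string, FieldError[]> = {};
+  for (const item of items) {
+    const errors = item.status?.fieldErrors ?? [];
+    if (errors.length > 0) {
+      byName[item.metadata.name] = errors;
+    }
+  }
+  return byName;
+}
+
+// Illustrative list: one healthy, one not yet reconciled, one failing.
+const listed: RepositoryItem[] = [
+  { metadata: { name: "healthy-repo" }, status: { fieldErrors: [] } },
+  { metadata: { name: "pending-repo" } },
+  {
+    metadata: { name: "broken-repo" },
+    status: {
+      fieldErrors: [
+        { type: "FieldValueInvalid", field: "spec.github.branch", detail: "branch not found" },
+      ],
+    },
+  },
+];
+const failing = collectFieldErrors(listed);
+console.log(Object.keys(failing));
+```
+
+No extra request per resource is needed, because the errors travel with each listed object.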
+
+#### 4. **CLI Tools Complexity**
+
+**Problem**: CLI tools must make separate validation endpoint calls, requiring knowledge of custom endpoints.
+
+```bash
+# ❌ Antipattern: Must call separate validation endpoint
+kubectl apply -f repo.yaml
+# Now need to manually call validation endpoint
+curl -X POST https://grafana.example.com/apis/provisioning.grafana.app/v0alpha1/namespaces/default/repositories/my-repo/test \
+ -H "Content-Type: application/json" \
+ -d @repo.yaml
+# Or use a custom CLI tool that wraps this endpoint
+
+# ✅ Correct: Validate before creation using a server-side dry run, then check status
+kubectl apply --dry-run=server -f repo.yaml # Validate first
+kubectl apply -f repo.yaml # Create if validation passes
+kubectl get repository my-repo -o jsonpath='{.status.fieldErrors}' # Check ongoing validation
+```
+
+#### 5. **Inconsistent Patterns**
+
+**Problem**: Different resources use different patterns.
+
+```typescript
+// ❌ Antipattern: Inconsistent validation endpoints
+if (resourceType === 'repository') {
+ await api.post('/repositories/test', spec);
+} else if (resourceType === 'connection') {
+ await api.post('/connections/validate', spec); // Different endpoint name!
+} else if (resourceType === 'dashboard') {
+ await api.post('/dashboards/check', spec); // Yet another name!
+} else {
+ // No validation endpoint? Check status?
+}
+
+// ✅ Correct: Unified pattern
+const resource = await api.get(`/${resourceType}/${name}`);
+const errors = resource.status.fieldErrors || [];
+```
+
+#### 6. **Not Discoverable**
+
+**Problem**: Custom validation endpoints are separate subresources that must be discovered. Different resources may use different endpoint names (`/test`, `/validate`, `/check`, etc.), making them hard to find.
+
+```typescript
+// ❌ Antipattern: Must know about validation endpoint
+// Is it /test? /validate? /check? Not obvious from OpenAPI spec
+// Different resources may use different names
+
+// ✅ Correct: Errors are part of standard status
+// Always available, always discoverable, consistent naming
+```
+
+### Migration from Custom Validation Endpoints
+
+If you have existing custom validation endpoints (e.g., `/test`, `/validate`, `/check`):
+
+1. **Add `fieldErrors` to status**: Implement controller logic to populate `fieldErrors`
+2. **Mark endpoint as deprecated**: Update OpenAPI spec with deprecation notice
+3. **Update documentation**: Point users to `status.fieldErrors`
+4. **Frontend migration**: Update frontend to use status instead of validation endpoint
+5. **Remove endpoint**: After migration period, remove the validation endpoint
+
+## Benefits Summary
+
+| Aspect | Custom Validation Endpoints | `fieldErrors` in Status |
+| ------------------------ | --------------------------------------------- | --------------------------- |
+| **List operations** | ❌ Requires separate calls | ✅ Included automatically |
+| **Real-time updates** | ❌ Only when called | ✅ Continuously updated |
+| **CLI tools** | ❌ Complex, separate calls | ✅ Standard kubectl patterns |
+| **Frontend** | ⚠️ Works but awkward | ✅ Natural integration |
+| **Discoverability** | ❌ Separate endpoint, inconsistent naming | ✅ Part of standard status |
+| **Consistency** | ❌ Resource-specific, different endpoint names | ✅ Unified pattern |
+| **Kubernetes alignment** | ❌ Custom pattern | ✅ Follows conventions |
+
+## Checklist for New Resources
+
+When creating a new resource that needs validation:
+
+### Admission Validation (Static Rules)
+- [ ] Implement admission validator for format/structure/type validation
+- [ ] Validate required fields, enums, and data types in admission
+- [ ] Use OpenAPI schema for syntactic validation
+- [ ] Return admission errors before resource is created/updated
+
+### Status fieldErrors (External/Dynamic)
+- [ ] Add `fieldErrors []ErrorDetails` to resource status
+- [ ] Implement controller logic to populate `fieldErrors` during reconciliation
+- [ ] **Only use `fieldErrors` for external system or dynamic state validation**
+- [ ] **Do NOT use `fieldErrors` for format validation or static rules** (use admission instead)
+- [ ] Map validation errors to field paths (e.g., `spec.github.branch`)
+- [ ] Include `Type`, `Field`, and `Detail` in all errors
+- [ ] Add integration tests verifying `fieldErrors` are populated
+- [ ] Document field path mappings for frontend
+- [ ] **Do NOT** create custom validation endpoints (e.g., `/test`, `/validate`, `/check`)
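+
+For the integration-test item, a check like the one below could assert that every reported error carries the three required fields. This is only a sketch: it assumes the JSON keys are the lowercase `type`, `field`, and `detail`, and `findIncompleteFieldErrors` is a hypothetical helper rather than an SDK function.
+
+```typescript
+interface FieldError {
+  type?: string;
+  field?: string;
+  detail?: string;
+  origin?: string; // optional, so it is not checked below
+}
+
+const requiredKeys = ["type", "field", "detail"] as const;
+
+// Returns a list of problems; an empty list means every entry is complete.
+function findIncompleteFieldErrors(errors: FieldError[]): string[] {
+  const problems: string[] = [];
+  errors.forEach((error, index) => {
+    for (const key of requiredKeys) {
+      if (!error[key]) {
+        problems.push(`fieldErrors[${index}] is missing "${key}"`);
+      }
+    }
+  });
+  return problems;
+}
+
+// Illustrative input: the second entry has no detail.
+const problems = findIncompleteFieldErrors([
+  { type: "FieldValueInvalid", field: "spec.github.branch", detail: "branch not found" },
+  { type: "FieldValueInvalid", field: "spec.github.url" },
+]);
+console.log(problems);
+```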
+
+## References
+
+- [Kubernetes API Conventions - Status](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status)
+- [Kubernetes API Conventions - Resources](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#resources)
+- Related PR: [Add fieldErrors to Connection and Repository status](https://github.com/grafana/grafana/pull/116662)
diff --git a/docs/architecture/version-validation-slack-explanation.md b/docs/architecture/version-validation-slack-explanation.md
new file mode 100644
index 000000000..8eac6962f
--- /dev/null
+++ b/docs/architecture/version-validation-slack-explanation.md
@@ -0,0 +1,81 @@
+# Version-Specific Validation Problem Scenario
+
+## The Problem
+
+When we change validation rules in the backend, frontend applications that have hardcoded validation logic can become out of sync. This creates a mismatch where frontend and backend enforce different rules.
+
+## Example Scenario: Branch Name Validation
+
+**Initial Backend Rules**: Branch names must match pattern `^[a-z0-9-]+$` (only lowercase letters, numbers, and hyphens)
+- Users can enter: `feature-branch`, `main`, `v1-0-0`
+- Users cannot enter: `feature.user.login`, `user_login` (dots and underscores not allowed)
+
+**Backend Changes**: We update the backend to allow dots and underscores in branch names → pattern becomes `^[a-z0-9._-]+$`
+- Users can now enter: `feature.user.login`, `user_login`, `feature-branch`
+- The text/format allowed in the input field changes
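+
+The difference between the two rule sets can be checked directly against the branch names above. The snippet is a small illustration only; the variable names are made up for this example.
+
+```typescript
+// Patterns copied from the initial and updated backend rules.
+const oldPattern = /^[a-z0-9-]+$/;
+const newPattern = /^[a-z0-9._-]+$/;
+
+const samples = ["feature-branch", "main", "v1-0-0", "feature.user.login", "user_login"];
+
+// Evaluate every sample against both rule sets.
+const results = samples.map((name) => ({
+  name,
+  oldOk: oldPattern.test(name),
+  newOk: newPattern.test(name),
+}));
+
+for (const r of results) {
+  console.log(`${r.name}: old=${r.oldOk} new=${r.newOk}`);
+}
+```
+
+Only `feature.user.login` and `user_login` change outcome, which is exactly the input a frontend with the old rule hardcoded would keep rejecting.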
+
+## What Happens
+
+**Scenario 1: Backend changed, frontend not updated**
+- Backend now allows `feature.user.login` (dots allowed), so the set of valid input has changed
+- Frontend still has old validation rules hardcoded (rejects `feature.user.login`)
+- **Result**: User enters valid text in input field, but frontend rejects it before submission → user confusion
+- **User sees**: Frontend error message like "Branch name can only contain lowercase letters, numbers, and hyphens" (outdated message)
+- **What should happen**: Backend accepts `feature.user.login` and resource is created successfully
+
+**Scenario 2: Frontend accepts, backend rejects**
+- Frontend doesn't validate dots, accepts `feature.user.login` (user enters text in input field)
+- Backend still enforces old rules, rejects `feature.user.login` (dots not allowed)
+- **Result**: User enters text in input field, submits form, backend rejects it → confusion
+- **User sees**: HTTP 422 error with message like "Repository 'my-repo' is invalid" and details:
+ ```json
+ {
+ "details": {
+ "causes": [
+ {
+ "reason": "FieldValueInvalid",
+ "field": "spec.github.branch",
+ "message": "branch name must match pattern ^[a-z0-9-]+$ for API version v1alpha1"
+ }
+ ]
+ }
+ }
+ ```
+- **User experience**: User entered text, clicked submit, got error after submission → poor UX
+
+## The Core Issue
+
+When we change validation rules in the backend:
+- The text/format allowed in input fields changes
+- Frontend with hardcoded validation becomes out of sync
+- Users get confusing errors when frontend and backend don't match
+- Users may enter text that frontend accepts but backend rejects (or vice versa)
+- Duplicating validation logic in frontend creates maintenance burden
+- Frontend must be updated every time we change backend validation rules
+
+## Translation/Details Challenge
+
+The problem is about:
+- **Details**: Which validation rules does the backend currently enforce?
+- **Translation**: How do we ensure frontend validation matches backend validation when backend rules change?
+- **Error Messages**: What error text should be displayed to users? Should it mention the API version? Should it show the pattern?
+- **fieldError Details**: When backend returns validation errors, they include:
+ ```json
+ {
+ "type": "FieldValueInvalid",
+ "field": "spec.github.branch",
+ "detail": "branch name must match pattern ^[a-z0-9-]+$ for API version v1alpha1"
+ }
+ ```
+ - **Type**: Machine-readable error type (FieldValueInvalid, FieldValueRequired, etc.)
+ - **Field**: JSON path to the field (e.g., `spec.github.branch`)
+ - **Detail**: Human-readable error message that should guide the user
+ - **Origin**: (Optional) Where the error originated (validator name, service, etc.)
+
+## The Solution
+
+Use version-specific APIs so that:
+- Different API versions can have different validation rules
+- Backend validates based on the API version in the request
+- Frontend doesn't need to duplicate validation logic - it relies on backend validation
+- When backend changes rules, we introduce a new API version rather than changing existing ones
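+
+One way to picture version-specific rules is a lookup keyed by API version. The sketch below is illustrative and written in TypeScript for brevity; `v1alpha2` is an assumed name for the next version, and `branchPatterns` and `validateBranch` are hypothetical, not existing backend code.
+
+```typescript
+interface FieldError {
+  type: string;
+  field: string;
+  detail: string;
+}
+
+// Hypothetical rule table: each API version keeps its own branch-name pattern.
+const branchPatterns: Record<string, RegExp> = {
+  v1alpha1: /^[a-z0-9-]+$/,
+  v1alpha2: /^[a-z0-9._-]+$/, // assumed future version that allows dots and underscores
+};
+
+// Validate a branch name against the rules of the requested API version.
+function validateBranch(apiVersion: string, branch: string): FieldError | null {
+  const pattern = branchPatterns[apiVersion];
+  if (!pattern) {
+    throw new Error(`unknown API version: ${apiVersion}`);
+  }
+  if (pattern.test(branch)) {
+    return null;
+  }
+  return {
+    type: "FieldValueInvalid",
+    field: "spec.github.branch",
+    detail: `branch name must match pattern ${pattern.source} for API version ${apiVersion}`,
+  };
+}
+
+console.log(validateBranch("v1alpha1", "feature.user.login"));
+console.log(validateBranch("v1alpha2", "feature.user.login"));
+```
+
+Because existing versions keep their rules, a client pinned to `v1alpha1` sees the same behavior before and after the new version ships.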
diff --git a/docs/writing-a-reconciler.md b/docs/writing-a-reconciler.md
index ee9150601..2abd5a484 100644
--- a/docs/writing-a-reconciler.md
+++ b/docs/writing-a-reconciler.md
@@ -23,7 +23,7 @@ When writing a reconciler, it's important to take a few things into consideratio
* If you make an update to the object you're doing the reconcile (or watch) event for, this will trigger _another_ reconcile (or watch) event. Generally, favor only updating subresources (specifically `status`) and some metadata in your reconcile (or watch) events, as a `status` update should not trigger the `metadata.generation` value to increase (only `metadata.resourceVersion`), which will allow you to filter events out. Using the `operator.OpinionatedWatcher` will filter these events for you, but you will need to track this yourself in a Reconciler; if you prefer not to use OpinionatedWatcher or want to do your own event filtering, keep in mind how updates within your reconcile loop will be received.
* The reconciler is taking action on _every_ consumed event. Finding ways to escape from a reconcile or watcher event early will help your overall program logic.
* All objects for the kind(s) you are watching are cached to memory by default. This can be customized by using a different informer implementation, such as `operator.CustomCacheInformer`. Custom informers can be used in `simple.App` with `AppConfig.InformerConfig.InformerSupplier`, or by using your own custom `app.App` implementation.
-* Don't rely on retries to track operator state; use the `status` subresource to track operator success/failure, so that your operator can work out state from a fresh start (a restart will remove all pending retries, which are stored purely in-memory). This also allows a user to track operator status by viewing the `status` subresource.
+* Don't rely on retries to track operator state; use the `status` subresource to track operator success/failure, so that your operator can work out state from a fresh start (a restart will remove all pending retries, which are stored purely in-memory). This also allows a user to track operator status by viewing the `status` subresource. For runtime validation errors (e.g., external system checks, dynamic state validation), populate `fieldErrors` in the status. See [Validation Patterns](./architecture/validation-patterns.md) for the recommended convention.
* If your reconcile process makes requests for other resources, consider caching, as high-traffic objects may cause your application to have to make these requests extremely frequently.
* If your operator has a watcher or reconciler that updates the resource in a deterministic way (such as adding a label based on the spec), consider adding mutation for the kind on your App instead, as it makes that process synchronous and will never leave the object in an intermediate state (and reduces calls to the API server from your operator). Mutation can be added for a kind in `simple.App` with `AppConfig.ManagedKinds[].Mutator`, or by implementing the behavior in `Mutate` if you're using a custom `app.App` implementation (don't forget to add mutation in your manifest as well).
* When you have multiple versions of a kind, your reconciliation should only deal with one of them (typically the latest), as events are always issued for any version as the version requested by the operator's watch (so a user creating a `v1` version of a resource will still produce a `v2` version of that resource in a watch request for the `v2` of the kind).