Add automation service subchart #464

Merged
malhotra5 merged 29 commits into main from add-automation-chart on Mar 26, 2026
@malhotra5 malhotra5 commented Mar 19, 2026

Summary

This PR adds the automation helm chart that was previously located in the deploy repo (deploy/automation/chart). This follows the same pattern as the openhands chart, centralizing helm charts in the OpenHands-Cloud repository.

Changes

New Automation Chart (charts/automation/)

  • Chart.yaml: Defines the automation chart with PostgreSQL dependency
  • values.yaml: Default values including:
    • Image configuration
    • Deployment resources and replicas
    • Service account settings
    • Probes (startup, liveness, readiness)
    • Ingress configuration
    • Database settings (SQLite or PostgreSQL)
    • GCP Cloud SQL support
    • Datadog integration
  • Templates:
    • _env.yaml: Environment variable helper template
    • deployment.yaml: Kubernetes deployment with init containers for migrations
    • service.yaml: ClusterIP service exposing port 80
    • ingress.yaml: Optional ingress with TLS support
    • service-account.yaml: Optional service account creation
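
As a rough illustration of the service.yaml item above, a ClusterIP Service exposing port 80 might render along these lines. The selector label and the container port 8000 are assumptions made for this sketch and are not taken from the chart itself.

```yaml
# Sketch only: label and targetPort values are assumed, not copied from the chart.
apiVersion: v1
kind: Service
metadata:
  name: automation
spec:
  type: ClusterIP
  selector:
    app: automation        # assumed pod label
  ports:
    - name: http
      port: 80             # port exposed inside the cluster
      targetPort: 8000     # assumed container port
      protocol: TCP
```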

Workflow Updates

  • publish-helm-charts.yml: Added automation chart to the publish matrix
  • preview-helm-charts.yml: Added automation chart to detect-changes, publish-charts, and lint-and-test jobs
  • validate-chart-versions.yml: Added automation-publishable output

Testing

After this PR is merged, the deploy repo will be updated to:

  1. Reference the automation chart from oci://ghcr.io/all-hands-ai/helm-charts/automation
  2. Use a shared CLOUD_CHART_SPEC environment variable for both openhands and automation charts
  3. Add CI validation to ensure released versions are used on main branch

Related

Related to APP-1001

Related to ALL-5577

This is part of a larger effort to move helm charts from the deploy repo to the OpenHands-Cloud repo for better organization and version management.

openhands-agent and others added 10 commits March 19, 2026 19:33
- Add automation helm chart with deployment, service, ingress, and service account templates
- Support for PostgreSQL database (ephemeral or GCP Cloud SQL)
- Datadog integration for monitoring
- Update publish-helm-charts workflow to include automation chart
- Update preview-helm-charts workflow to include automation chart
- Update validate-chart-versions workflow to include automation-publishable output

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Add AUTOMATION_SERVICE_KEY environment variable configuration to both charts:

- automation chart: Add serviceKeyFromSecret config for authenticating
  requests from OpenHands SaaS to the automation service
- openhands chart: Add automationServiceKey config for providing the
  service key to the OpenHands app

Both are disabled by default and can be enabled via values overrides.

Co-authored-by: openhands <openhands@all-hands.dev>
The SaaS app expects AUTOMATIONS_SERVICE_KEY (plural) while the
automation service expects AUTOMATION_SERVICE_KEY (singular).
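
A minimal sketch of how the two differently named variables could be wired to the same Secret. The Secret name and key below are hypothetical; only the two environment variable names come from the commit message.

```yaml
# automation chart, container env fragment: singular variable name
env:
  - name: AUTOMATION_SERVICE_KEY
    valueFrom:
      secretKeyRef:
        name: automation-service-key   # hypothetical Secret name
        key: service-key               # hypothetical key
---
# openhands chart, container env fragment: plural variable name
env:
  - name: AUTOMATIONS_SERVICE_KEY
    valueFrom:
      secretKeyRef:
        name: automation-service-key   # same hypothetical Secret
        key: service-key
```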

Co-authored-by: openhands <openhands@all-hands.dev>
Uncomment the serviceKeyFromSecret configuration so AUTOMATION_SERVICE_KEY
env var is set by default when the secret exists.

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Bump version to trigger preview chart publish for PR testing.

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
- Add gcs.enabled and gcs.bucket configuration in values.yaml
- Add GCS_BUCKET_NAME env var when gcs.enabled is true

Co-authored-by: openhands <openhands@all-hands.dev>
@malhotra5 malhotra5 force-pushed the add-automation-chart branch from d006149 to c65e542 on March 20, 2026 15:13
- Add gcsEmulator configuration in values.yaml with fake-gcs-server settings
- Add STORAGE_EMULATOR_HOST and GCS_BUCKET_NAME env vars when emulator is enabled
- Add fake-gcs-server as sidecar container in deployment when enabled
- Uses in-memory backend for completely ephemeral storage
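
The sidecar arrangement described above could look roughly like the pod spec fragment below. Image tags, the bucket name, and port 4443 are placeholders chosen for illustration; the flags shown are fake-gcs-server's in-memory backend and plain HTTP options.

```yaml
# Fragment of a Deployment pod spec; tags, bucket name, and port are placeholders.
containers:
  - name: automation
    image: ghcr.io/openhands/automation:latest   # placeholder tag
    env:
      - name: STORAGE_EMULATOR_HOST
        value: "http://localhost:4443"           # sidecar shares the pod network
      - name: GCS_BUCKET_NAME
        value: "automation-dev"                  # hypothetical bucket name
  - name: fake-gcs-server
    image: fsouza/fake-gcs-server:latest         # placeholder tag
    args: ["-scheme", "http", "-backend", "memory", "-port", "4443"]
    ports:
      - containerPort: 4443
```

Because the backend is in memory, anything written to the emulator is lost when the pod restarts.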

Co-authored-by: openhands <openhands@all-hands.dev>
Add openhandsApi config to dynamically construct the API base URL:
- baseUrl: explicit URL (for staging/production)
- host + prefixWithBranch: auto-construct https://{branch}.{host} (for feature envs)

This eliminates the need to hardcode branch-specific URLs in feature environment values.
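
One way the URL selection described above could be expressed is a named template like the following. The helper name and the `.Values.branch` value are assumptions for this sketch; only `baseUrl`, `host`, and `prefixWithBranch` appear in the commit message.

```yaml
{{/* Hypothetical helper: name and .Values.branch are assumed for illustration. */}}
{{- define "automation.openhandsApiBaseUrl" -}}
{{- if .Values.openhandsApi.baseUrl -}}
{{ .Values.openhandsApi.baseUrl }}
{{- else if .Values.openhandsApi.prefixWithBranch -}}
https://{{ .Values.branch }}.{{ .Values.openhandsApi.host }}
{{- else -}}
https://{{ .Values.openhandsApi.host }}
{{- end -}}
{{- end -}}
```

An explicit `baseUrl` wins when set, so staging and production values can bypass the branch logic entirely.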

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
@malhotra5 malhotra5 force-pushed the add-automation-chart branch from 0591644 to 1995b74 on March 23, 2026 23:38
malhotra5 and others added 3 commits March 23, 2026 19:39
…backs

The automation service needs its own public URL to construct callback URLs
that sandboxes use to report completion status. Without this, the service
falls back to http://localhost:8000 which is unreachable from sandboxes.

The URL is auto-constructed from ingress settings:
- With prefixWithBranch: https://automation-{branch}.{host}
- Without prefix: https://{host}

Co-authored-by: openhands <openhands@all-hands.dev>

👋 Hi! This is OpenHands assisting @jpshackelford with reviewing this PR.

After analyzing the PR and discussing requirements, we'd like to share some considerations for the automation chart integration:

Helm-Based Install Requirements

For the standard helm-based install, we'd like the automation chart to follow the subchart pattern used by runtime-api rather than being a standalone chart. This means:

Changes Needed to charts/openhands/Chart.yaml

Add automation as a conditional dependency:

dependencies:
  # ... existing dependencies ...
  - name: automation
    repository: oci://ghcr.io/all-hands-ai/helm-charts
    version: 0.1.x
    condition: automation.enabled

Changes Needed to charts/openhands/values.yaml

Add a nested automation: configuration block (similar to runtime-api:):

automation:
  enabled: false  # Disabled by default, users enable via flag
  
  image:
    repository: ghcr.io/openhands/deploy-automation
    # tag: ...
  
  # Share parent's PostgreSQL instead of deploying separate instance
  postgresql:
    enabled: false
    postMigrate: false  # Set true when parent chart owns PostgreSQL
  
  database:
    host: oh-main-postgresql
    port: "5432"
    user: postgres
    name: automations
    secretName: postgres-password
    secretKey: password
  
  ingress:
    enabled: false
    class: traefik
  
  # ... other overridable values

Database Initialization

Add the automations database to the PostgreSQL init scripts ConfigMap so it's created alongside other databases.
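
A sketch of what such an init-script entry might look like, assuming the PostgreSQL image runs scripts from its init directory on first start. The ConfigMap name and script filename are hypothetical.

```yaml
# Hypothetical ConfigMap; name and filename are illustrative only.
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgresql-init-scripts
data:
  create-automations-db.sql: |
    -- Init scripts typically run only when the data directory is first created.
    CREATE DATABASE automations;
```

Existing installations with an already initialized data directory would likely need the database created some other way.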

Consolidate Service Key Configuration

The existing automationServiceKey section could be consolidated into the automation: subchart config block for cleaner organization.


Replicated Install Considerations

For Replicated deployments, there are additional constraints:

  1. No values file access — Users cannot configure via a values file directly. Any configurable parameters must be surfaced either in:

    • A superadmin UI within the product, OR
    • The Replicated installer graphical UI
  2. Identify configurable parameters — We should document which automation settings need to be user-configurable and ensure they're exposed through the appropriate UI.

  3. Subdomain routing — Ideally, we'd avoid requiring a new subdomain for the automation service. If possible, route requests through the existing app server (e.g., /api/automations/*) rather than requiring automation.example.com.

  4. Chart templatization — Additional work will be needed to templatize the chart and values file for the Replicated context.
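
For the Replicated context, each packaged chart archive is usually paired with a HelmChart custom resource. A minimal sketch for the automation chart might look like this; the empty `values` block and the use of the `kots.io/v1beta2` API version are assumptions about the setup, not details from this PR.

```yaml
# Sketch of a Replicated HelmChart manifest; values mapping is left empty here.
apiVersion: kots.io/v1beta2
kind: HelmChart
metadata:
  name: automation
spec:
  chart:
    name: automation       # must match the chart archive name
    chartVersion: 0.1.0    # must match the chart archive version
  values: {}
```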


Summary

| Install Method | Requirement |
| --- | --- |
| Helm | Subchart of openhands chart with automation.enabled flag |
| Replicated | Config via UI, prefer app server routing over new subdomain |

Happy to discuss further or help with implementation!

- Add automation as dependency in openhands Chart.yaml
- Add automation configuration block to openhands values.yaml
- Create ingress-automation.yaml for /automation subpath routing
- Create middleware-automation.yaml for Traefik path stripping
- Fix automation image repository to ghcr.io/openhands/automation

Co-authored-by: openhands <openhands@all-hands.dev>
- Bump openhands chart version to 0.2.18
- Fix automation dependency to use specific version 0.1.0 (OCI doesn't support wildcards)
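
With the pinned version, the dependency entry in charts/openhands/Chart.yaml would read roughly as follows. This is a fragment assembled from values mentioned in this thread; other dependencies are omitted.

```yaml
# charts/openhands/Chart.yaml (fragment)
dependencies:
  - name: automation
    repository: oci://ghcr.io/all-hands-ai/helm-charts
    version: 0.1.0                  # exact version instead of 0.1.x
    condition: automation.enabled
```

Running `helm dependency update charts/openhands` should then pull the automation chart at that exact version.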

Co-authored-by: openhands <openhands@all-hands.dev>
@malhotra5 malhotra5 force-pushed the add-automation-chart branch from fe32ea8 to f41816c on March 25, 2026 17:05
- Remove charts/automation/templates/ingress.yaml entirely
- Simplify _env.yaml to require parent chart to provide URLs
- Add automationBaseUrl value for callback URL configuration
- Simplify openhandsApi to just baseUrl (no dynamic construction)
- Update openhands chart values to document subpath routing

Co-authored-by: openhands <openhands@all-hands.dev>
- Add host and prefixWithBranch values to automation chart
- Update _env.yaml to construct URLs with /automation subpath
- URLs are now: https://{host}/automation or https://{branch}.{host}/automation
- Remove automationBaseUrl and openhandsApi.baseUrl in favor of host config

Co-authored-by: openhands <openhands@all-hands.dev>
- Reorder matrix so automation and runtime-api are published before openhands
- Add step to update automation dependency version in openhands Chart.yaml
- This ensures the automation preview chart is available when openhands tries to download it
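
A hedged sketch of how the ordering and the version update step might be arranged. The job name, step name, `max-parallel` setting, and the `AUTOMATION_VERSION` variable are all assumptions for illustration; the actual workflow may enforce ordering differently.

```yaml
# Hypothetical workflow fragment; names and the ordering mechanism are assumed.
jobs:
  publish-charts:
    runs-on: ubuntu-latest
    strategy:
      max-parallel: 1          # run matrix entries one at a time
      matrix:
        chart: [automation, runtime-api, openhands]
    steps:
      - uses: actions/checkout@v4
      - name: Point openhands at the automation preview version
        if: matrix.chart == 'openhands'
        env:
          AUTOMATION_VERSION: 0.1.0   # placeholder for the preview version
        run: |
          yq -i '(.dependencies[] | select(.name == "automation") | .version) = strenv(AUTOMATION_VERSION)' charts/openhands/Chart.yaml
```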

Co-authored-by: openhands <openhands@all-hands.dev>
The automation subchart creates a service named 'automation' (hardcoded),
not '{{ .Release.Name }}-automation'. Update the ingress to use the correct
service name.
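
For illustration, an Ingress rule pointing at the hardcoded service name could look like this. The Ingress name and host are placeholders; the path reflects the /automation subpath in use at this point in the PR.

```yaml
# Sketch only: metadata name and host are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: openhands-automation
spec:
  ingressClassName: traefik
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /automation
            pathType: Prefix
            backend:
              service:
                name: automation   # hardcoded by the subchart, not release-prefixed
                port:
                  number: 80
```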

Co-authored-by: openhands <openhands@all-hands.dev>
- Add database.useSharedPostgres option to use parent's PostgreSQL instance
- When enabled, constructs DB host from branchSanitized (openhands-{branch}-postgresql)
- Add initContainer to create automation database and user in shared PostgreSQL
- Keeps separate PostgreSQL subchart option for standalone deployments
- Staging/production continue to use Cloud SQL (useSharedPostgres=false)

This saves resources in feature environments by sharing PostgreSQL with Keycloak
while maintaining separate databases and users for isolation.

Co-authored-by: openhands <openhands@all-hands.dev>
Update ingress, middleware, and URL construction to serve automation
under /api/automation instead of /automation.
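
A minimal sketch of Traefik path stripping for the new prefix, so the automation service still sees requests rooted at `/`. The Middleware name and namespace are hypothetical, and older Traefik releases use the `traefik.containo.us/v1alpha1` API group instead.

```yaml
# Hypothetical Middleware; name and namespace are illustrative.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: automation-strip-prefix
  namespace: openhands
spec:
  stripPrefix:
    prefixes:
      - /api/automation
---
# Ingress metadata fragment referencing the middleware as <namespace>-<name>@kubernetescrd
metadata:
  annotations:
    traefik.ingress.kubernetes.io/router.middlewares: openhands-automation-strip-prefix@kubernetescrd
```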

Co-authored-by: openhands <openhands@all-hands.dev>
For both feature and staging environments, the automation service should
dispatch sandbox runs to the main staging environment (staging.all-hands.dev),
not branch-specific deployments.

- Feature env: https://staging.all-hands.dev (not https://{branch}.staging.all-hands.dev)
- Staging env: https://staging.all-hands.dev
- Production env: https://app.all-hands.dev

The AUTOMATION_BASE_URL (callback URL) still respects prefixWithBranch
since callbacks need to reach the specific automation instance.

Co-authored-by: openhands <openhands@all-hands.dev>
@malhotra5 malhotra5 changed the title from "DRAFT: Add automation chart from deploy repo" to "Add automation service subchart" on Mar 25, 2026
@malhotra5 malhotra5 marked this pull request as ready for review March 25, 2026 20:44
@xingyaoww xingyaoww requested a review from all-hands-bot March 26, 2026 10:14
Comment thread charts/automation/templates/_env.yaml Outdated

@xingyaoww xingyaoww left a comment

LGTM, but I would love a second pair of eyes from someone with more Helm chart expertise.

Comment thread charts/automation/templates/_env.yaml
Comment thread charts/automation/templates/deployment.yaml Outdated
Comment thread charts/automation/values.yaml Outdated
Comment thread charts/automation/values.yaml Outdated
Comment thread charts/automation/values.yaml Outdated
@jlav

jlav commented Mar 26, 2026

I left some blocking comments but if the goal of this PR is just to move the chart from one place to another, I'm totally happy with just creating issues that we can follow up on later.

@xingyaoww

@OpenHands please read all of @jlav's review comments above, then open a new PR following https://docs.openhands.dev/openhands/usage/use-cases/code-review that adds .agents/skills/code-review.md, encapsulating the core code review principles you can learn from jlav's comments for this repo.

Also in that PR, add the same workflow as https://github.com/OpenHands/software-agent-sdk/blob/main/.github/workflows/pr-review-by-openhands.yml to this repo (expect the env vars to already be set).

@openhands-ai

openhands-ai Bot commented Mar 26, 2026

I'm on it! xingyaoww can track my progress at all-hands.dev

xingyaoww pushed a commit that referenced this pull request Mar 26, 2026
- Add .agents/skills/code-review.md with repo-specific Helm chart review
  guidelines distilled from PR #464 review feedback
- Add .github/workflows/pr-review-by-openhands.yml for automated PR reviews

Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai

openhands-ai Bot commented Mar 26, 2026

Summary of Work Completed

I created PR #482 (Add code review skill and automated PR review workflow) on a new branch add-code-review-skill from main, as requested in the comment on PR #464.

Checklist — Was the request fully addressed?

  • Read all of @jlav's review comments: fetched and analyzed all 5 review comments from jlav on PR #464 (env var de-duplication, deployment scaling, storage abstraction, clear toggle naming, simplifying conditional logic)
  • Created .agents/skills/code-review.md: distilled jlav's feedback into 6 core Helm chart-specific code review principles, following the format from the OpenHands code review docs and using the /codereview trigger to override default review behavior
  • Added .github/workflows/pr-review-by-openhands.yml: copied the exact same workflow from software-agent-sdk, including the same trigger conditions, concurrency settings, composite action reference, and secret references (assuming env vars are already set as instructed)
  • Opened a new PR (not modifying the existing PR #464): PR #482 targets main

Are the changes concise?

Yes — the PR contains exactly two new files with no extraneous changes:

  1. .agents/skills/code-review.md — repo-specific review guidelines
  2. .github/workflows/pr-review-by-openhands.yml — automated PR review workflow

No other files were modified, and no temporary or unnecessary files were created.

xingyaoww added a commit that referenced this pull request Mar 26, 2026
- Add .agents/skills/code-review.md with repo-specific Helm chart review
  guidelines distilled from PR #464 review feedback
- Add .github/workflows/pr-review-by-openhands.yml for automated PR reviews

Co-authored-by: openhands <openhands@all-hands.dev>
openhands-agent and others added 3 commits March 26, 2026 14:45
Changes made based on review feedback:

1. Implement env var de-dupe pattern (blocking)
   - Added automation.env.defaults and automation.env templates
   - Follows the same pattern as openhands chart
   - Prevents duplicate env var warnings and merge conflicts

2. Simplify host configuration (blocking)
   - Removed host/prefixWithBranch toggles
   - Added explicit openhandsApiUrl and automationBaseUrl
   - Consumers pass full URLs directly

3. Simplify database configuration (blocking)
   - Removed useSharedPostgres toggle
   - Added createDatabaseUser as a clear, single-purpose toggle
   - Consumers pass full database.host directly

4. Align storage with openhands chart pattern (non-blocking)
   - Removed gcs/gcsEmulator configuration
   - Added filestore config matching openhands pattern
   - Uses minio for ephemeral environments (S3-compatible)
   - Supports gcs, s3, or minio backends
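
Taken together, items 2 to 4 suggest consumer values along the lines below. The key names `openhandsApiUrl`, `automationBaseUrl`, `createDatabaseUser`, and `filestore` come from the commit message; their nesting, the `type` key, and all example values are assumptions.

```yaml
# Hypothetical values for the automation chart; nesting and values are assumed.
openhandsApiUrl: https://app.example.com                     # full URL passed by the consumer
automationBaseUrl: https://app.example.com/api/automation    # callback URL for sandboxes
database:
  host: oh-main-postgresql     # full host, no branch-based construction
  createDatabaseUser: true     # single-purpose toggle
filestore:
  type: minio                  # assumed key; one of gcs, s3, or minio
```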

Co-authored-by: openhands <openhands@all-hands.dev>
When automation is deployed as a subchart of openhands, it should use
the parent chart's minio instance instead of deploying its own.

Changes:
- Add minio.enabled condition (default false) to control subchart deployment
- Add minio.external config for pointing to external minio instance
- Update _env.yaml to handle both deployed and external minio
- Update openhands parent chart to configure automation with external minio

Deployment modes:
1. Standalone: Set minio.enabled=true to deploy minio as a subchart
2. Subchart of openhands: Set minio.enabled=false and use minio.external
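
The two modes might translate into values roughly as follows. The `endpoint` key under `minio.external` and its value are hypothetical; the commit message only names `minio.enabled` and `minio.external`.

```yaml
# Mode 1: standalone automation chart deploys its own minio
minio:
  enabled: true
---
# Mode 2: set from the openhands parent chart; automation reuses the parent's minio
automation:
  enabled: true
  minio:
    enabled: false
    external:
      endpoint: http://openhands-minio:9000   # hypothetical key and value
```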

Co-authored-by: openhands <openhands@all-hands.dev>

@jlav jlav left a comment

Thank you! Those updates look great 🎉

@malhotra5 malhotra5 merged commit 15af6e8 into main Mar 26, 2026
18 of 19 checks passed
@malhotra5 malhotra5 deleted the add-automation-chart branch March 26, 2026 16:28
jlav added a commit that referenced this pull request Mar 26, 2026
The automation chart was added in #464 but was missing its
Replicated HelmChart manifest, causing the linter to fail with
"Could not find helm chart manifest for archive automation-0.1.0.tgz".