@eberrigan
Collaborator

Summary

Simplifies DNS and SSL configuration by removing redundant fields and adding pre-deployment validation. This addresses a security vulnerability (subdomain takeover via dangling DNS records; CVE pending).

OpenSpec Change: simplify-domain-ssl-config

Changes

DNS Configuration Simplification

  • REMOVED: app_name, pattern, custom_subdomain, create_zone (redundant fields)
  • KEPT: enabled, terraform_managed, domain, zone_id
  • domain now accepts full domain name directly (supports sub-subdomains for environment separation)

SSL Configuration Enhancement

  • ADDED: certificate_arn field for AWS Certificate Manager (ACM) support
  • ADDED: "acm" as SSL provider option (for enterprise SSL with ALB)
  • REMOVED: staging field (Let's Encrypt now always uses production certificates)
  • Providers: "none", "letsencrypt", "cloudflare", "acm"

Configuration Validation

Added comprehensive pre-deployment validation rules:

  • SSL requires DNS to be enabled
  • DNS enabled requires non-empty domain field
  • Domain format validation (no leading/trailing dots)
  • Let's Encrypt requires email address
  • ACM requires certificate_arn
  • Cloudflare SSL requires terraform_managed=false
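
For reviewers, the rules above can be read as the following minimal Python sketch. It is not the code in this PR: the `DNSConfig` and `SSLConfig` dataclasses, the `validate_config` name, the `email` field name, and the error strings are all assumptions made for illustration. Only the field names `enabled`, `terraform_managed`, `domain`, `zone_id`, and `certificate_arn` and the provider values come from the description.

```python
from dataclasses import dataclass


@dataclass
class DNSConfig:
    # Field names follow the simplified schema described in this PR.
    enabled: bool = False
    terraform_managed: bool = False
    domain: str = ""
    zone_id: str = ""


@dataclass
class SSLConfig:
    provider: str = "none"  # "none", "letsencrypt", "cloudflare", or "acm"
    email: str = ""  # assumed field name for the Let's Encrypt contact address
    certificate_arn: str = ""


def validate_config(dns: DNSConfig, ssl: SSLConfig) -> list[str]:
    """Return human-readable errors; an empty list means the config passes."""
    errors = []
    if ssl.provider not in {"none", "letsencrypt", "cloudflare", "acm"}:
        errors.append(f"unknown SSL provider: {ssl.provider!r}")
    if ssl.provider != "none" and not dns.enabled:
        errors.append("SSL requires DNS to be enabled")
    if dns.enabled:
        if not dns.domain:
            errors.append("DNS enabled requires a non-empty domain")
        elif dns.domain.startswith(".") or dns.domain.endswith("."):
            errors.append("domain must not start or end with a dot")
    if ssl.provider == "letsencrypt" and not ssl.email:
        errors.append("Let's Encrypt requires an email address")
    if ssl.provider == "acm" and not ssl.certificate_arn:
        errors.append("ACM requires certificate_arn")
    if ssl.provider == "cloudflare" and dns.terraform_managed:
        errors.append("Cloudflare SSL requires terraform_managed=false")
    return errors
```

Collecting every error in a list, rather than raising on the first one, is a design choice of this sketch; it lets a single run report all problems in a config file.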

FQDN Environment Variable Priority

Updated get_allocator_url() to prioritize ALLOCATOR_FQDN environment variable (set by Terraform) over config-based URL construction. This provides a single source of truth for the allocator URL.
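
The priority order can be sketched roughly as follows. This is an illustration, not the actual `get_allocator_url()` body: the function name `resolve_allocator_url`, its parameters, the `fallback_host` argument, and the handling of a value with or without a scheme are assumptions. Only the `ALLOCATOR_FQDN` variable name and the "environment first, config second" order come from this PR.

```python
import os
from typing import Mapping


def resolve_allocator_url(
    dns_enabled: bool,
    domain: str,
    ssl_provider: str,
    fallback_host: str,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Hypothetical stand-in: prefer ALLOCATOR_FQDN, else build from config."""
    # Assumption: any SSL provider other than "none" implies HTTPS.
    scheme = "https" if ssl_provider != "none" else "http"

    fqdn = env.get("ALLOCATOR_FQDN", "").strip()
    if fqdn:
        # Assumption: the value may already carry a scheme; keep it if so.
        if fqdn.startswith(("http://", "https://")):
            return fqdn
        return f"{scheme}://{fqdn}"

    # Fallback: config-based construction, used only when the variable is unset.
    host = domain if dns_enabled and domain else fallback_host
    return f"{scheme}://{host}"
```

Passing `env` as a parameter keeps the sketch testable without mutating the real process environment.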

Test Coverage

  • Added 9 new validation tests covering all invalid configuration combinations
  • All 21 tests passing (12 existing + 9 new)
  • Coverage includes domain format validation, SSL/DNS dependencies, provider-specific requirements

Breaking Changes

⚠️ BREAKING: Removes DNS fields (app_name, pattern, custom_subdomain, create_zone)
⚠️ BREAKING: Removes SSL staging field

Migration Guide

# Before
dns:
  enabled: true
  app_name: "lablink"
  pattern: "subdomain"  # or "none"
  custom_subdomain: "test"
  domain: "sleap.ai"
  create_zone: false

# After
dns:
  enabled: true
  terraform_managed: false  # was inferred from create_zone=false
  domain: "test.lablink.sleap.ai"  # full domain
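
For anyone migrating several config files, the before/after mapping above could be scripted along these lines. This is a hedged sketch: `migrate_dns_config` is a hypothetical helper, not part of this PR. The join order (`custom_subdomain`, then `app_name`, then `domain`) is inferred from the single example above, and the handling of `pattern: "none"` and of `create_zone: true` is an assumption.

```python
def migrate_dns_config(old: dict) -> dict:
    """Convert a pre-change dns mapping to the simplified schema (sketch)."""
    labels = []
    if old.get("pattern") == "subdomain":
        # Inferred from the example: custom_subdomain.app_name.domain
        labels = [old.get("custom_subdomain"), old.get("app_name")]
    # Assumption: with pattern "none" the base domain is used unchanged.
    labels.append(old.get("domain"))

    # Drop empty parts and stray dots so the result has no leading/trailing dot.
    domain = ".".join(
        label.strip(".") for label in labels if label and label.strip(".")
    )

    new = {
        "enabled": bool(old.get("enabled", False)),
        # Assumption: an explicit terraform_managed wins; otherwise fall back
        # to create_zone, mirroring the "inferred from create_zone" comment.
        "terraform_managed": bool(
            old.get("terraform_managed", old.get("create_zone", False))
        ),
        "domain": domain,
    }
    if old.get("zone_id"):
        new["zone_id"] = old["zone_id"]
    return new
```

Review the output by hand before deploying, since the removed `pattern` field may have had behavior not covered by the one example shown here.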

Next Steps

Corresponding changes required in lablink-template repository:

  • Update Terraform DNS logic (remove pattern-based domain construction)
  • Add FQDN computation and environment variable passing
  • Update SSL/Caddy configuration
  • Add conditional ACM/ALB support

Test Plan

  • All unit tests passing
  • Validation tests cover all invalid configurations
  • Domain format validation (leading/trailing dots)
  • SSL/DNS dependency validation
  • Provider-specific validation (Let's Encrypt, ACM, Cloudflare)
  • Integration test with lablink-template changes

🤖 Generated with Claude Code

Co-Authored-By: Claude [email protected]

eberrigan and others added 3 commits November 4, 2025 16:17
Breaking changes to DNS and SSL configuration schema:

**DNS Configuration:**
- Remove redundant fields: app_name, pattern, custom_subdomain, create_zone
- Use single `domain` field for full domain (supports sub-subdomains)
- Simplify to: enabled, terraform_managed, domain, zone_id

**SSL Configuration:**
- Add ACM provider support (enterprise SSL via AWS Certificate Manager)
- Remove staging field (use dns.enabled=false for testing)
- Add certificate_arn field for ACM
- Providers: none, letsencrypt, cloudflare, acm

**FQDN Single Source of Truth:**
- Add ALLOCATOR_FQDN environment variable support
- Terraform computes FQDN and passes to allocator container
- Fixes issue #212 (dual source of truth between Terraform and allocator)

**Enhanced Validation:**
- SSL requires DNS enabled
- DNS enabled requires non-empty domain
- Domain cannot start/end with dots (fixes .lablink.sleap.ai bug)
- Provider-specific validation (email for Let's Encrypt, ARN for ACM)
- CloudFlare requires external DNS (terraform_managed=false)

**Tests:**
- Add 9 new validation tests covering all rules
- Update test fixtures to new schema
- All 21 validation tests passing

Addresses:
- Security disclosure: Subdomain takeover via dangling DNS records
- #200 (Subdomain bug persists)
- #212 (FQDN environment variable)
- talmolab/lablink-template#7 (Simplify DNS and SSL)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Add OpenSpec framework for managing architectural changes:
- OpenSpec instructions block in CLAUDE.md
- AGENTS.md with workflow documentation
- Change proposal for simplified DNS/SSL configuration
- Project context documentation

This establishes a structured process for proposing and tracking
breaking changes like the DNS/SSL configuration simplification.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Mark all completed lablink repo implementation tasks (sections 1-4):
- Configuration schema updates (DNS and SSL)
- Validation logic enhancement
- Allocator FQDN support
- All tests passing (21/21)

Sections 5-9 marked as future work (docs, release, lablink-template)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

eberrigan added a commit to talmolab/lablink-template that referenced this pull request Nov 13, 2025
This implements the approved OpenSpec change to simplify DNS and SSL
configuration, add ACM support, and fix security vulnerabilities.

Breaking Changes:
- Removed dns.app_name, dns.pattern, dns.custom_subdomain, dns.create_zone
- Removed ssl.staging
- dns.domain now accepts full domain (e.g., "test.lablink.example.com")
- ssl.provider now supports "acm" for AWS Certificate Manager

Infrastructure Changes:
- Updated main.tf to remove pattern-based DNS logic
- Added ALLOCATOR_FQDN computation and environment variable
- Created alb.tf for ACM/ALB support (conditional)
- Updated user_data.sh with conditional Caddy installation
- Added lifecycle hooks to Route53 records

Configuration Changes:
- Updated config.yaml with new schema
- Updated test.example.yaml with new schema
- Created 5 canonical use case examples:
  - ip-only.example.yaml (no DNS, no SSL)
  - cloudflare.example.yaml (CloudFlare DNS + SSL)
  - letsencrypt.example.yaml (Route53 + Let's Encrypt, Terraform-managed)
  - acm.example.yaml (Route53 + ACM via ALB)
  - letsencrypt-manual.example.yaml (Route53 + Let's Encrypt, manual DNS)

OpenSpec:
- Created proposal in openspec/changes/implement-simplified-dns-ssl/
- Added infrastructure spec deltas
- Created detailed tasks checklist

Related: talmolab/lablink#230

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

eberrigan added a commit to talmolab/lablink-template that referenced this pull request Nov 25, 2025
Updates all example config files and deployment workflow to use the new
simplified DNS/SSL schema.

Config Files Updated:
- ci-test.example.yaml: Removed app_name, pattern, custom_subdomain, create_zone, staging
- dev.example.yaml: Removed app_name, pattern, custom_subdomain, create_zone, staging
- example.config.yaml: Removed app_name, pattern, custom_subdomain, create_zone, staging
- prod.example.yaml: Removed app_name, pattern, custom_subdomain, create_zone, staging

All configs now use:
- dns.domain: Full domain name directly (no pattern construction)
- ssl.certificate_arn: For ACM support
- No ssl.staging: Let's Encrypt always uses production

Workflow Updated:
- terraform-deploy.yml: Removed pattern and custom_subdomain extraction
- Now only displays full domain when DNS enabled

Related: #17, talmolab/lablink#230

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

eberrigan added a commit to talmolab/lablink-template that referenced this pull request Nov 25, 2025
Removes old schema fields (app_name, pattern, custom_subdomain, staging)
from configuration reference documentation.

Changes:
- DNS: Now shows full domain specification (no pattern construction)
- SSL: Added ACM provider, removed staging field
- Clarified that domain is used exactly as specified

Related: #17, talmolab/lablink#230

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

eberrigan added a commit to talmolab/lablink-template that referenced this pull request Nov 25, 2025
Updates DEPLOYMENT_CHECKLIST.md and lablink-infrastructure/README.md
to reflect the new SSL configuration without staging mode.

Changes:
- Removed all references to ssl.staging field
- Updated SSL provider documentation to include ACM
- Clarified that Let's Encrypt always uses production certs
- Updated rate limit information (50 certs/week per domain)

This completes the removal of all old schema fields from documentation.

Related: #17, talmolab/lablink#230

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>