Merged
45 changes: 45 additions & 0 deletions .claude/rules/downstream.md
@@ -0,0 +1,45 @@
---
description: Architectural context and guidelines for the downstream cluster example.
globs: ["examples/downstream/**/*"]
---
# Downstream Example Rules

## Context Loading
When you are asked to troubleshoot, refactor, or add features to the downstream example, you MUST use your file-reading tools to read the following files into context before proceeding:
- `examples/downstream/main.tf` (to understand the current node configs and provider setups)
- `examples/downstream/modules/deploy/variables.tf` (to understand the inputs expected by the deployment module)
- `examples/downstream/modules/downstream_securitygroups/variables.tf` (to understand the inputs expected by the security group module)
- `examples/downstream/modules/downstream/variables.tf` (to understand the inputs expected by the downstream module)
- `examples/downstream/downstream/variables.tf` (to understand the inputs expected by the downstream module deployment)

When modifying files in the `examples/downstream` directory, strictly adhere to the following architectural guidelines, developer paradigms, and operational flows.

## Developer Paradigms
- **Local Modules (LMod)**: The subdirectories (`modules/`) are not independent; they act like function calls integral to the orchestration. **Never nest Local Modules inside one another.**
- **Highly Opinionated Selectors**: Use the `configs` local block in the root `main.tf` as a feature selector. Do not expose all Kubernetes parameters to the user; instead, rely on selecting a predefined architecture (like `prod-node-config` or `split-role-node-config`).
- **All Variables in Locals**: Map variables in `main.tf` immediately to a `locals` block. Resources must only reference these `locals` to isolate variable transformations.
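
A minimal sketch of the selector and locals-mapping paradigms above. The shape of each `configs` entry (pool names, `quantity`, `roles`) is a hypothetical illustration, not the layout actually used in the root `main.tf`; only the config names come from this document.

```hcl
variable "node_config_name" {
  type    = string
  default = "prod-node-config"
}

locals {
  # Map the variable to a local immediately; nothing below touches var.* again.
  node_config_name = var.node_config_name

  # Hypothetical predefined architectures, keyed by the selector name.
  configs = {
    "prod-node-config" = [
      { name = "all", quantity = 3, roles = ["controlplane", "etcd", "worker"] },
    ]
    "split-role-node-config" = [
      { name = "cp", quantity = 1, roles = ["controlplane"] },
      { name = "db", quantity = 1, roles = ["etcd"] },
      { name = "wk", quantity = 1, roles = ["worker"] },
    ]
  }

  # The user picks an architecture by name instead of tuning each parameter.
  node_info = local.configs[local.node_config_name]
}

output "node_info" {
  value = local.node_info
}
```

The design choice here is that adding a new supported architecture means adding one entry to `configs`, rather than adding new variables.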

## Execution Flow
When refactoring or adding features, ensure you respect and maintain the established execution flow:
1. **Upstream Deployment**: The root `main.tf` triggers the deployment of the parent module, provisioning an RKE2 cluster and installing Rancher.
2. **Rancher Authentication**: `rancher2_bootstrap` grabs the admin token and configures the default `rancher2` provider once the UI is available.
3. **Downstream Security**: `modules/downstream_securitygroups` maps network rules allowing the authenticated Rancher server to communicate with future downstream nodes.
4. **Downstream Networking**: `modules/downstream` establishes a private subnet and a NAT gateway for isolated node provisioning.
5. **Downstream Provisioning**: `modules/downstream` talks to the Rancher API to create Machine Configs (EC2 templates) and a new RKE2 Cluster definition.
6. **State Syncing**: `rancher2_cluster_sync` blocks Terraform execution until the newly provisioned downstream cluster achieves an active state.
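
Steps 2, 5, and 6 above could be sketched roughly as follows. This is an illustration under assumptions, not the example's actual code: the resource labels, the `rke2_version` value, and the omission of the `rke_config` machine pools are all hypothetical, and the required `terraform { required_providers { ... } }` block for `rancher/rancher2` is left out for brevity.

```hcl
variable "rancher_address" {
  type = string
}

variable "rancher_admin_password" {
  type      = string
  sensitive = true
}

locals {
  rancher_address        = var.rancher_address
  rancher_admin_password = var.rancher_admin_password
  rke2_version           = "v1.31.1+rke2r1" # hypothetical version string
}

# Step 2: a bootstrap-mode provider sets the admin password and yields a token.
provider "rancher2" {
  alias     = "bootstrap"
  api_url   = local.rancher_address
  bootstrap = true
}

resource "rancher2_bootstrap" "admin" {
  provider = rancher2.bootstrap
  password = local.rancher_admin_password
}

# The default provider is then configured with the bootstrap token.
provider "rancher2" {
  api_url   = local.rancher_address
  token_key = rancher2_bootstrap.admin.token
}

# Step 5: the downstream cluster definition (machine pools omitted here).
resource "rancher2_cluster_v2" "downstream" {
  depends_on = [
    rancher2_bootstrap.admin,
  ]
  name               = "downstream"
  kubernetes_version = local.rke2_version
}

# Step 6: block until the downstream cluster reports an active state.
resource "rancher2_cluster_sync" "downstream" {
  depends_on = [
    rancher2_cluster_v2.downstream,
  ]
  cluster_id = rancher2_cluster_v2.downstream.cluster_v1_id
}
```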

## Directory Structure Responsibilities

### `examples/downstream/` (Root Implementation Module)
- Keep logic focused on setting up the upstream cluster, provider authentication, and delegating to local modules.
- Outputs should expose the sensitive connection and state data generated by the upstream cluster (kubeconfig, tokens, etc.).

### `examples/downstream/modules/downstream_securitygroups/`
- Exclusively manage network boundaries and access rules (ingress/egress) for the downstream cluster.
- Ensure traffic is allowed between the downstream cluster, the load balancer, and the upstream Rancher cluster's security group.

### `examples/downstream/modules/downstream/`
- Manage downstream networking (subnets/Route Tables/NAT) to ensure downstream nodes are isolated from the public internet.
- Dynamically provision Machine Configs and map node roles (control plane, etcd, worker).
- Use `terraform_data` provisioners to execute credential patching via `addKeyToAmazonConfig.sh`.
- **Do not expose direct SSH outputs** since nodes reside in a private subnet.
57 changes: 57 additions & 0 deletions .claude/rules/terraform.md
@@ -0,0 +1,57 @@
---
paths:
- "**/*.tf"
---
# Terraform Rules

As an AI Agent operating in this repository, you MUST strictly adhere to the following Terraform coding standards. Do not deviate from these rules under any circumstances.

## 1. Syntax & Formatting Constraints

* **Attribute Order**: You MUST declare resource attributes in this exact top-down order to ensure consistency:
    1. `count`
    2. `depends_on`
    3. `for_each`
    4. `source`
    5. `version`
    6. `triggers`
    7. *All other attributes*
* **Explicit Dependencies**: You MUST always set the `depends_on` meta-argument explicitly on resources and modules, even if Terraform can infer the dependency graph natively.
* **Ternary Operations**: You MUST wrap all ternary operations in parentheses.
    * *Correct*: `attribute = (var.is_enabled ? true : false)`
    * *Incorrect*: `attribute = var.is_enabled ? true : false`
* **Embedded Scripts**: Avoid embedded scripts if possible (use `file()` or `templatefile()`). If embedding is required, you MUST use heredoc syntax (`<<-EOT`).
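
The formatting rules above could look like this in practice. This is a sketch under assumptions: it uses `null_resource` (from the `hashicorp/null` provider) only because it has a literal `triggers` argument, and the local names and resource labels are hypothetical.

```hcl
locals {
  create_marker = true # hypothetical feature flag
  is_enabled    = true # hypothetical input
}

resource "null_resource" "prerequisite" {}

resource "null_resource" "marker" {
  # 1. count comes first, with the ternary wrapped in parentheses.
  count = (local.create_marker ? 1 : 0)
  # 2. depends_on is stated explicitly.
  depends_on = [
    null_resource.prerequisite,
  ]
  # 6. triggers precede all remaining attributes and blocks.
  triggers = {
    enabled = (local.is_enabled ? "yes" : "no")
  }
  # Embedded script, kept short and written as an indented heredoc.
  provisioner "local-exec" {
    command = <<-EOT
      echo "marker enabled: ${self.triggers.enabled}"
    EOT
  }
}
```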

## 2. Variables & Locals (Strict Mapping)

* **Locals Mapping**: ALL variables (`var.*`) MUST be immediately mapped to a `locals {}` block in the root of the module (usually `main.tf`).
* **Resource Referencing**: Resources MUST ONLY reference `local.*`. You MUST NEVER reference `var.*` directly inside a `resource` or `module` block.
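
A minimal sketch of the mapping rule, assuming two string variables named `identifier` and `owner`; the derived `name_prefix` local and the `terraform_data` resource are hypothetical placeholders for a real resource.

```hcl
variable "identifier" {
  type = string
}

variable "owner" {
  type = string
}

locals {
  # Every variable is mapped exactly once, at the top of the module.
  identifier = var.identifier
  owner      = var.owner

  # Transformations build on locals, never on var.* directly.
  name_prefix = "${local.owner}-${local.identifier}"
}

resource "terraform_data" "example" {
  # Resources reference local.* only.
  input = {
    name  = local.name_prefix
    owner = local.owner
  }
}
```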

## 3. Count vs. Iteration

* **Count as a Feature Flag**: You MUST ONLY use `count` as a boolean feature flag to turn a resource on or off (`0` or `1`).
    * *Correct*: `count = (local.create_resource ? 1 : 0)`
* **Never Iterate with Count**: You MUST NEVER use `count` to iterate over lists and create multiple instances of a resource. This causes cascading dependency destructions when list orders change. Use `for_each` instead.
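
The distinction can be sketched as follows; the local names and the `terraform_data` stand-in resources are hypothetical. Because `for_each` keys instances by map key rather than list position, removing `worker` from the map would not disturb the `control_plane` instance.

```hcl
locals {
  create_audit_marker = true # hypothetical feature flag

  # Hypothetical pools, keyed by a stable name rather than a list index.
  node_pools = {
    control_plane = { quantity = 3 }
    worker        = { quantity = 2 }
  }
}

# count is used only to switch a single resource on or off.
resource "terraform_data" "audit_marker" {
  count = (local.create_audit_marker ? 1 : 0)
  input = "audit enabled"
}

# for_each creates one instance per map key.
resource "terraform_data" "pool" {
  for_each = local.node_pools
  input = {
    name     = each.key
    quantity = each.value.quantity
  }
}
```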

## 4. Module Paradigms & Hierarchies

Understand the distinction between XMod (External), LMod (Local), and IMod (Implementation) modules.

* **No Nesting Local Modules**: You MUST NEVER nest an LMod (Local Module) inside another LMod. Treat LMods like function calls orchestrated by the Implementation Module (IMod).
* **Module Tiers (Max 3 Levels)**:
    * **Core Modules**: Call only resources. NEVER call other modules.
    * **Primary Modules**: Call only Core Modules (exceptions allowed for `local_file`, `random`, or `terraform_data`). NEVER call raw API resources.
    * **Secondary Modules**: Call only Primary Modules. These represent large systems.
* **Highly Opinionated Selectors**: Favor providing pre-defined configurations in `locals` (e.g., `prod-node-config`) rather than exposing raw, granular resource parameters via variables.

## 5. Provisioners & SSH Access

* **Script Paths**: When using `remote-exec` provisioners or `connection` blocks, you MUST ALWAYS explicitly set the `script_path` attribute so that SELinux does not block script execution in `/tmp`.
* **SSH Agent Only**: Modules MUST NOT generate or accept private SSH keys or passwords as variables unless strictly necessary for a specific cloud-init sequence. Assume the user relies on a local SSH agent.
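
A sketch of both provisioner rules together. The host address, user, and script location are hypothetical; `script_path` and `agent` are standard arguments of Terraform's `connection` block.

```hcl
locals {
  ssh_host = "203.0.113.10" # hypothetical address (documentation range)
  ssh_user = "ec2-user"     # hypothetical user
}

resource "terraform_data" "remote_check" {
  connection {
    type  = "ssh"
    host  = local.ssh_host
    user  = local.ssh_user
    agent = true # authenticate via the local SSH agent, no key material in variables
    # Place the generated script in the user's home directory instead of /tmp.
    script_path = "/home/${local.ssh_user}/terraform_remote_check.sh"
  }

  provisioner "remote-exec" {
    inline = [
      "uname -a",
    ]
  }
}
```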

## 6. Testing Terminology

When writing tests, adhere to these conceptual boundaries:
* **Unit Test**: Tests a single Local Module (LMod) in isolation.
* **Integration Test**: Tests the interaction between two or more LMods.
* **E2E Test**: Tests the entire Implementation Module (IMod) with real provider interactions.
4 changes: 4 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,4 @@
You are GitHub Copilot, acting as the PR Code Reviewer for the terraform-rancher2-aws repository.
Before reviewing any Pull Requests, generating code, or answering chat queries, you MUST read `AGENTS.md` located at the root of the workspace.
AGENTS.md contains your specific persona instructions (to enforce rules without bikeshedding) and points to our strict Terraform paradigms.
Do NOT proceed with your review until you have read and adopted the Copilot persona from `AGENTS.md`.
40 changes: 40 additions & 0 deletions AGENTS.md
@@ -0,0 +1,40 @@
# AI Agent Instructions for terraform-rancher2-aws

Hello! As an AI assistant working in this repository, please adhere to the following guidelines to ensure your code suggestions align with our project standards.

## 1. Agent Roles & Personas
Depending on which AI tool is reading this, your expected behavior differs based on the developer's workflow:
- **GitHub Copilot**: You act primarily as a **PR Code Reviewer**. Your job is to rigorously review code changes against `terraform.md` paradigms and catch structural/formatting issues. Do NOT suggest unnecessary changes or engage in pedantic "bikeshedding"; focus only on strictly enforced rules.
- **Gemini Code Assist**: You act as a **Conversational Coding Partner**. Your job is to help the developer think through problems, explain complex Terraform architectures, and guide the implementation interactively.
- **Claude**: You act as an **Autonomous Agent**. Your job is to quickly and efficiently execute tasks, fix PR review feedback generated by Copilot, and handle necessary coding chores with minimal conversational overhead.

## 2. Core Documentation
Before making architectural or structural changes, you must review:
- `.claude/rules/terraform.md` - This is our absolute source of truth for Terraform paradigms, attribute ordering, and module structuring.
- `README.md` - For general deployment and testing requirements.

## 3. AI Workspace Structure (.claude)
We have adopted the Claude CLI `.claude` directory structure as the universal standard for organizing AI context, regardless of which agent you are. Whenever you need to read or create AI-specific instructions, adhere to this layout:
- **`CLAUDE.md` / `AGENTS.md`**: Project-level core instructions (this file).
- **`.claude/rules/*.md`**: Topic-scoped instructions (e.g., Terraform guidelines, Go testing rules).
- **`.claude/settings.json`**: Permissions, hooks, environment variables.
- **`.claude/skills/<name>/SKILL.md`**: Reusable prompts and context loading skills.
- **`.claude/commands/*.md`**: Single-file prompts.
- **`.claude/agents/*.md`**: Subagent definitions.
- **`.claude/output-styles/*.md`**: Custom response formatting.

## 4. Key Coding Paradigms (Strictly Enforced)
- **Attribute Order:** Resources must follow this exact order: `count`, `depends_on`, `for_each`, `source`, `version`, `triggers`, then everything else.
- **Variables in Locals:** All variables must be mapped to a `locals` block immediately. Resources must *only* reference `locals`, never `var.*` directly.
- **Count as a Feature Flag:** Do not use `count` as an iterator to provision multiple resources. Use it strictly as a boolean feature flag (`0` or `1`).
- **Module Tiers:** Understand the difference between Core, Primary, Secondary, and Implementation modules. **Never nest local modules!**
- **Explicit Dependencies:** Always set `depends_on` explicitly, even if Terraform can infer the dependency graph on its own.

## 5. Sub-Project Contexts
If you are asked to work in specific directories or examples, check the `.claude/skills/` directory for context loaders, or the local implementation documentation:
- Downstream deployments: Read `.claude/skills/load-downstream-context/SKILL.md`
- Production deployments: Read `examples/prod/README.md` (to be migrated to `.claude/skills/`)

## 6. Testing
When writing tests or evaluating infrastructure, remember that state files are kept local for example implementations.
Always advise developers to use the `cleanup.sh` and `run_tests.sh` scripts to handle leftover AWS resources and ensure an idempotent testing environment.
2 changes: 2 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,2 @@
You are an AI assistant working on the terraform-rancher2-aws project.
Before making any code suggestions or analyzing the repo, you MUST read `AGENTS.md` in the root of this repository for your complete instructions.
3 changes: 3 additions & 0 deletions GEMINI.md
@@ -0,0 +1,3 @@
# Gemini System Prompt

Before executing any commands, analyzing the repository, or generating code, you MUST read the `AGENTS.md` file located in the root of this repository for your complete instructions.
2 changes: 1 addition & 1 deletion GNUmakefile
@@ -5,7 +5,7 @@ fmt:
 	cd test/tests; gofmt -s -w -e .; cd ../..

 lint:
-	tflint --recursive; \
+	tflint --recursive --fix; \
 	cd test/tests; golangci-lint run; cd ../..; \
 	actionlint

2 changes: 1 addition & 1 deletion README.md
@@ -71,7 +71,7 @@ If you are using remote state files and would like to be able to pass a backend

 #### Paradigms and Expectations

-Please make sure to read [terraform.md](./terraform.md) to understand the paradigms and expectations that this module has for development.
+Please make sure to read .claude/rules/terraform.md to understand the paradigms and expectations that this module has for development.

 #### Environment

89 changes: 89 additions & 0 deletions examples/downstream/downstream/main.tf.tftpl
@@ -0,0 +1,89 @@
provider "aws" {
  region = local.aws_region
  default_tags {
    tags = {
      Id = local.identifier
      Owner = local.owner
    }
  }
}

locals {
  aws_region = base64decode(var.aws_region)
  identifier = base64decode(var.identifier)
  owner = base64decode(var.owner)
  rancher_address = base64decode(var.rancher_address)
  rancher_admin_password = base64decode(var.rancher_admin_password)
  rancher_admin_token = base64decode(var.rancher_admin_token)
  tls_certificate_chain = <<-EOT
    ${base64decode(var.tls_certificate_chain)}
  EOT
  node_config_name = base64decode(var.node_config_name)
  aws_access_key_id = base64decode(var.aws_access_key_id)
  aws_secret_access_key = base64decode(var.aws_secret_access_key)
  aws_session_token = base64decode(var.aws_session_token)
  aws_region_letter = base64decode(var.aws_region_letter)
  downstream_security_group_name = base64decode(var.downstream_security_group_name)
  vpc_id = base64decode(var.vpc_id)
  load_balancer_security_group_id = base64decode(var.load_balancer_security_group_id)
  subnet_id = base64decode(var.subnet_id)
  node_info = jsondecode(base64decode(var.node_info))
  runner_ip = base64decode(var.runner_ip)
  ssh_access_key = base64decode(var.ssh_access_key)
  ssh_access_user = base64decode(var.ssh_access_user)
  rke2_version = base64decode(var.rke2_version)
}

data "external" "login" {
  program = ["bash", "./modules/downstream/login.sh"]
  query = {
    api_url = local.rancher_address
    admin_password = local.rancher_admin_password
    admin_token = local.rancher_admin_token
  }
}

provider "rancher2" {
  api_url = local.rancher_address
  token_key = data.external.login.result.admin_token
  ca_certs = local.tls_certificate_chain
  timeout = "300s"
}

data "rancher2_cluster" "local" {
  depends_on = [
    data.external.login
  ]
  name = "local"
}

module "downstream" {
  depends_on = [
    data.rancher2_cluster.local
  ]
  #
  source = "./modules/downstream"

  name = local.node_config_name
  identifier = local.identifier
  owner = local.owner

  aws_access_key_id = local.aws_access_key_id
  aws_secret_access_key = local.aws_secret_access_key
  aws_session_token = local.aws_session_token
  aws_region = local.aws_region
  aws_region_letter = local.aws_region_letter
  downstream_security_group_name = local.downstream_security_group_name

  vpc_id = local.vpc_id
  load_balancer_security_group_id = local.load_balancer_security_group_id
  subnet_id = local.subnet_id

  node_info = local.node_info
  direct_node_access = {
    runner_ip = local.runner_ip
    ssh_access_key = local.ssh_access_key
    ssh_access_user = local.ssh_access_user
  }
  rke2_version = local.rke2_version
}
4 changes: 4 additions & 0 deletions examples/downstream/downstream/outputs.tf
@@ -0,0 +1,4 @@
output "cluster_data" {
  value = jsonencode(data.rancher2_cluster.local)
  sensitive = true
}
90 changes: 90 additions & 0 deletions examples/downstream/downstream/variables.tf
@@ -0,0 +1,90 @@
# tflint-ignore: terraform_unused_declarations
variable "aws_region" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "identifier" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "owner" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "rancher_address" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "rancher_admin_password" {
  type = string
  sensitive = true
}
# tflint-ignore: terraform_unused_declarations
variable "rancher_admin_token" {
  type = string
  sensitive = true
}
# tflint-ignore: terraform_unused_declarations
variable "tls_certificate_chain" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "node_config_name" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "aws_access_key_id" {
  type = string
  sensitive = true
}
# tflint-ignore: terraform_unused_declarations
variable "aws_secret_access_key" {
  type = string
  sensitive = true
}
# tflint-ignore: terraform_unused_declarations
variable "aws_session_token" {
  type = string
  sensitive = true
}
# tflint-ignore: terraform_unused_declarations
variable "aws_region_letter" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "downstream_security_group_name" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "vpc_id" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "load_balancer_security_group_id" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "subnet_id" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "node_info" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "runner_ip" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "ssh_access_key" {
  type = string
  sensitive = true
}
# tflint-ignore: terraform_unused_declarations
variable "ssh_access_user" {
  type = string
}
# tflint-ignore: terraform_unused_declarations
variable "rke2_version" {
  type = string
}
@@ -7,11 +7,11 @@ terraform {
     }
     rancher2 = {
       source = "rancher/rancher2"
-      version = ">= 5.0.0"
+      version = ">= 14.0.0"
     }
-    time = {
-      source = "hashicorp/time"
-      version = ">= 0.13.1"
+    external = {
+      source = "hashicorp/external"
+      version = ">= 2.4.0"
     }
   }
 }