A production-ready k3s Terraform module for the OCI Always Free tier.

- HA control plane: 3 control-plane nodes with embedded etcd; survives 1 node failure
- Full stack always deployed: cert-manager, Longhorn, ArgoCD + Image Updater, and kured are always installed; they keep the cluster active and prevent idle reclamation
- Separate public/private subnets: k3s nodes have no public IP; only LBs and the optional bastion are internet-facing
- Envoy Gateway ingress (Gateway API): DaemonSet with `system-cluster-critical` priority and `PodDisruptionBudget maxUnavailable: 1`; standard `HTTPRoute`/`Gateway` resources; real client IP preservation via NLB transparent mode
- Automatic security updates: `unattended-upgrades` + kured drain-reboot-uncordon cycle; zero manual intervention
- k3s version pinned at plan time: resolved from the GitHub API during `terraform plan`, not at boot time
- Cluster-scoped IAM: dynamic group and policy scoped to nodes tagged with the cluster name, not every instance in the compartment
- Idempotent cloud-init: all `kubectl` operations use `apply`; re-provisioning is safe
- OCI Vault (`enable_vault = true`): cluster secrets in a free software-protected OCI Vault; fetched at boot via instance_principal, not embedded in user-data
- Boot volume backups (`enable_backup = true`): weekly full backups, 1-week retention, within the 5-backup Always Free limit
- Object Storage state bucket (`enable_object_storage_state = true`): versioned OCI Object Storage for Terraform state; S3-compatible endpoint in the `terraform_state_backend` output
- OCI Notifications + Alertmanager (`enable_notifications = false`): opt-in OCI Notifications topic wired to Alertmanager as a webhook receiver
- MySQL HeatWave (`enable_mysql = false`): opt-in Always Free MySQL DB in the private subnet; credentials pre-created as a Kubernetes Secret
- External DNS (`enable_external_dns = false`): automatic Cloudflare DNS record management from HTTPRoute hostnames
- External Secrets (`enable_external_secrets = false`): sync OCI Vault secrets into Kubernetes Secrets via instance_principal; no credentials to rotate
```mermaid
graph TD
    Internet(["🌐 Internet"])
    subgraph public["Public Subnet · 10.0.0.0/24"]
        NLB["🔀 Public NLB (Always Free)\nHTTP :80 · HTTPS :443\noptional: kubeapi :6443"]
    end
    subgraph private["Private Subnet · 10.0.1.0/24 · no public IPs"]
        ILB["⚖️ Internal Flex LB (Always Free)\nkubeapi VIP :6443"]
        subgraph cp["Control Plane × 3 · A1.Flex (1 OCPU / 6 GB each)\nk3s-server · etcd · Envoy Gateway · Longhorn · user workloads"]
            CP0["control-plane-0"]
            CP1["control-plane-1"]
            CP2["control-plane-2"]
        end
        W["worker-0 · A1.Flex (1 OCPU / 6 GB)\nk3s-agent · Envoy Gateway · Longhorn · user workloads"]
    end
    NAT["🌍 NAT Gateway (Always Free)"]
    Bastion["🔐 OCI Bastion Service\noptional · Always Free"]
    Internet -->|HTTP / HTTPS| NLB
    NLB -->|"Envoy Gateway NodePorts :30080 / :30443"| CP0 & CP1 & CP2 & W
    NLB -. "kubeapi :6443\nexpose_kubeapi=true" .-> ILB
    ILB --> CP0 & CP1 & CP2
    W -->|joins via kubeapi| ILB
    private -->|outbound| NAT --> Internet
    Bastion -. "SSH tunnel\nenable_bastion=true" .-> private
```
All four A1.Flex instances live in a private subnet with no public IPs. Internet traffic enters exclusively through two Always Free load balancers.
k3s naming note: k3s calls control-plane nodes "servers" (`k3s server`) and workers "agents" (`k3s agent`). Terraform resources follow k3s conventions (server/worker); in standard Kubernetes terminology these map to control-plane and worker nodes.
Public NLB forwards HTTP/HTTPS directly to Envoy Gateway NodePorts on all four nodes. is_preserve_source = true preserves real client IPs at the hypervisor level. The NLB optionally exposes the Kubernetes API on port 6443, restricted to your IP.
Internal Flex LB provides a stable private VIP across all three control-plane nodes. Workers join via this VIP so the cluster survives any single control-plane loss.
Longhorn runs on all four nodes with defaultReplicaCount=3 — each PVC is replicated across three nodes. Control-plane NoSchedule taints are removed after cluster init so user workloads schedule across all four identically-sized nodes.
HA ceiling: etcd runs on the 3 control-plane nodes (quorum = 2). The cluster tolerates 1 control-plane failure — the hard limit of a 4-node Always Free topology.
| Resource | Free allowance | This module |
|---|---|---|
| A1.Flex compute | 4 OCPUs / 24 GB / 4 instances | 3 servers + 1 worker = 4 OCPUs / 24 GB |
| Block storage | 200 GB | 4 × 50 GB = 200 GB |
| Network Load Balancer | 1 NLB | 1 (public, HTTP/HTTPS) |
| Flexible Load Balancer | 2 × 10 Mbps | 1 (private, kubeapi) |
| E2.1.Micro instances | 2 | 0 (bastion uses OCI Bastion Service — managed, no VM) |
| NAT Gateway | 1 per VCN | 1 (outbound-only for private nodes) |
| Object Storage | 20 GB | 2 versioned buckets — Terraform state + Longhorn PVC backups (enable_object_storage_state, enable_longhorn_backup) |
| Vault (shared) | Software keys + 150 secrets | 3 secrets — k3s_token, longhorn_ui_password, grafana_admin_password (enable_vault = true) |
| Volume backups | 5 total | 4 — one per node, weekly, 1-week retention (enable_backup = true) |
| Notifications | 1M HTTPS + 3K email/month | 1 topic wired to Alertmanager (enable_notifications = false, opt-in) |
| MySQL HeatWave | 1 standalone DB, 50 GB | 1 DB system in private subnet (enable_mysql = false, opt-in) |
⚠️ Idle reclamation: OCI reclaims Always Free instances where CPU, network, and memory stay below 20% for 7 consecutive days. The full stack (Longhorn, ArgoCD, cert-manager, kured) generates enough background activity to keep the cluster alive.
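The compute and block storage rows of the allowance table are simple multiplications, and it can help to check a planned topology against them before running a plan. The following is a minimal sketch, not part of the module: the function name `fits_always_free` is hypothetical, and the limits are the ones quoted in the table.

```bash
# Always Free limits as quoted in the allowance table.
FREE_OCPUS=4
FREE_RAM_GB=24
FREE_BLOCK_GB=200

# Hypothetical helper: prints "ok" when a topology fits, "over" otherwise.
# Args: node count, OCPUs per node, RAM GB per node, boot volume GB per node.
fits_always_free() {
  nodes=$1; ocpus=$2; ram=$3; boot=$4
  if [ $(( nodes * ocpus )) -le "$FREE_OCPUS" ] &&
     [ $(( nodes * ram )) -le "$FREE_RAM_GB" ] &&
     [ $(( nodes * boot )) -le "$FREE_BLOCK_GB" ]; then
    echo ok
  else
    echo over
  fi
}

fits_always_free 4 1 6 50   # module defaults: 4 OCPUs, 24 GB RAM, 200 GB block
fits_always_free 5 1 6 50   # a fifth node exceeds every limit
```

With the module defaults the three products land exactly on the limits, which is why there is no headroom for larger boot volumes or an extra node.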
With a hard cap of 4 A1.Flex instances, the binding constraint is etcd quorum: HA etcd needs at minimum 3 nodes (quorum = ⌊n/2⌋+1 = 2). The result is a 3-server HA cluster plus 1 standalone worker that saturates every Always Free resource class with nothing left unused and nothing that costs money.
| Topology | etcd HA | Nodes for workloads | Effective RAM for workloads† | Assessment |
|---|---|---|---|---|
| 3 CP + 1 worker (this module) | ✅ 1-node fault | 4 (taints removed) | ~15 GB | Optimal — HA etcd, all 4 nodes contribute to workloads |
| 1 CP + 3 workers | ❌ CP is total SPOF | 4 | ~18 GB | More capacity but control-plane loss = complete cluster death |
| 2 CP + 2 workers | ❌ Invalid | n/a | n/a | 2-node etcd needs both members for quorum (quorum = 2), so it tolerates 0 failures; worse than 1 node |
| 4 CP + 0 workers | ✅ 1-node fault | 4 (taints removed) | ~12 GB | Fewer resources for workloads; more etcd overhead |
†etcd + kubeapi consume ~300–500 MB RAM and ~100–200m CPU per control-plane node.
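The etcd column of the table follows from the quorum formula ⌊n/2⌋+1. As a rough illustration, the arithmetic can be reproduced in a few lines of shell; the function names are hypothetical and not part of the module.

```bash
# Quorum size for an etcd cluster of n members: floor(n/2) + 1.
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of member failures the cluster can absorb while keeping quorum.
etcd_fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4; do
  echo "members=$n quorum=$(etcd_quorum "$n") tolerates=$(etcd_fault_tolerance "$n")"
done
# members=1 quorum=1 tolerates=0
# members=2 quorum=2 tolerates=0
# members=3 quorum=2 tolerates=1
# members=4 quorum=3 tolerates=1
```

Going from 3 to 4 members raises the quorum size without raising the fault tolerance, which is why the fourth node is better spent as a worker.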
4 × 1 OCPU even split prevents any single etcd node from becoming a hot-spot, creates 4 equal fault domains, and allows workloads to spread evenly.
Always Free also includes 2 AMD E2.1.Micro instances. They are not worth adding:
- Storage budget exhausted — 4 × 50 GB boot volumes already consume the full 200 GB Always Free block storage allowance; two additional instances would require at least 100 GB more
- 1 GB RAM — k3s agent + Longhorn DaemonSet alone consume ~700–800 MB, leaving ~200 MB for user workloads
- 1/8 OCPU — negligible compute; adds operational complexity for near-zero workload benefit
| Alternative | Why it was rejected |
|---|---|
| nginx stream proxy in front of Envoy Gateway | Extra latency and complexity; NLB already preserves source IPs directly |
| OCI Bastion VM (E2.1.Micro) | OCI Bastion Service provides managed SSH proxying for free with no VM, no OS to patch, and no boot volume consuming storage budget |
| Boot volumes < 50 GB | OCI hard minimum is 50 GB per shape; 4 × 50 GB = 200 GB exactly exhausts the free block storage allowance |
| Additional NLB for kubeapi | Only 1 NLB is Always Free; the existing NLB conditionally exposes port 6443 via expose_kubeapi = true |
| Component | Tolerance | What happens on failure |
|---|---|---|
| Any single node (any role) | ✅ 1 node | Workloads reschedule to remaining 3 nodes; Longhorn (3 replicas) keeps storage up; Envoy Gateway DaemonSet keeps ingress up on remaining nodes |
| 2 nodes simultaneously | ⚠️ Partial | Workloads and ingress continue on 2 surviving nodes; if both failed nodes are control-planes, etcd quorum is lost and the API server stops accepting writes (running pods keep running, no new scheduling) |
| etcd / control-plane quorum | ❌ 2 control-planes | Cluster becomes read-only; recovery requires etcd snapshot restore |
| Worker node | ✅ Full | With taints removed, workloads reschedule to control-planes; no SPOF |
| HTTP/HTTPS ingress | ✅ 3 node losses | Envoy Gateway DaemonSet; NLB health-checks remove unhealthy backends automatically |
| Kubernetes API | ✅ 1 control-plane | ILB routes to remaining 2 control-planes |
| PVC data (Longhorn) | ✅ 1 node | 3 replicas across 4 nodes; 1 replica lost, 2 remain serving |
| cert-manager | ⚠️ Reschedules | Pod reschedules within minutes; TLS serving unaffected (certs live in Secrets); only new issuance/renewal is paused |
| ArgoCD | ⚠️ Reschedules | GitOps sync pauses until rescheduled; running workloads unaffected |
| MySQL (if enabled) | ❌ None | Always Free tier = single OCI-managed instance; no HA failover |
Each A1.Flex instance has identical resources (1 OCPU / 6 GB RAM). The k3s role (server vs agent) affects which system processes run, not how much resource is available for workloads.
| What | control-plane-0/1/2 | worker-0 | Scheduling mechanism |
|---|---|---|---|
| etcd | ✅ | ❌ | k3s built-in; servers only |
| Kubernetes API server | ✅ | ❌ | k3s built-in; servers only |
| Envoy Gateway (ingress) | ✅ | ✅ | DaemonSet — 1 pod per node |
| Longhorn (storage daemon) | ✅ | ✅ | DaemonSet — 1 pod per node |
| cert-manager | ✅ | ✅ | Deployment — schedules on any node |
| ArgoCD | ✅ | ✅ | Deployment — schedules on any node |
| kube-prometheus-stack | ✅ | ✅ | Deployment/StatefulSet — any node |
| kured | ✅ | ✅ | DaemonSet — 1 pod per node |
| User workloads | ✅ | ✅ | No restrictions — schedules on all 4 nodes |
Why control-planes run user workloads: k3s ≥ 1.24 automatically taints control-plane nodes with `NoSchedule`. This setup removes those taints at cluster init so all 4 identically-sized nodes are available. With only one worker, keeping the taint would make it a single point of failure for all user workloads.

Recommendation: use `replicas ≥ 2` with `topologySpreadConstraints` (see gitops/README.md) to spread pods across nodes and survive any single-node failure.
```bash
# 1. Clone and enter the example directory
git clone https://github.com/mbologna/k3s-oci.git
cd k3s-oci/example

# 2. Copy and edit the variables file
cp terraform.tfvars.example terraform.tfvars
$EDITOR terraform.tfvars

# 3. Init and apply (terraform or tofu both work)
terraform init && terraform apply
# tofu init && tofu apply
```

After `terraform apply`, run:

```bash
terraform output kubeconfig_hint
```

This prints the exact steps for your configuration. If `enable_bastion = true` (recommended), the fastest path is the included helper script:

```bash
cd example && ./get-kubeconfig.sh
export KUBECONFIG=~/.kube/k3s-oci.yaml
kubectl get nodes
```

`enable_bastion` defaults to `true`. It uses OCI Bastion Service, a managed SSH proxy with no VM, no boot volume, and no cost. Without it, nodes are only reachable via OCI serial console (`terraform output kubeconfig_hint` explains all options).
unattended-upgrades applies Ubuntu security patches daily and sets /var/run/reboot-required when a kernel update needs a reboot.
kured watches every node for /var/run/reboot-required and, when found:
- Acquires a cluster-wide lock (only one node reboots at a time)
- Cordons + drains the node
- Reboots
- Waits for the node to return and uncordons it
This keeps the cluster fully patched with zero manual intervention and no concurrent downtime.
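The hand-off between the two tools is just the sentinel file. A minimal sketch of that check follows; it is not kured's actual implementation, which also handles locking and draining, and the function name `reboot_required` is hypothetical.

```bash
# Succeeds when the reboot sentinel exists.
# Defaults to the path named above; pass another path to try it safely.
reboot_required() {
  [ -f "${1:-/var/run/reboot-required}" ]
}

if reboot_required; then
  echo "reboot pending: kured will drain and reboot this node"
else
  echo "no reboot pending"
fi
```

Running this on a node after a kernel update is a quick way to see whether a kured-driven reboot is still outstanding.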
The gitops/ directory contains ArgoCD Application manifests managed with the App of Apps pattern.
After the cluster is running, bootstrap it:

```bash
kubectl apply -n argocd -f gitops/apps/app-of-apps.yaml
```

ArgoCD will then continuously reconcile every manifest under `gitops/apps/`.
This repo is designed to be forked. To add your own apps on top of the built-in stack:

- Fork this repo on GitHub.
- Update all `repoURL` references to point to your fork:

  ```bash
  bash gitops/update-repo-url.sh https://github.com/your-org/your-fork.git
  git add gitops/apps/ && git commit -m "chore: update gitops repoURL"
  git push
  ```

- Add your ArgoCD `Application` manifests to `gitops/apps/`; ArgoCD syncs them automatically. Each app can point at any Helm chart registry or any Git repository.

Deploying for the first time? Also set `gitops_repo_url` in `terraform.tfvars` before running `tofu apply`, so cloud-init writes the correct fork URL at bootstrap:

```hcl
gitops_repo_url = "https://github.com/your-org/your-fork.git"
```

Already have a running cluster? Patch the App of Apps directly:

```bash
argocd app set app-of-apps --repo https://github.com/your-org/your-fork.git
```

Private repos: configure ArgoCD repository credentials (`argocd repo add`) before adding manifests that pull from private repositories.
OCI provides two load balancer products with very different capabilities:
| | OCI Network Load Balancer (NLB) | OCI Flexible Load Balancer |
|---|---|---|
| OSI layer | L4: TCP passthrough | L7: HTTP/HTTPS aware |
| TLS termination | ❌ Not possible | ✅ Yes |
| Always Free | 1 NLB | 2 × 10 Mbps |
| Used here | `nlb.tf`: public internet traffic | `lb.tf`: internal kubeapi HA VIP |
The public-facing load balancer is the NLB. It forwards raw TCP streams with protocol = "TCP" — it has no knowledge of TLS, HTTP headers, or certificates. TLS must be terminated by something behind it.
The Flexible LB could terminate TLS, but the one free allocation is already consumed by the kubeapi HA load balancer. Even if it were available, using OCI to manage certificates would break the automatic cert-manager + Let's Encrypt renewal cycle.
The current flow is: Internet → NLB (TCP passthrough, preserves client IPs) → Envoy Gateway NodePort → TLS terminate → route to app pod.
No domain needed. Requests to the NLB IP are served directly.
```yaml
# hello-web.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
  namespace: hello-web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: hello-web
      containers:
        - name: hello-web
          image: httpd:alpine
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hello-web
  namespace: hello-web
spec:
  selector:
    app: hello-web
  ports:
    - port: 80
      targetPort: 80
---
# HTTPRoute: no hostname filter, so it matches all requests on the http listener
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hello-web
  namespace: hello-web
spec:
  parentRefs:
    - name: eg
      namespace: envoy-gateway-system
      sectionName: http
  rules:
    - backendRefs:
        - name: hello-web
          port: 80
```

```bash
kubectl create namespace hello-web
kubectl apply -f hello-web.yaml
NLB_IP=$(cd example && tofu output -raw nlb_ip)
curl http://$NLB_IP/
```

sslip.io is a public DNS service that resolves `<anything>.<ip>.sslip.io` directly to `<ip>`. Combined with cert-manager + Let's Encrypt HTTP-01, this gives a trusted TLS certificate with zero infrastructure cost.

Replace `<NLB_IP>` with the value of `tofu output -raw nlb_ip`.
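To make the naming scheme concrete, here is a small sketch that builds the sslip.io hostname for a given NLB address. The function name `sslip_host` is hypothetical, and `203.0.113.10` is a placeholder from the documentation address range, not a real NLB IP.

```bash
# Build an sslip.io hostname from a label and an IPv4 address.
sslip_host() {
  printf '%s.%s.sslip.io\n' "$1" "$2"
}

NLB_IP="203.0.113.10"   # placeholder; use the value of `tofu output -raw nlb_ip`
HOST=$(sslip_host hello-web "$NLB_IP")
echo "$HOST"
# hello-web.203.0.113.10.sslip.io

# One way to substitute the placeholder before applying a manifest:
#   sed "s/<NLB_IP>/$NLB_IP/g" hello-web-tls.yaml | kubectl apply -f -
```

Because the IP is embedded in the hostname, the certificate has to be reissued if the NLB address ever changes.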
```yaml
# hello-web-tls.yaml
---
# 1. Certificate: cert-manager issues this via HTTP-01 challenge through Envoy Gateway
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: hello-web-tls
  namespace: envoy-gateway-system  # must be in the same namespace as the Gateway
spec:
  secretName: hello-web-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - hello-web.<NLB_IP>.sslip.io
---
# 2. HTTPS listener on the Gateway (add this to gitops/gateway/gateway.yaml for GitOps management)
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: eg
  namespace: envoy-gateway-system
spec:
  gatewayClassName: eg
  listeners:
    - name: http
      port: 80
      protocol: HTTP
      allowedRoutes:
        namespaces:
          from: All
    - name: https-hello-web
      port: 443
      protocol: HTTPS
      hostname: hello-web.<NLB_IP>.sslip.io
      tls:
        mode: Terminate
        certificateRefs:
          - name: hello-web-tls
      allowedRoutes:
        namespaces:
          from: All
---
# 3. HTTP→HTTPS redirect (add hostname to gitops/gateway/redirect.yaml)
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http-to-https-redirect
  namespace: envoy-gateway-system
spec:
  parentRefs:
    - name: eg
      sectionName: http
  hostnames:
    - hello-web.<NLB_IP>.sslip.io
  rules:
    - filters:
        - type: RequestRedirect
          requestRedirect:
            scheme: https
            statusCode: 301
---
# 4. HTTPRoute for the app: attaches to the HTTPS listener (the redirect above covers HTTP)
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: hello-web
  namespace: hello-web
spec:
  parentRefs:
    - name: eg
      namespace: envoy-gateway-system
      sectionName: https-hello-web
  hostnames:
    - hello-web.<NLB_IP>.sslip.io
  rules:
    - backendRefs:
        - name: hello-web
          port: 80
```

```bash
# Wait for certificate issuance (typically 1–2 minutes)
kubectl wait --for=condition=Ready certificate/hello-web-tls -n envoy-gateway-system --timeout=5m
curl https://hello-web.<NLB_IP>.sslip.io/
```

With a real domain: set `enable_external_dns = true` and annotate the HTTPRoute with `external-dns.alpha.kubernetes.io/hostname: myapp.example.com`. External DNS will create the A record automatically, then cert-manager issues the certificate. Alternatively, set `enable_dns01_challenge = true` to use DNS-01 (supports wildcard certs and does not require inbound port 80).
Use `topologySpreadConstraints` to ensure pod replicas land on different nodes:

```yaml
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: <your-app>
```

With 4 identically-sized nodes, 2 replicas survive any single node failure. Envoy Gateway runs as a DaemonSet with `maxUnavailable: 1`, so ingress remains up on the other 3 nodes throughout any single-node drain or failure.
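The "2 replicas survive" claim can be checked with a little arithmetic. The sketch below assumes the scheduler achieves an even spread across all nodes (which `maxSkew: 1` with `DoNotSchedule` is meant to enforce), so the busiest node holds ceil(replicas / nodes) pods; the function name `surviving_replicas` is hypothetical.

```bash
# Worst-case replicas left after losing one node, assuming an even spread:
# replicas - ceil(replicas / nodes).
surviving_replicas() {
  replicas=$1; nodes=$2
  echo $(( replicas - (replicas + nodes - 1) / nodes ))
}

surviving_replicas 2 4   # prints 1: one replica keeps serving
surviving_replicas 1 4   # prints 0: a single replica has no survivor
```

This is why the recommendation is at least 2 replicas: with 1 replica, a node loss means downtime until the pod is rescheduled.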
Renovate tracks Terraform providers, k3s, all stack component versions (via # renovate: inline comments in vars.tf and gitops/apps/*.yaml), and GitHub Actions. Enable with the Renovate GitHub App or the self-hosted workflow at .github/workflows/renovate.yml (requires a RENOVATE_TOKEN secret with repo scope).
With `enable_object_storage_state = true` (the default), a versioned OCI Object Storage bucket is created automatically. After `terraform apply`, get the ready-to-use backend config:

```bash
terraform output -json terraform_state_backend
```

Use it in your `terraform { backend "s3" {} }` block (requires an OCI Customer Secret Key for S3 credentials):

```hcl
terraform {
  backend "s3" {
    bucket                      = "<cluster_name>-terraform-state"
    key                         = "terraform.tfstate"
    region                      = "<your-region>" # e.g. eu-frankfurt-1
    endpoint                    = "https://<namespace>.compat.objectstorage.<region>.oraclecloud.com"
    skip_region_validation      = true
    skip_credentials_validation = true
    skip_metadata_api_check     = true
    force_path_style            = true
  }
}
```

Generate OCI Customer Secret Keys under Identity → Users → your user → Customer Secret Keys. The bucket name and namespace endpoint are in `terraform output terraform_state_backend`.
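As a rough sketch of how the endpoint is assembled, the helper below composes the S3-compatible URL from an Object Storage namespace and a region. The function name `oci_s3_endpoint` and the namespace `mynamespace` are hypothetical; the authoritative values are the ones in the `terraform_state_backend` output.

```bash
# Compose the S3-compatible endpoint from namespace and region.
oci_s3_endpoint() {
  printf 'https://%s.compat.objectstorage.%s.oraclecloud.com\n' "$1" "$2"
}

oci_s3_endpoint "mynamespace" "eu-frankfurt-1"
# https://mynamespace.compat.objectstorage.eu-frankfurt-1.oraclecloud.com

# The S3 backend reads credentials from the standard AWS variables;
# fill them with the Customer Secret Key pair before `terraform init`:
#   export AWS_ACCESS_KEY_ID="<access-key>"
#   export AWS_SECRET_ACCESS_KEY="<secret-key>"
```

Keeping the key pair in environment variables rather than in the backend block avoids committing credentials alongside the configuration.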
MIT. See LICENSE.
| Name | Description | Type | Default | Required |
|---|---|---|---|---|
| alertmanager_email | Optional email address to subscribe to the OCI Notifications topic. The subscriber must confirm via an OCI confirmation email. | string | null | no |
| argocd_chart_version | ArgoCD Helm chart version used for the bootstrap install. Must match gitops/apps/argocd.yaml targetRevision. Managed by Renovate. | string | "9.5.14" | no |
| availability_domain | Availability domain name, e.g. 'Uocm:EU-FRANKFURT-1-AD-1' | string | n/a | yes |
| boot_volume_size_in_gbs | Boot volume size in GB for k3s nodes (servers + workers). OCI minimum is 50 GB for all shapes. With 4 k3s nodes at 50 GB each the total is 200 GB (exactly at the Always Free limit). The bastion uses OCI Bastion Service: no VM, no boot volume. | number | 50 | no |
| certmanager_chart_version | cert-manager Helm chart version used for the bootstrap install. Must match gitops/apps/cert-manager.yaml targetRevision. Managed by Renovate. | string | "v1.20.2" | no |
| certmanager_email_address | Email address for Let's Encrypt ACME registration. Must be a real address. | string | n/a | yes |
| cloudflare_api_token | Cloudflare API token. Required when enable_external_dns = true or enable_dns01_challenge = true. Create a scoped token at https://dash.cloudflare.com/profile/api-tokens with Zone:DNS:Edit permissions. | string | null | no |
| cloudflare_zone_id | Cloudflare Zone ID for the managed domain. Required when enable_external_dns = true. | string | null | no |
| cluster_name | Logical name for the cluster. Used in display names and freeform tags. | string | n/a | yes |
| compartment_ocid | OCID of the compartment where all resources are created | string | n/a | yes |
| compute_shape | OCI compute shape for k3s nodes | string | "VM.Standard.A1.Flex" | no |
| dockerhub_password | Docker Hub access token (PAT) for ArgoCD OCI Helm chart pulls. Paired with dockerhub_username. | string | "" | no |
| dockerhub_username | Docker Hub username for ArgoCD to authenticate when pulling OCI Helm charts (e.g. Envoy Gateway from registry-1.docker.io). If empty, anonymous pulls are attempted and may be rate-limited. Create a PAT at https://app.docker.com/settings/personal-access-tokens | string | "" | no |
| enable_backup | Enable weekly boot volume backups for all k3s nodes (Always Free: 5 total backups). With 4 nodes at weekly-1-week-retention there are at most 4 active backups. | bool | true | no |
| enable_bastion | Provision an OCI Bastion Service resource (managed SSH proxy, Always Free, no storage). When enabled, a STANDARD bastion is created and associated with the private subnet. Use example/get-kubeconfig.sh to retrieve kubeconfig via a Bastion session. Strongly recommended; without it, nodes are reachable only via serial console. | bool | true | no |
| enable_dns01_challenge | Configure cert-manager ClusterIssuers to use DNS-01 ACME challenge via Cloudflare instead of HTTP-01. Enables wildcard certificates (*.example.com) and works even without inbound port 80. Requires cloudflare_api_token. | bool | false | no |
| enable_external_dns | Deploy external-dns (kubernetes-sigs) configured for Cloudflare. Automatically creates/updates DNS A records when Services or Ingresses are annotated. Requires cloudflare_api_token and cloudflare_zone_id. | bool | false | no |
| enable_external_secrets | Deploy the External Secrets Operator and create a ClusterSecretStore backed by OCI Vault (instance_principal auth). Requires enable_vault = true. Workloads can then create ExternalSecret resources to sync any OCI Vault secret into a Kubernetes Secret without hard-coding values. | bool | false | no |
| enable_longhorn_backup | Provision a dedicated Always Free OCI Object Storage bucket for Longhorn PVC backups (S3-compatible). See longhorn_backup_setup output for connection instructions. Shares the 20 GB free allowance with the Terraform state bucket. | bool | true | no |
| enable_mysql | Provision an Always Free MySQL HeatWave DB system (single node, 50 GB). Creates a Kubernetes Secret 'mysql-credentials' in the default namespace. | bool | false | no |
| enable_notifications | Create an OCI Notifications topic and wire it to Alertmanager as a webhook receiver (Always Free: 1M HTTPS + 3K email/month). | bool | false | no |
| enable_object_storage_state | Provision an Always Free OCI Object Storage bucket for storing Terraform/OpenTofu state (S3-compatible API). See the terraform_state_backend output for the backend configuration snippet. | bool | true | no |
| enable_oci_logging | Enable OCI Logging for cloud-init logs. Ships /var/log/k3s-cloud-init.log to OCI Logging Service via the Unified Monitoring Agent (Always Free: 10 GB/month). | bool | true | no |
| enable_vault | Store cluster secrets (k3s_token, longhorn_ui_password, grafana_admin_password) in OCI Vault (Always Free: software keys + 150 secrets). Nodes fetch secrets via OCI CLI instance_principal at boot; plaintext values are removed from cloud-init user-data. | bool | true | no |
| environment | Deployment environment label (e.g. staging, production) | string | "staging" | no |
| expose_kubeapi | Expose the Kubernetes API server via the public NLB (restricted to my_public_ip_cidr) | bool | false | no |
| expose_ssh | Expose SSH (port 22) via the public NLB to all cluster nodes (restricted to my_public_ip_cidr). Eliminates the need for OCI Bastion sessions for day-to-day access. | bool | false | no |
| external_dns_domain_filter | Domain filter for external-dns: only DNS records under this domain are managed (e.g. 'k3s.example.com'). Required when enable_external_dns = true. | string | null | no |
| external_secrets_chart_version | External Secrets Operator Helm chart version used for the bootstrap install. Must match gitops/apps/external-secrets.yaml targetRevision. Managed by Renovate. | string | "2.4.1" | no |
| fault_domains | Fault domains to spread the instance pool across | list(string) | [ | no |
| gateway_api_version | Kubernetes Gateway API CRDs version (experimental channel) installed at bootstrap. Experimental channel is a superset of standard and includes GRPCRoute, TCPRoute, TLSRoute, etc. required by Envoy Gateway. Must exist before ArgoCD syncs gateway-config. | string | "v1.5.1" | no |
| github_ssh_keys_username | GitHub username whose published SSH keys (https://github.com/.keys) are added to every instance's authorized_keys at plan time, in addition to the primary public_key / public_key_path. Leave empty to skip. | string | "" | no |
| gitops_repo_url | Git repository URL for the ArgoCD App of Apps (e.g. https://github.com/your-org/k3s-oci.git). Set this to your fork so ArgoCD pulls from the right repo. | string | "https://github.com/mbologna/k3s-oci.git" | no |
| grafana_hostname | Fully-qualified hostname for the Grafana UI (e.g. grafana.example.com). When set, a Gateway API HTTPRoute with a cert-manager TLS certificate is created in gitops/monitoring/. | string | null | no |
| http_lb_port | Public HTTP port on the NLB frontend (default 80). | number | 80 | no |
| https_lb_port | Public HTTPS port on the NLB frontend (default 443). | number | 443 | no |
| ingress_controller_http_nodeport | NodePort on workers that the ingress controller binds for HTTP traffic | number | 30080 | no |
| ingress_controller_https_nodeport | NodePort on workers that the ingress controller binds for HTTPS traffic | number | 30443 | no |
| k3s_server_pool_size | Number of k3s control-plane nodes in the instance pool. Use 3 for HA (etcd quorum). Must be an odd number >= 1. | number | 3 | no |
| k3s_standalone_worker | When true (default), provisions one worker node as a plain oci_core_instance resource. This is the recommended approach for OCI Always Free tenancies: instance pools route requests through OCI Capacity Management which can fail for A1.Flex shapes, whereas a direct oci_core_instance reliably claims the free allocation. Default topology: 3 control-plane nodes (pool) + 1 standalone worker = 4 OCPUs / 24 GB. | bool | true | no |
| k3s_subnet | Subnet name used to derive the flannel interface. Leave 'default_route_table' to let k3s auto-detect. | string | "default_route_table" | no |
| k3s_version | k3s version to install. 'latest' resolves the current stable release at plan time via the GitHub API. | string | "latest" | no |
| k3s_worker_pool_size | Number of k3s worker nodes managed by the OCI Instance Pool. Set to 0 (default) when using k3s_standalone_worker = true, which is the recommended Always Free topology. The pool is kept to allow future scaling beyond the free tier. | number | 0 | no |
| kube_api_port | Port the k3s API server listens on | number | 6443 | no |
| longhorn_hostname | Fully-qualified hostname for the Longhorn UI (e.g. longhorn.example.com). When set, a Gateway API HTTPRoute with BasicAuth (Envoy Gateway SecurityPolicy) and a cert-manager TLS certificate is created. | string | null | no |
| longhorn_ui_username | Username for Longhorn UI BasicAuth (only used when longhorn_hostname is set). | string | "admin" | no |
| my_public_ip_cidr | Your workstation public IP in CIDR notation (e.g. 1.2.3.4/32). Restricts OCI Bastion Service session creation (enable_bastion = true) and kubeapi access via the public NLB (expose_kubeapi = true). k3s nodes are in a private subnet and are only reachable via OCI Bastion sessions. | string | n/a | yes |
| mysql_admin_username | Admin username for the MySQL HeatWave DB system. | string | "admin" | no |
| mysql_shape | MySQL HeatWave shape. 'MySQL.Free' is the Always Free shape. | string | "MySQL.Free" | no |
| oci_core_vcn_cidr | CIDR block for the VCN | string | "10.0.0.0/16" | no |
| oci_core_vcn_dns_label | n/a | string | "k3svcn" | no |
| oci_identity_dynamic_group_name | Name for the OCI dynamic group granting instances access to the OCI API | string | "k3s-cluster-dynamic-group" | no |
| oci_identity_policy_name | Name for the OCI IAM policy attached to the dynamic group | string | "k3s-cluster-policy" | no |
| os_image_id | OCID of the Ubuntu 24.04 LTS (Noble) aarch64 image for A1.Flex nodes. If null, the latest matching image is resolved automatically from the tenancy. Find OCIDs at https://docs.oracle.com/en-us/iaas/images/ | string | null | no |
| private_subnet_cidr | CIDR for the private subnet (k3s nodes) | string | "10.0.1.0/24" | no |
| private_subnet_dns_label | n/a | string | "k3sprivate" | no |
| public_key | SSH public key content placed on every instance. Preferred over public_key_path; pass the key string directly for CI pipelines where ~/.ssh does not exist. When null, the key is read from public_key_path at plan time. | string | null | no |
| public_key_path | Path to SSH public key file. Used as fallback when public_key is null. | string | "~/.ssh/id_ed25519.pub" | no |
| public_subnet_cidr | CIDR for the public subnet (load balancers and optional bastion) | string | "10.0.0.0/24" | no |
| public_subnet_dns_label | n/a | string | "k3spublic" | no |
| region | OCI region identifier (e.g. 'eu-frankfurt-1'). Required when enable_external_secrets = true for the ClusterSecretStore to locate the OCI Vault endpoint. | string | null | no |
| server_memory_in_gbs | RAM in GB per control-plane node. Total RAM must not exceed 24 GB (Always Free). | number | 6 | no |
| server_ocpus | OCPUs per control-plane node. Total OCPUs across all nodes must not exceed 4 (Always Free). | number | 1 | no |
| tenancy_ocid | OCID of the tenancy | string | n/a | yes |
| unique_tag_key | Freeform tag key applied to every resource for identification | string | "k3s-provisioner" | no |
| unique_tag_value | Freeform tag value applied to every resource for identification | string | "https://github.com/mbologna/k3s-oci" | no |
| worker_memory_in_gbs | RAM in GB per worker node. | number | 6 | no |
| worker_ocpus | OCPUs per worker node. | number | 1 | no |
| Name | Description |
|---|---|
| argocd_initial_password_hint | Command to retrieve the ArgoCD initial admin password (run after cluster is up) |
| bastion_ocid | OCID of the OCI Bastion Service resource (null if enable_bastion = false). Use with example/get-kubeconfig.sh or oci bastion session create-managed-ssh. |
| grafana_admin_credentials | Grafana admin credentials (only available after cluster bootstrap) |
| internal_lb_ip | Private IP of the internal load balancer (used by agents to join the cluster) |
| k3s_servers_private_ips | Private IPs of k3s control-plane nodes |
| k3s_standalone_worker_private_ip | Private IP of the standalone worker node (oci_core_instance, not pool-managed) |
| k3s_token | k3s cluster join token (sensitive) |
| k3s_workers_private_ips | Private IPs of k3s worker nodes (instance pool) |
| kubeconfig_hint | How to retrieve kubeconfig after cluster is up |
| longhorn_backup_setup | Instructions to connect Longhorn to the OCI Object Storage backup bucket. Null if enable_longhorn_backup = false. |
| longhorn_ui_credentials | Longhorn UI credentials (only set when longhorn_hostname is configured) |
| mysql_admin_credentials | MySQL HeatWave admin credentials (sensitive). Null if enable_mysql = false. |
| mysql_endpoint | MySQL HeatWave connection endpoint (hostname:port). Null if enable_mysql = false. |
| notification_topic_endpoint | OCI Notifications HTTPS endpoint for the Alertmanager webhook receiver (null if enable_notifications = false). |
| oci_log_group_id | OCI Log Group OCID for k3s cloud-init logs (null if enable_oci_logging = false) |
| public_nlb_ip | Public IP address of the NLB (point your DNS here) |
| ssh_command | SSH command to connect to a cluster node via the public NLB (null if expose_ssh = false). Routes to any available server. |
| terraform_state_backend | S3-compatible backend config snippet for storing Terraform state in the provisioned OCI Object Storage bucket. Replace and add S3 credentials (OCI Customer Secret Key). |
| vault_id | OCI Vault OCID (null if enable_vault = false) |