
Commit 1c981eb

MRG: Merge pull request #1 from octue/create-initial-module
Create initial module
2 parents ab172aa + 8883ecc commit 1c981eb

19 files changed: +756 -3 lines changed

.github/workflows/release.yml

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+# This workflow releases a new version of the Terraform module.
+
+name: Release
+
+# Trigger when a pull request into the main branch is closed. Note: `closed` also fires when a pull request is closed without merging.
+on:
+  pull_request:
+    types: [closed]
+    branches:
+      - main
+
+jobs:
+  release:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout Repository
+        uses: actions/checkout@v4
+
+      - name: Get module version
+        id: get-version
+        run: echo "version=$(cat VERSION.txt)" >> $GITHUB_OUTPUT
+
+      - name: Create Release
+        uses: actions/create-release@v1
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # This token is provided by Actions, no need to create your own.
+        with:
+          tag_name: ${{ steps.get-version.outputs.version }}
+          release_name: ${{ github.event.pull_request.title }}
+          body: ${{ github.event.pull_request.body }}
+          draft: false
+          prerelease: false
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# This workflow updates the pull request description with an auto-generated section containing the categorised commit
+# message headers of the pull request's commits. The auto-generated section is enclosed between two comments:
+# "<!--- START AUTOGENERATED NOTES --->" and "<!--- END AUTOGENERATED NOTES --->". Anything outside these in the
+# description is left untouched. Auto-generated updates can be skipped for a commit if
+# "<!--- SKIP AUTOGENERATED NOTES --->" is added to the pull request description.
+
+name: update-pull-request
+
+on: [pull_request]
+
+jobs:
+  description:
+    uses: octue/workflows/.github/workflows/generate-pull-request-description.yml@main
+    secrets:
+      token: ${{ secrets.GITHUB_TOKEN }}
+    permissions:
+      contents: read
+      pull-requests: write

.gitignore

Lines changed: 3 additions & 1 deletion
@@ -159,4 +159,6 @@ cython_debug/
 # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
-#.idea/
+.idea/
+
+gcp-cred*

README.md

Lines changed: 235 additions & 2 deletions
@@ -1,2 +1,235 @@
-# terraform-octue-twined
-A terraform module for creating an Octue service network.
+# terraform-octue-twined-cluster
+A Terraform module for deploying a Kubernetes cluster for an Octue Twined service network to GCP.
+
+
+# Infrastructure
+This module is designed to manage multiple environments (e.g. testing, staging, production) in the same GCP project
+simultaneously. Environments provide isolated Twined service networks that can't easily interact with service networks
+in other environments.
+
+These resources are automatically deployed for each given environment:
+- An Autopilot GKE Kubernetes cluster for running Twined service containers on. [Kueue](https://kueue.sigs.k8s.io/) is
+  installed on the cluster to provide a queueing system where questions sent to Twined services are treated as jobs
+- A Kueue cluster queue, local queue, and default resource flavour to implement the job queueing system on the cluster
+- A Pub/Sub topic for all Twined service events to be published to
+- An event handler cloud function that stores all events in the event store and dispatches question events to the
+  Kubernetes cluster as Kueue jobs
+- A service registry cloud function providing an HTTP endpoint for checking if an image exists in the artifact registry
+  repository for any requested service revisions
+- An IAM service account and roles mapped to a Kubernetes service account for the cluster to use to access the resources
+  deployed by the [terraform-octue-twined-core](https://github.com/octue/terraform-octue-twined-core) Terraform module
+- IAM roles for relevant Google service agents
+
+
+# Installation and usage
+
+> [!IMPORTANT]
+> This Terraform module must be deployed **after** the
+> [terraform-octue-twined-core](https://github.com/octue/terraform-octue-twined-core) module in the same GCP project.
+> Both must be deployed to have a cloud-based Octue Twined services network. See
+> [a live example here](https://github.com/octue/twined-infrastructure).
+
+> [!TIP]
+> Deploy this module in a separate Terraform configuration (directory/workspace) from the
+> [terraform-octue-twined-core](https://github.com/octue/terraform-octue-twined-core)
+> module. This allows the option to spin down the Kubernetes cluster while keeping the core resources that contain all
+> data produced by your Twined services available. Spinning the cluster down entirely can save on running costs in
+> periods of extended non-use while keeping all data available.
+
+Add the blocks from the example configuration below to your Terraform configuration and run:
+```shell
+terraform plan
+```
+
+If you're happy with the plan, run:
+```shell
+terraform apply
+```
+and approve the run. This will create resources whose names/IDs are prefixed with `<environment>-` where `<environment>`
+is `main` by default.
+
+## Environments
+The suggested way of managing environments is via [Terraform workspaces](https://developer.hashicorp.com/terraform/language/state/workspaces).
+You can get started right away with the `main` environment by omitting the `environment` input from the module.
+
+To create and use other environments, see the example configuration below. It contains a `locals` block that
+automatically generates the environment name from the name of the current Terraform workspace by taking the text after
+the final hyphen. This supports uniquely named environments in Terraform Cloud (which must be unique within the
+organisation) while keeping the environment prefix short but unique within your GCP project. For this to work well,
+ensure your Terraform workspace names are slugified.
+
+For example, if your Terraform workspace was called `my-project-testing`, the environment would be called `testing` and
+your resources would be named like this:
+- Pub/Sub topic: `testing.octue.services`
+- Event handler: `testing-octue-twined-service-event-handler`
+- Service registry: `testing-octue-twined-service-registry`
+- Kubernetes cluster: `testing-octue-twined-cluster`
+
+
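Note on the workspace-derived environment name: Terraform's built-in workspace is called `default`, so splitting it on hyphens would yield an environment called `default` rather than the module's `main`. The sketch below is one possible way to handle that; mapping `default` to `main` is an assumption made for illustration and is not part of this commit.

```terraform
locals {
  workspace_split = split("-", terraform.workspace)

  # Assumption: treat Terraform's built-in "default" workspace as the "main" environment.
  # Otherwise take the text after the final hyphen, as the README describes.
  environment = (
    terraform.workspace == "default"
    ? "main"
    : element(local.workspace_split, length(local.workspace_split) - 1)
  )
}
```

With this variant, a workspace called `my-project-testing` still gives `testing`, while the `default` workspace gives `main`.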
+## Example configuration
+
+```terraform
+# main.tf
+
+terraform {
+  required_version = ">= 1.8.0"
+
+  required_providers {
+    google = {
+      source  = "hashicorp/google"
+      version = "~>6.12"
+    }
+    kubernetes = {
+      source  = "hashicorp/kubernetes"
+      version = "~>2.35"
+    }
+    kubectl = {
+      source  = "gavinbunney/kubectl"
+      version = "~>1.19"
+    }
+  }
+
+}
+
+
+provider "google" {
+  project = var.google_cloud_project_id
+  region  = var.google_cloud_region
+}
+
+
+data "google_client_config" "default" {}
+
+
+provider "kubernetes" {
+  host                   = "https://${module.octue_twined_cluster.kubernetes_cluster.endpoint}"
+  token                  = data.google_client_config.default.access_token
+  cluster_ca_certificate = base64decode(module.octue_twined_cluster.kubernetes_cluster.master_auth[0].cluster_ca_certificate)
+}
+
+
+provider "kubectl" {
+  load_config_file       = false
+  host                   = "https://${module.octue_twined_cluster.kubernetes_cluster.endpoint}"
+  token                  = data.google_client_config.default.access_token
+  cluster_ca_certificate = base64decode(module.octue_twined_cluster.kubernetes_cluster.master_auth[0].cluster_ca_certificate)
+}
+
+
+# Set the environment name to the last part of the workspace name when split on hyphens.
+locals {
+  workspace_split = split("-", terraform.workspace)
+  environment     = element(local.workspace_split, length(local.workspace_split) - 1)
+}
+
+
+module "octue_twined_cluster" {
+  source                  = "git::github.com/octue/terraform-octue-twined-cluster.git?ref=0.1.0"
+  google_cloud_project_id = var.google_cloud_project_id
+  google_cloud_region     = var.google_cloud_region
+  environment             = local.environment
+  cluster_queue           = var.cluster_queue
+}
+```
+
+```terraform
+# variables.tf
+
+variable "google_cloud_project_id" {
+  type    = string
+  default = "<google-cloud-project-id>"
+}
+
+
+variable "google_cloud_region" {
+  type    = string
+  default = "<google-cloud-region>"
+}
+
+
+variable "cluster_queue" {
+  type = object(
+    {
+      name                  = string
+      max_cpus              = number
+      max_memory            = string
+      max_ephemeral_storage = string
+    }
+  )
+  default = {
+    name                  = "cluster-queue"
+    max_cpus              = 100
+    max_memory            = "256Gi"
+    max_ephemeral_storage = "10Gi"
+  }
+}
+```
+
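Note on overriding the example variables: instead of editing the defaults in `variables.tf`, the values could be supplied per environment in a variable definitions file. This is a sketch only; the file name follows Terraform's standard `terraform.tfvars` convention and every value shown is a hypothetical placeholder.

```terraform
# terraform.tfvars (all values are hypothetical placeholders)

google_cloud_project_id = "my-gcp-project"
google_cloud_region     = "europe-west1"

# Every attribute of the object type must be given, since none is declared optional.
cluster_queue = {
  name                  = "cluster-queue"
  max_cpus              = 20
  max_memory            = "64Gi"
  max_ephemeral_storage = "10Gi"
}
```

Terraform loads `terraform.tfvars` automatically, and values set there take precedence over the `default` values declared on the variables.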
+## Dependencies
+- Terraform: `>= 1.8.0, <2`
+- Providers:
+  - `hashicorp/google`: `~>6.12`
+  - `hashicorp/kubernetes`: `~>2.35`
+  - `gavinbunney/kubectl`: `~>1.19`
+- Google Cloud APIs:
+  - The Cloud Resource Manager API must be [enabled manually](https://console.developers.google.com/apis/api/cloudresourcemanager.googleapis.com)
+    before using the module
+  - All other required Google Cloud APIs are enabled automatically by the module
+
+## Authentication
+
+> [!TIP]
+> You can use the same service account as created for the [terraform-octue-twined-core](https://github.com/octue/terraform-octue-twined-core?tab=readme-ov-file#authentication)
+> module to skip steps 1 and 2.
+
+The module needs to authenticate with Google Cloud before it can be used:
+
+1. Create a service account for Terraform and assign it the `editor` and `owner` basic IAM roles
+2. Download a JSON key file for the service account
+3. If using Terraform Cloud, follow [these instructions](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#using-terraform-cloud)
+   before deleting the key file from your computer
+4. If not using Terraform Cloud, follow [these instructions](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#authentication-configuration)
+   or use another [authentication method](https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/provider_reference#authentication)
+
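Note on step 4: when not using Terraform Cloud, one option is to point the `google` provider at the downloaded key file through its `credentials` argument. The sketch below assumes a hypothetical variable, `google_cloud_credentials_file`, and a hypothetical file name chosen to match the `gcp-cred*` pattern added to `.gitignore` in this commit, so that the key is not committed.

```terraform
# Hypothetical variable holding the path to the service account key file.
variable "google_cloud_credentials_file" {
  type    = string
  default = "gcp-credentials.json" # Matches the `gcp-cred*` .gitignore pattern.
}

provider "google" {
  project     = var.google_cloud_project_id
  region      = var.google_cloud_region
  credentials = file(var.google_cloud_credentials_file)
}
```

Setting the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the key file path is an alternative that keeps the path out of the configuration entirely.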
+## Destruction
+> [!WARNING]
+> If the `deletion_protection` input is set to `true`, it must first be set to `false` and `terraform apply` run before
+> running `terraform destroy` or any other operation that would result in the destruction or replacement of the cluster.
+> Skipping this step can leave the deployment in a state that needs targeted Terraform commands and/or manual
+> configuration changes to recover from.
+
+Disable `deletion_protection`, run `terraform apply`, and then run:
+```shell
+terraform destroy
+```
+
+
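Note on the two-step destruction: the module block from the example configuration could be edited as sketched below before destroying. Only the `deletion_protection` line is new; the other inputs are copied from the example and `local.environment` is assumed to be defined as it is there.

```terraform
module "octue_twined_cluster" {
  source                  = "git::github.com/octue/terraform-octue-twined-cluster.git?ref=0.1.0"
  google_cloud_project_id = var.google_cloud_project_id
  google_cloud_region     = var.google_cloud_region
  environment             = local.environment
  cluster_queue           = var.cluster_queue

  # Step 1: set this to false and run `terraform apply` so the change reaches the cluster.
  # Step 2: only then run `terraform destroy`.
  deletion_protection = false
}
```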
+# Input reference
+
+| Name | Type | Required | Default |
+|--------------------------------------|---------------|----------|----------------------------------------------------------------------------------------|
+| `google_cloud_project_id` | `string` | Yes | N/A |
+| `google_cloud_region` | `string` | Yes | N/A |
+| `maintainer_service_account_names` | `set(string)` | No | `["default"]` |
+| `environment` | `string` | No | `"main"` |
+| `maximum_event_handler_instances` | `number` | No | `100` |
+| `maximum_service_registry_instances` | `number` | No | `10` |
+| `deletion_protection` | `bool` | No | `true` |
+| `kueue_version` | `string` | No | `"v0.10.1"` |
+| `question_default_resources` | `object` | No | `{cpus=1, memory="512Mi", ephemeral_storage="1Gi"}` |
+| `cluster_queue` | `object` | No | `{name="cluster-queue", max_cpus=10, max_memory="10Gi", max_ephemeral_storage="10Gi"}` |
+| `local_queue` | `object` | No | `{name="local-queue"}` |
+
+
+See [`variables.tf`](/variables.tf) for descriptions.
+
+
+# Output reference
+
+| Name | Type |
+|------------------------|----------|
+| `service_registry_url` | `string` |
+| `services_topic_name` | `string` |
+| `kubernetes_cluster` | `object` |
+
+See [`outputs.tf`](/outputs.tf) for descriptions.
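Note on using the outputs: a root configuration could re-export the module's outputs so that other tooling can read them with `terraform output`. This is a minimal sketch; the module label `octue_twined_cluster` comes from the example configuration and the descriptions are paraphrased assumptions rather than the text in `outputs.tf`.

```terraform
output "service_registry_url" {
  description = "URL of the service registry cloud function (assumed description)."
  value       = module.octue_twined_cluster.service_registry_url
}

output "services_topic_name" {
  description = "Name of the Pub/Sub topic that Twined service events are published to (assumed description)."
  value       = module.octue_twined_cluster.services_topic_name
}
```

The `kubernetes_cluster` output is left out here because the cluster's attributes may include sensitive values such as certificates.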

VERSION.txt

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+0.1.0

functions.tf

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
+locals {
+  artifact_registry_repository_name = "octue-twined-services"
+}
+
+resource "google_cloudfunctions2_function" "event_handler" {
+  name        = "${var.environment}-octue-twined-service-event-handler"
+  description = "A function for handling events from Octue Twined services."
+  location    = var.google_cloud_region
+
+  build_config {
+    runtime     = "python312"
+    entry_point = "handle_event"
+    source {
+      storage_source {
+        bucket = "twined-gcp"
+        object = "event_handler/0.7.2.zip"
+      }
+    }
+  }
+
+  service_config {
+    max_instance_count = var.maximum_event_handler_instances
+    available_memory   = "256M"
+    timeout_seconds    = 60
+    ingress_settings   = "ALLOW_INTERNAL_ONLY"
+    environment_variables = {
+      ARTIFACT_REGISTRY_REPOSITORY_URL   = "${var.google_cloud_region}-docker.pkg.dev/${var.google_cloud_project_id}/${local.artifact_registry_repository_name}"
+      BIGQUERY_EVENTS_TABLE              = "octue_twined.service-events"
+      KUBERNETES_CLUSTER_ID              = google_container_cluster.primary.id
+      KUBERNETES_SERVICE_ACCOUNT_NAME    = kubernetes_service_account.default.metadata[0].name
+      KUEUE_LOCAL_QUEUE                  = var.local_queue.name
+      OCTUE_SERVICES_TOPIC_NAME          = google_pubsub_topic.services_topic.name
+      QUESTION_DEFAULT_CPUS              = var.question_default_resources.cpus
+      QUESTION_DEFAULT_MEMORY            = var.question_default_resources.memory
+      QUESTION_DEFAULT_EPHEMERAL_STORAGE = var.question_default_resources.ephemeral_storage
+    }
+  }
+
+  event_trigger {
+    event_type     = "google.cloud.pubsub.topic.v1.messagePublished"
+    pubsub_topic   = google_pubsub_topic.services_topic.id
+    trigger_region = var.google_cloud_region
+    retry_policy   = "RETRY_POLICY_RETRY"
+  }
+
+  depends_on = [time_sleep.wait_for_google_apis_to_enable]
+}
+
+
+resource "google_cloudfunctions2_function" "service_registry" {
+  name        = "${var.environment}-octue-twined-service-registry"
+  description = "A lightweight service registry for Octue Twined services running on Kueue."
+  location    = var.google_cloud_region
+
+  build_config {
+    runtime     = "python312"
+    entry_point = "handle_request"
+    source {
+      storage_source {
+        bucket = "twined-gcp"
+        object = "service_registry/0.7.0.zip"
+      }
+    }
+  }
+
+  service_config {
+    max_instance_count = var.maximum_service_registry_instances
+    available_memory   = "256M"
+    timeout_seconds    = 60
+    environment_variables = {
+      ARTIFACT_REGISTRY_REPOSITORY_ID = "projects/${var.google_cloud_project_id}/locations/${var.google_cloud_region}/repositories/${local.artifact_registry_repository_name}"
+    }
+  }
+  depends_on = [time_sleep.wait_for_google_apis_to_enable]
+}
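Note on the inputs read by the event handler: `functions.tf` reads `var.question_default_resources` and `var.local_queue`, whose declarations live in `variables.tf` and are not shown in this part of the diff. The sketch below shows how they might be declared; the attribute types are assumptions inferred from the defaults listed in the README's input reference, not the module's actual declarations.

```terraform
# Hypothetical declarations, inferred from the README's input reference table.

variable "question_default_resources" {
  type = object({
    cpus              = number
    memory            = string
    ephemeral_storage = string
  })
  default = {
    cpus              = 1
    memory            = "512Mi"
    ephemeral_storage = "1Gi"
  }
}

variable "local_queue" {
  type = object({
    name = string
  })
  default = {
    name = "local-queue"
  }
}
```

Because `environment_variables` is a map of strings, a numeric value such as `cpus` would be converted to a string when it is passed to the function.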
