Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 0 additions & 28 deletions .github/workflows/tilt-ci.yml

This file was deleted.

20 changes: 20 additions & 0 deletions .github/workflows/validate-tilt.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
on:
pull_request:
push:
branches:
- main
name: Validate Tilt
jobs:
validate:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Install kubectl and Tilt
uses: yokawasa/action-setup-kube-tools@v0.13.4
with:
setup-tools: |
kubectl
tilt
- name: Validate Tiltfile
run: make validate-tilt
3 changes: 3 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -149,3 +149,6 @@ test-tparse: $(TPARSE)

tools: # @HELP Install tools.
tools: $(TOOLS)

validate-tilt: # @HELP Validate the Tiltfile and Kubernetes manifests.
tilt alpha tiltfile-result > /dev/null
235 changes: 60 additions & 175 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,8 +1,11 @@
# preprocessing-base

Enduro preprocessing child workflow base repository. This project is a basic
example and a template to be used to create new preprocessing child workflows
to be executed as part of the processing workflow in Enduro.
Enduro child workflow base repository. This project is a basic example and a
template for new Enduro child workflow projects.

Despite the `preprocessing-base` name, this repository is also a useful starting
point for projects that implement other workflow types, or projects that run
multiple workflows and Temporal workers from the same binary.

- [Existing repositories](#existing-repositories)
- [Create a new repository](#create-a-new-repository)
Expand All @@ -13,12 +16,14 @@ to be executed as part of the processing workflow in Enduro.

## Existing repositories

Projects based on this template include:

- [preprocessing-sfa](https://github.com/artefactual-sdps/preprocessing-sfa)
- [preprocessing-moma](https://github.com/artefactual-sdps/preprocessing-moma)
- [cva-enduro-workflows](https://github.com/artefactual-sdps/cva-enduro-workflows)

## Create a new repository

To create a new preprocessing child workflow project:
To create a new Enduro child workflow project:

- Use this repository as a template:
- With the link in the top right corner of this page
Expand All @@ -29,42 +34,51 @@ To create a new preprocessing child workflow project:
- The Makefile project name and the location of the installed tools
- The `appName` in the worker command
- The gci "prefix" section in the `.golangci.yml` config file
- The `CHILD_WORKFLOW_PATHS` in the local environment setup below.
- Update this readme file:
- Change the heading and initial description
- Remove the first three sections from the list above and the content
- Update the configuration based on the workflow implementation
- Update the Enduro `childWorkflows` configuration examples

## Repository requirements

This project is configured by default to be able to run the child workflow
worker inside the Enduro cluster. For that to work as expected, the projects
based on this repository must contain a `Tiltfile.enduro` file to load only
the required resources into the cluster. One of those resources must be a
persistent volume claim called `preprocessing-pvc` that will be mounted in
Enduro's a3m or Archivematica worker to be able to share the filesystem with
the preprocessing worker. This solution will only work in single node
Kubernetes clusters.
Projects based on this repository are expected to run as child workflow workers
inside the Enduro development environment. Enduro owns the local development
cluster and core services, including Temporal, MySQL and shared infrastructure.
This repository contributes only the child workflow resources needed by its
worker.

The default template includes a persistent volume claim called
`preprocessing-pvc`, mounted at `/home/preprocessing/shared` in the child
workflow worker. Enduro can mount the same volume in its a3m or Archivematica
worker so both workers can share files. This shared-volume development setup is
intended for single-node Kubernetes clusters.

Projects that do not need a shared filesystem, or that implement
non-preprocessing workflow types, should replace these manifests with the
resources required by their own workflow workers.

Check the [Enduro documentation] to enable and configure the execution of
child workflows in that environment.
Check the [Enduro development manual] for the current development environment
setup.

## Configuration

The preprocessing workers need to share the filesystem with Enduro's a3m or
Archivematica workers. They must be connected to the same Temporal server
and related to each other with the namespace, task queue and workflow name.
The default preprocessing worker needs to share the filesystem with Enduro's
a3m or Archivematica workers. It must connect to the same Temporal server and be
related to Enduro with the correct namespace, task queue and workflow name.

### Preprocessing
### Worker configuration

The required configuration for the preprocessing worker:
The required configuration for the default preprocessing worker:

```toml
debug = false
verbosity = 0
sharedPath = "/home/enduro/preprocessing"
sharedPath = "/home/preprocessing/shared"

[temporal]
address = "temporal.enduro-sdps:7233"
address = "temporal-frontend.enduro-sdps:7233"
namespace = "default"
taskQueue = "preprocessing"
workflowName = "preprocessing"
Expand All @@ -73,7 +87,7 @@ workflowName = "preprocessing"
maxConcurrentSessions = 1
```

Optional BagIt bag configuration (default values shown):
Optional BagIt bag configuration:

```toml
[bagit]
Expand All @@ -82,167 +96,48 @@ checksumAlgorithm = "sha512"

### Enduro

The preprocessing section for Enduro's configuration:
The child workflow section for Enduro's configuration:

```toml
[preprocessing]
enabled = true
extract = false
sharedPath = "/home/enduro/preprocessing"

[preprocessing.temporal]
[[childWorkflows]]
type = "preprocessing"
namespace = "default"
taskQueue = "preprocessing"
workflowName = "preprocessing"
extract = false
sharedPath = "/home/preprocessing/shared"
```

## Local environment

### Requirements

This project uses Tilt to set up a local environment building the Docker images
in a Kubernetes cluster. It has been tested with k3d, Minikube and Kind.
The supported development workflow is to run `tilt up` from the Enduro
repository and load this repository through Enduro's `CHILD_WORKFLOW_PATHS`
mechanism.

- [Docker] (v18.09+)
- [kubectl]
- [Tilt] (v0.22.2+)
Bring up the Enduro environment by following the [Enduro development manual].

A local Kubernetes cluster:
### Set up

- [k3d] _(recommended, used in CI)_
- [Minikube] _(tested)_
- [Kind] _(tested)_
The specific requirements for this template are:

It can run with other solutions like Microk8s or Docker for Desktop/Mac and
even against remote clusters, check Tilt's [Choosing a Local Dev Cluster] and
[Install] documentation for more information to install these requirements.
- clone this repository as a sibling of the Enduro repository
- configure `CHILD_WORKFLOW_PATHS=../preprocessing-base`
- configure `MOUNT_PREPROCESSING_VOLUME=true`
- run `tilt up` from the Enduro repository

Additionally, follow the [Manage Docker as a non-root user] post-install guide
so that you don’t have to run Tilt with `sudo`. _Note that managing Docker as a
non-root user is **different** from running the docker daemon as a non-root user
(rootless)._
All other development workflow details, including `.tilt.env`, live updates,
starting, stopping and clearing the environment, are documented in Enduro. This
repository can also provide local overrides through its own `.tilt.env` file,
including settings such as `TRIGGER_MODE_AUTO`.

### Requirements for development

While we run the services inside a Kubernetes cluster we recommend installing
Go and other tools locally to ease the development process.
While we run the services inside a Kubernetes cluster we recommend installing Go
and other tools locally to ease the development process.

- [Go] (1.22+)
- [Go] (1.24+)
- GNU [Make] and [GCC]

### Set up

Start a local Kubernetes cluster with a local registry. For example, with k3d:

```bash
k3d cluster create preprocessing --registry-create sdps-registry
```

Or using an existing registry:

```bash
k3d cluster create preprocessing --registry-use sdps-registry
```

Make sure kubectl is available and configured to use that cluster:

```bash
kubectl config view
```

Clone this repository and move into its folder if you have not done that
previously:

```bash
git clone git@github.com:artefactual-sdps/preprocessing-base.git
cd preprocessing-base
```

Bring up the environment:

```bash
tilt up
```

While the Docker images are built/downloaded and the Kubernetes resources are
created, hit `space` to open the Tilt UI in your browser. Check the [Tilt UI]
documentation to learn more about it.

### Live updates

Tilt, by default, will watch for file changes in the project folder and it will
sync those changes, rebuild the Docker images and recreate the resources when
necessary. However, we have _disabled_ auto-load within the Tiltfile to reduce
the use of hardware resources. There are refresh buttons on each resource in the
Tilt UI that allow triggering manual updates and re-executing jobs and local
resources. You can also set the `trigger_mode` env string to `TRIGGER_MODE_AUTO`
within your local `.tilt.env` file to override this change and enable auto mode.

### Stop/start the environment

Run `ctrl-c` on the terminal where `tilt up` is running and stop the cluster
with:

```bash
k3d cluster stop preprocessing
```

To start the environment again:

```bash
k3d cluster start preprocessing
tilt up
```

### Clear the cluster

> Check the Tilt UI helpers below to just flush the existing data.

To remove the resources created by Tilt in the cluster, execute:

```bash
tilt down
```

Note that it will take some time to delete the persistent volumes when you
run `tilt down` and flushing the existing data does not delete the cluster.
To delete the volumes immediately, you can delete the cluster.

### Delete the cluster

Deleting the cluster will remove all the resources immediately, deleting
cluster container from the host. With k3d, run:

```bash
k3d cluster delete preprocessing
```

### Tilt environment configuration

A few configuration options can be changed by having a `.tilt.env` file
located in the root of the project. Example:

```text
TRIGGER_MODE_AUTO=true
```

#### TRIGGER_MODE_AUTO

Enables live updates on code changes for the preprocessing worker.

### Tilt UI helpers

#### Submit

In the Tilt UI header there is a cloud icon/button that can trigger the
preprocessing workflow. Click the caret to set the path to a file/directory in
the host, then click the cloud icon to trigger the workflow.

#### Flush

Also in the Tilt UI header, click the trash button to flush the existing data.
This will recreate the MySQL databases and restart the required resources.

## Makefile

The Makefile provides developer utility scripts via command line `make` tasks.
Expand All @@ -263,17 +158,7 @@ GOARCH='amd64'
...
```

[enduro documentation]: https://github.com/artefactual-sdps/enduro/blob/main/docs/src/dev-manual/preprocessing.md
[docker]: https://docs.docker.com/get-docker/
[kubectl]: https://kubernetes.io/docs/tasks/tools/#kubectl
[tilt]: https://docs.tilt.dev/tutorial/1-prerequisites.html#install-tilt
[k3d]: https://k3d.io/v5.4.3/#installation
[minikube]: https://minikube.sigs.k8s.io/docs/start/
[kind]: https://kind.sigs.k8s.io/docs/user/quick-start#installation
[choosing a local dev cluster]: https://docs.tilt.dev/choosing_clusters.html
[install]: https://docs.tilt.dev/install.html
[manage docker as a non-root user]: https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user
[tilt ui]: https://docs.tilt.dev/tutorial/3-tilt-ui.html
[Enduro development manual]: https://enduro.readthedocs.io/dev-manual/devel/
[go]: https://go.dev/doc/install
[make]: https://www.gnu.org/software/make/
[gcc]: https://gcc.gnu.org/
Loading
Loading