
Commit 1c03ef9

[PRA-11] Automated tutorial testing (#221)
Added automated end-to-end tests for the tutorial.
1 parent de324cc commit 1c03ef9

18 files changed

Lines changed: 1420 additions & 220 deletions

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -28,6 +28,12 @@ override.tf.json
 # Include override files you do wish to add to version control using negated pattern
 # !example_override.tf
 
+### Tutorial test artifacts ###
+# Generated by extract_commands.py from Markdown tutorial sources
+python/tests/tutorial/*.sh
+!python/tests/tutorial/helpers.sh
+python/tests/tutorial/*/task.yaml
+
 # Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
 # example: *tfplan*
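
The new ignore patterns match the files that an extraction run writes into the test tree. A quick, hypothetical spot-check of what was generated (exact file names depend on the tutorial pages present):

```shell
# Illustrative only: list the generated scripts and Spread tasks ignored above.
ls python/tests/tutorial/*.sh python/tests/tutorial/*/task.yaml
```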

README.md

Lines changed: 18 additions & 0 deletions
@@ -43,6 +43,24 @@ For information on security features and the use of cryptography, see the [Secur
 
 Security issues in Charmed Apache Spark can be reported through [Launchpad](https://wiki.ubuntu.com/DebuggingSecurity). Please do not file GitHub issues about security issues.
 
+## Tutorial tests
+
+The tutorial (`docs/tutorial/`) is tested end-to-end using [Spread](https://github.com/canonical/spread) inside a Multipass VM. Shell commands are extracted directly from the Markdown sources, so the tutorial itself is the test.
+
+Run the full tutorial suite (extract the commands, then run the Spread tests):
+
+```bash
+tox -e tutorial
+```
+
+Only generate the test scripts from the Markdown tutorial pages (no VM needed):
+
+```bash
+tox -e tutorial-extract
+```
+
+Both commands must be run from the `python/` directory (or via `cd python && tox …`). See [python/tests/tutorial/TESTING.md](python/tests/tutorial/TESTING.md) for prerequisites, run modes, debug tips, and the full annotation reference.
+
 ## Contributing
 
 Canonical welcomes contributions to Charmed Apache Spark. Please check out our [contribution guidelines](python/CONTRIBUTING.md) if you're interested in contributing to the solution. If you truly enjoy working on open-source projects like this one and would like to be part of the OSS revolution, don't forget to check out the [career opportunities](https://canonical.com/careers/all) we have at [Canonical](https://canonical.com/).
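
The extraction step is the core of the approach: fenced `shell` blocks in the tutorial pages become the commands the tests execute. As a rough sketch of the idea only (the real extractor is `extract_commands.py`, whose implementation may differ and which also handles the `test:*` annotations shown in the diff below), the body of every fenced `shell` block in a page can be pulled out with a short awk filter:

```shell
# Rough sketch only: print the body of every ```shell fence in a tutorial page.
# The real extract_commands.py also processes test:* annotations; this does not.
awk '/^```shell$/ {in_block = 1; next} /^```$/ {in_block = 0} in_block' \
  docs/tutorial/1-environment-setup.md
```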

docs/tutorial/1-environment-setup.md

Lines changed: 62 additions & 29 deletions
@@ -4,6 +4,11 @@ myst:
 description: "Learn how to set up MicroK8s, spark-client snap, MinIO, and Juju on Ubuntu for running Charmed Apache Spark on Kubernetes."
 ---
 
+<!-- test:spread
+priority: 700
+kill-timeout: 20m
+-->
+
 (tutorial-1-environment-setup)=
 # 1. Environment setup

@@ -63,7 +68,7 @@ For the purpose of this tutorial we will be using a lightweight Kubernetes: [Mic
 
 Installing MicroK8s is as simple as running the following command:
 
-```bash
+```shell
 sudo snap install microk8s --channel=1.32-strict/stable
 ```

@@ -75,19 +80,19 @@ Let's configure MicroK8s so that the currently logged-in user has admin rights t
 
 First, set an alias `kubectl` that can be used instead of `microk8s.kubectl`:
 
-```bash
+```shell
 sudo snap alias microk8s.kubectl kubectl
 ```
 
 Then, add the current user to the `microk8s` group:
 
-```bash
+```shell
 sudo usermod -a -G snap_microk8s ${USER}
 ```
 
 Create and provide ownership of the `~/.kube` directory to the current user:
 
-```bash
+```shell
 mkdir -p ~/.kube
 sudo chown -f -R ${USER} ~/.kube
 ```
@@ -98,9 +103,16 @@ Put the group membership changes into effect:
 newgrp snap_microk8s
 ```
 
+<!-- test:run
+# Activate the snap_microk8s group for the current process (newgrp is interactive-only)
+newgrp snap_microk8s << 'NEWGRP_EOF'
+true
+NEWGRP_EOF
+-->
+
 Check the status of MicroK8s:
 
-```bash
+```shell
 microk8s status --wait-ready
 ```

@@ -124,13 +136,13 @@ addons:
 Let's generate a Kubernetes configuration file using MicroK8s and write it to `~/.kube/config`.
 This is where `kubectl` looks for the `kubeconfig` file by default.
 
-```bash
+```shell
 microk8s config | tee ~/.kube/config
 ```
 
 Now let's enable a few add-ons for features like role-based access control, local-volume storage, and load balancing.
 
-```bash
+```shell
 sudo microk8s enable rbac
 sudo microk8s enable storage hostpath-storage

@@ -143,7 +155,7 @@ sudo microk8s enable metallb:$IP_ADDR_START-$IP_ADDR_END
 
 Wait for the commands to finish running and check the list of enabled add-ons:
 
-```bash
+```shell
 microk8s status --wait-ready
 ```

@@ -171,19 +183,19 @@ The MicroK8s setup is complete.
 For Apache Spark jobs to run on top of Kubernetes, a set of resources (ServiceAccount, associated Roles, RoleBindings, etc.) needs to be created and configured.
 To simplify this task, the Charmed Apache Spark solution offers the `spark-client` snap. Install the snap:
 
-```bash
+```shell
 sudo snap install spark-client --channel 3.4/edge
 ```
 
 Let's create a Kubernetes namespace for us to use as a playground in this tutorial.
 
-```bash
+```shell
 kubectl create namespace spark
 ```
 
 We will now create a ServiceAccount that will be used to run the Spark jobs. This can be done using the `spark-client` snap, which creates the necessary Roles, RoleBindings, and other configurations along with the ServiceAccount:
 
-```bash
+```shell
 spark-client.service-account-registry create \
   --username spark --namespace spark
 ```
@@ -192,7 +204,7 @@ This command does a number of things in the background. First, it creates a Serv
 
 These resources can be viewed with `kubectl get` commands as follows:
 
-```bash
+```shell
 kubectl get serviceaccounts -n spark
 kubectl get roles -n spark
 kubectl get rolebindings -n spark
@@ -257,7 +269,7 @@ We'll use `juju` to deploy and manage the Spark History Server and a number of o
 
 To install and configure a `juju` client using a snap:
 
-```bash
+```shell
 sudo snap install juju
 mkdir -p ~/.local/share
 ```
@@ -280,18 +292,20 @@ microk8s 1 localhost k8s 0 built-in A Kubernetes Cluster
 As you can see, Juju has detected both the LXD and the K8s installation on the system.
 For us to be able to deploy Kubernetes charms, let's bootstrap a Juju controller in the `microk8s` cloud:
 
-```bash
+```shell
 juju bootstrap microk8s spark-tutorial
 ```
 
+<!-- test:wait --seconds 30 -->
+
 The creation of the new controller can be verified with the `juju controllers` command.
 The output of the command should be similar to:
 
 ```text
 Use --refresh option with this command to see the latest information.
 
 Controller       Model  User   Access     Cloud/Region        Models  Nodes  HA  Version
-spark-tutorial*  -      admin  superuser  microk8s/localhost  1       1      -   3.6.14
+spark-tutorial*  -      admin  superuser  microk8s/localhost  1       1      -   3.6.21
 ```
 
 The Juju setup is complete.
@@ -304,16 +318,18 @@ It is available as a MicroK8s [add-on](https://microk8s.io/docs/addon-minio) by
 
 Let's enable the MinIO add-on for MicroK8s.
 
-```bash
+```shell
 sudo microk8s enable minio
 ```
 
+<!-- test:wait --seconds 60 -->
+
 Authentication with MinIO is managed with an access key and a secret key.
 These credentials are generated and stored as a Kubernetes secret when the MinIO add-on is enabled.
 
 Let's fetch the credentials and export them as environment variables for later use:
 
-```bash
+```shell
 export ACCESS_KEY=$(kubectl get secret -n minio-operator microk8s-user-1 -o jsonpath='{.data.CONSOLE_ACCESS_KEY}' | base64 -d)
 export SECRET_KEY=$(kubectl get secret -n minio-operator microk8s-user-1 -o jsonpath='{.data.CONSOLE_SECRET_KEY}' | base64 -d)
 export S3_ENDPOINT=$(kubectl get service minio -n minio-operator -o jsonpath='{.spec.clusterIP}')
@@ -324,7 +340,7 @@ The MinIO add-on offers access to a built-in Web UI which can be used to interac
 
 To set up the AWS CLI, run the following commands:
 
-```bash
+```shell
 sudo snap install aws-cli --classic
 
 aws configure set aws_access_key_id $ACCESS_KEY
@@ -335,6 +351,13 @@ aws configure set endpoint_url "http://$S3_ENDPOINT"
 
 Check the tool by listing all S3 buckets:
 
+<!-- test:run
+# Retry aws s3 ls until MinIO is ready (it may take a moment after enabling the add-on)
+for i in $(seq 1 12); do
+  aws s3 ls && break || sleep 10
+done
+-->
+
 ```bash
 aws s3 ls
 ```
@@ -349,14 +372,14 @@ Let's proceed to create a new one.
 
 To create the `spark-tutorial` bucket using the AWS CLI, run:
 
-```bash
+```shell
 aws s3 mb s3://spark-tutorial
 ```
 
 We now have an S3 bucket available locally on our system!
 See for yourself by running the same command to list all buckets:
 
-```bash
+```shell
 aws s3 ls
 ```

@@ -370,7 +393,7 @@ In the Charmed Apache Spark solution, these configurations are stored in a Secre
 
 The S3 configurations can be added to the existing `spark` service account with the following command:
 
-```bash
+```shell
 spark-client.service-account-registry add-config \
   --username spark --namespace spark \
   --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider \
@@ -383,7 +406,7 @@ spark-client.service-account-registry add-config \
 
 Now check the list of configurations bound to the service account:
 
-```bash
+```shell
 spark-client.service-account-registry get-config \
   --username spark --namespace spark
 ```
@@ -403,7 +426,7 @@ spark.kubernetes.namespace=spark
 
 You can also see the configuration stored in a Kubernetes secret:
 
-```bash
+```shell
 kubectl get secret -n spark -o yaml
 ```

@@ -444,33 +467,38 @@ With that, the basic environment setup is complete!
 Throughout this tutorial you will create service accounts in several Kubernetes namespaces.
 Running `add-config` to push S3 credentials to every new account manually would quickly become repetitive.
 The [Integration Hub for Apache Spark](https://charmhub.io/spark-integration-hub-k8s) automates this:
-once deployed and integrated with an S3-compatible storage, it automatically pushes the storage credentials
+once deployed and configured with the `monitored-service-accounts` option, it automatically pushes the storage credentials
 to every service account that is managed by `spark-client` — including accounts created in the future.
 
 Let's deploy it now so that the service accounts we create in later steps receive S3 credentials automatically.
 
 Create a dedicated Juju model for the Integration Hub:
 
-```bash
+```shell
 juju add-model spark-integration-hub
 ```
 
 Deploy the Integration Hub charm and an `s3-integrator` charm to supply it with the storage configuration:
 
-```bash
+```shell
 juju deploy spark-integration-hub-k8s --channel 3/stable --trust
+juju config spark-integration-hub-k8s monitored-service-accounts="*:*"
 juju deploy s3-integrator --channel 1/stable
 juju config s3-integrator bucket=spark-tutorial path=spark-events endpoint=http://$S3_ENDPOINT
 ```
 
+<!-- test:await-idle --timeout 600 --allow-blocked s3-integrator -->
+
 The `s3-integrator` will remain in a `blocked` state until S3 credentials are provided. Set the credentials first, then integrate:
 
-```bash
+```shell
 juju run s3-integrator/leader sync-s3-credentials \
   access-key=$ACCESS_KEY secret-key=$SECRET_KEY
 juju integrate s3-integrator spark-integration-hub-k8s
 ```
 
+<!-- test:await-idle --timeout 600 -->
+
 Wait for both charms to reach `active/idle` status:
 
 ```bash
@@ -479,7 +507,7 @@ watch juju status --color
 
 Verify that the Integration Hub has automatically updated the `spark` service account:
 
-```bash
+```shell
 spark-client.service-account-registry get-config \
   --username spark --namespace spark
 ```
@@ -491,11 +519,16 @@ the S3 credentials automatically.
 
 Create the `spark-events` and `warehouse` directories in S3:
 
-```bash
+```shell
 aws s3api put-object --bucket spark-tutorial --key spark-events/
 aws s3api put-object --bucket spark-tutorial --key warehouse/
 ```
 
+<!-- test:assert
+spark-client.service-account-registry get-config --username spark --namespace spark | grep -q "fs.s3a.endpoint"
+juju status -m spark-integration-hub --format=json | jq -e '.applications."spark-integration-hub-k8s"."application-status".current == "active"'
+-->
+
 ## (Optional) Create a snapshot
 
 At this stage, you may want to create a [snapshot](https://documentation.ubuntu.com/multipass/en/latest/reference/command-line-interface/snapshot/#snapshot) of the current state, for which you need to stop the Multipass VM. Exit the VM by pressing `CTRL + D` and stop it:
