
Commit 1c03ef9

[PRA-11] Automated tutorial testing (#221)
Added automated end-to-end tests for the tutorial.
1 parent de324cc commit 1c03ef9

18 files changed

Lines changed: 1420 additions & 220 deletions

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -28,6 +28,12 @@ override.tf.json
 # Include override files you do wish to add to version control using negated pattern
 # !example_override.tf
 
+### Tutorial test artifacts ###
+# Generated by extract_commands.py from Markdown tutorial sources
+python/tests/tutorial/*.sh
+!python/tests/tutorial/helpers.sh
+python/tests/tutorial/*/task.yaml
+
 # Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
 # example: *tfplan*
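
The new ignore patterns match the files that an extraction run writes into the test tree. A quick, hypothetical spot-check of what was generated (exact file names depend on the tutorial pages present):

```shell
# Illustrative only: list the generated scripts and Spread tasks ignored above.
ls python/tests/tutorial/*.sh python/tests/tutorial/*/task.yaml
```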

README.md

Lines changed: 18 additions & 0 deletions
@@ -43,6 +43,24 @@ For information on security features and the use of cryptography, see the [Secur
 
 Security issues in Charmed Apache Spark can be reported through [Launchpad](https://wiki.ubuntu.com/DebuggingSecurity). Please do not file GitHub issues about security issues.
 
+## Tutorial tests
+
+The tutorial (`docs/tutorial/`) is tested end-to-end using [Spread](https://github.com/canonical/spread) inside a Multipass VM. Shell commands are extracted directly from the Markdown sources, so the tutorial itself is the test.
+
+Run the full tutorial suite (extract the commands, then run the Spread tests):
+
+```bash
+tox -e tutorial
+```
+
+Only generate the test scripts from the Markdown tutorial pages (no VM needed):
+
+```bash
+tox -e tutorial-extract
+```
+
+Both commands must be run from the `python/` directory (or via `cd python && tox …`). See [python/tests/tutorial/TESTING.md](python/tests/tutorial/TESTING.md) for prerequisites, run modes, debug tips, and the full annotation reference.
+
 ## Contributing
 
 Canonical welcomes contributions to Charmed Apache Spark. Please check out our [contribution guidelines](python/CONTRIBUTING.md) if you're interested in contributing to the solution. If you truly enjoy working on open-source projects like this one and would like to be part of the OSS revolution, don't forget to check out the [career opportunities](https://canonical.com/careers/all) we have at [Canonical](https://canonical.com/).
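
The extraction step is the core of the approach: fenced `shell` blocks in the tutorial pages become the commands the tests execute. As a rough sketch of the idea only (the real extractor is `extract_commands.py`, whose implementation may differ and which also handles the `test:*` annotations shown in the diff below), the body of every fenced `shell` block in a page can be pulled out with a short awk filter:

```shell
# Rough sketch only: print the body of every ```shell fence in a tutorial page.
# The real extract_commands.py also processes test:* annotations; this does not.
awk '/^```shell$/ {in_block = 1; next} /^```$/ {in_block = 0} in_block' \
  docs/tutorial/1-environment-setup.md
```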

docs/tutorial/1-environment-setup.md

Lines changed: 62 additions & 29 deletions
@@ -4,6 +4,11 @@ myst:
 description: "Learn how to set up MicroK8s, spark-client snap, MinIO, and Juju on Ubuntu for running Charmed Apache Spark on Kubernetes."
 ---
 
+<!-- test:spread
+priority: 700
+kill-timeout: 20m
+-->
+
 (tutorial-1-environment-setup)=
 # 1. Environment setup

@@ -63,7 +68,7 @@ For the purpose of this tutorial we will be using a lightweight Kubernetes: [Mic
 
 Installing MicroK8s is as simple as running the following command:
 
-```bash
+```shell
 sudo snap install microk8s --channel=1.32-strict/stable
 ```

@@ -75,19 +80,19 @@ Let's configure MicroK8s so that the currently logged-in user has admin rights t
 
 First, set an alias `kubectl` that can be used instead of `microk8s.kubectl`:
 
-```bash
+```shell
 sudo snap alias microk8s.kubectl kubectl
 ```
 
 Then, add the current user to the `microk8s` group:
 
-```bash
+```shell
 sudo usermod -a -G snap_microk8s ${USER}
 ```
 
 Create and provide ownership of the `~/.kube` directory to the current user:
 
-```bash
+```shell
 mkdir -p ~/.kube
 sudo chown -f -R ${USER} ~/.kube
 ```
@@ -98,9 +103,16 @@ Put the group membership changes into effect:
 newgrp snap_microk8s
 ```
 
+<!-- test:run
+# Activate the snap_microk8s group for the current process (newgrp is interactive-only)
+newgrp snap_microk8s << 'NEWGRP_EOF'
+true
+NEWGRP_EOF
+-->
+
 Check the status of MicroK8s:
 
-```bash
+```shell
 microk8s status --wait-ready
 ```

@@ -124,13 +136,13 @@ addons:
 Let's generate a Kubernetes configuration file using MicroK8s and write it to `~/.kube/config`.
 This is where `kubectl` looks for the `kubeconfig` file by default.
 
-```bash
+```shell
 microk8s config | tee ~/.kube/config
 ```
 
 Now let's enable a few add-ons for features like role-based access control, local-volume storage, and load balancing.
 
-```bash
+```shell
 sudo microk8s enable rbac
 sudo microk8s enable storage hostpath-storage

@@ -143,7 +155,7 @@ sudo microk8s enable metallb:$IP_ADDR_START-$IP_ADDR_END
 
 Wait for the commands to finish running and check the list of enabled add-ons:
 
-```bash
+```shell
 microk8s status --wait-ready
 ```

@@ -171,19 +183,19 @@ The MicroK8s setup is complete.
 For Apache Spark jobs to run on top of Kubernetes, a set of resources (ServiceAccount, associated Roles, RoleBindings, etc.) needs to be created and configured.
 To simplify this task, the Charmed Apache Spark solution offers the `spark-client` snap. Install the snap:
 
-```bash
+```shell
 sudo snap install spark-client --channel 3.4/edge
 ```
 
 Let's create a Kubernetes namespace for us to use as a playground in this tutorial.
 
-```bash
+```shell
 kubectl create namespace spark
 ```
 
 We will now create a ServiceAccount that will be used to run the Spark jobs. This can be done using the `spark-client` snap, which creates the necessary Roles, RoleBindings, and other configurations along with the ServiceAccount:
 
-```bash
+```shell
 spark-client.service-account-registry create \
   --username spark --namespace spark
 ```
@@ -192,7 +204,7 @@ This command does a number of things in the background. First, it creates a Serv
 
 These resources can be viewed with `kubectl get` commands as follows:
 
-```bash
+```shell
 kubectl get serviceaccounts -n spark
 kubectl get roles -n spark
 kubectl get rolebindings -n spark
@@ -257,7 +269,7 @@ We'll use `juju` to deploy and manage the Spark History Server and a number of o
 
 To install and configure a `juju` client using a snap:
 
-```bash
+```shell
 sudo snap install juju
 mkdir -p ~/.local/share
 ```
@@ -280,18 +292,20 @@ microk8s 1 localhost k8s 0 built-in A Kubernetes Cluster
 As you can see, Juju has detected both the LXD and the K8s installation on the system.
 For us to be able to deploy Kubernetes charms, let's bootstrap a Juju controller in the `microk8s` cloud:
 
-```bash
+```shell
 juju bootstrap microk8s spark-tutorial
 ```
 
+<!-- test:wait --seconds 30 -->
+
 The creation of the new controller can be verified with the `juju controllers` command.
 The output of the command should be similar to:
 
 ```text
 Use --refresh option with this command to see the latest information.
 
 Controller       Model  User   Access     Cloud/Region        Models  Nodes  HA  Version
-spark-tutorial*  -      admin  superuser  microk8s/localhost  1       1      -   3.6.14
+spark-tutorial*  -      admin  superuser  microk8s/localhost  1       1      -   3.6.21
 ```
 
 The Juju setup is complete.
@@ -304,16 +318,18 @@ It is available as a MicroK8s [add-on](https://microk8s.io/docs/addon-minio) by
 
 Let's enable the MinIO add-on for MicroK8s.
 
-```bash
+```shell
 sudo microk8s enable minio
 ```
 
+<!-- test:wait --seconds 60 -->
+
 Authentication with MinIO is managed with an access key and a secret key.
 These credentials are generated and stored as a Kubernetes secret when the MinIO add-on is enabled.
 
 Let's fetch the credentials and export them as environment variables for later use:
 
-```bash
+```shell
 export ACCESS_KEY=$(kubectl get secret -n minio-operator microk8s-user-1 -o jsonpath='{.data.CONSOLE_ACCESS_KEY}' | base64 -d)
 export SECRET_KEY=$(kubectl get secret -n minio-operator microk8s-user-1 -o jsonpath='{.data.CONSOLE_SECRET_KEY}' | base64 -d)
 export S3_ENDPOINT=$(kubectl get service minio -n minio-operator -o jsonpath='{.spec.clusterIP}')
@@ -324,7 +340,7 @@ The MinIO add-on offers access to a built-in Web UI which can be used to interac
 
 To set up the AWS CLI, run the following commands:
 
-```bash
+```shell
 sudo snap install aws-cli --classic
 
 aws configure set aws_access_key_id $ACCESS_KEY
@@ -335,6 +351,13 @@ aws configure set endpoint_url "http://$S3_ENDPOINT"
 
 Check the tool by listing all S3 buckets:
 
+<!-- test:run
+# Retry aws s3 ls until MinIO is ready (it may take a moment after enabling the add-on)
+for i in $(seq 1 12); do
+  aws s3 ls && break || sleep 10
+done
+-->
+
 ```bash
 aws s3 ls
 ```
@@ -349,14 +372,14 @@ Let's proceed to create a new one.
 
 To create the `spark-tutorial` bucket using the AWS CLI, run:
 
-```bash
+```shell
 aws s3 mb s3://spark-tutorial
 ```
 
 We now have an S3 bucket available locally on our system!
 See for yourself by running the same command to list all buckets:
 
-```bash
+```shell
 aws s3 ls
 ```

@@ -370,7 +393,7 @@ In the Charmed Apache Spark solution, these configurations are stored in a Secre
 
 The S3 configurations can be added to the existing `spark` service account with the following command:
 
-```bash
+```shell
 spark-client.service-account-registry add-config \
   --username spark --namespace spark \
   --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider \
@@ -383,7 +406,7 @@ spark-client.service-account-registry add-config \
 
 Now check the list of configurations bound to the service account:
 
-```bash
+```shell
 spark-client.service-account-registry get-config \
   --username spark --namespace spark
 ```
@@ -403,7 +426,7 @@ spark.kubernetes.namespace=spark
 
 You can also see the configuration stored in a Kubernetes secret:
 
-```bash
+```shell
 kubectl get secret -n spark -o yaml
 ```

@@ -444,33 +467,38 @@ With that, the basic environment setup is complete!
 Throughout this tutorial you will create service accounts in several Kubernetes namespaces.
 Running `add-config` to push S3 credentials to every new account manually would quickly become repetitive.
 The [Integration Hub for Apache Spark](https://charmhub.io/spark-integration-hub-k8s) automates this:
-once deployed and integrated with an S3-compatible storage, it automatically pushes the storage credentials
+once deployed and configured with the `monitored-service-accounts` option, it automatically pushes the storage credentials
 to every service account that is managed by `spark-client` — including accounts created in the future.
 
 Let's deploy it now so that the service accounts we create in later steps receive S3 credentials automatically.
 
 Create a dedicated Juju model for the Integration Hub:
 
-```bash
+```shell
 juju add-model spark-integration-hub
 ```
 
 Deploy the Integration Hub charm and an `s3-integrator` charm to supply it with the storage configuration:
 
-```bash
+```shell
 juju deploy spark-integration-hub-k8s --channel 3/stable --trust
+juju config spark-integration-hub-k8s monitored-service-accounts="*:*"
 juju deploy s3-integrator --channel 1/stable
 juju config s3-integrator bucket=spark-tutorial path=spark-events endpoint=http://$S3_ENDPOINT
 ```
 
+<!-- test:await-idle --timeout 600 --allow-blocked s3-integrator -->
+
 The `s3-integrator` will remain in a `blocked` state until S3 credentials are provided. Set the credentials first, then integrate:
 
-```bash
+```shell
 juju run s3-integrator/leader sync-s3-credentials \
   access-key=$ACCESS_KEY secret-key=$SECRET_KEY
 juju integrate s3-integrator spark-integration-hub-k8s
 ```
 
+<!-- test:await-idle --timeout 600 -->
+
 Wait for both charms to reach `active/idle` status:
 
 ```bash
@@ -479,7 +507,7 @@ watch juju status --color
 
 Verify that the Integration Hub has automatically updated the `spark` service account:
 
-```bash
+```shell
 spark-client.service-account-registry get-config \
   --username spark --namespace spark
 ```
@@ -491,11 +519,16 @@ the S3 credentials automatically.
 
 Create the `spark-events` and `warehouse` directories in S3:
 
-```bash
+```shell
 aws s3api put-object --bucket spark-tutorial --key spark-events/
 aws s3api put-object --bucket spark-tutorial --key warehouse/
 ```
 
+<!-- test:assert
+spark-client.service-account-registry get-config --username spark --namespace spark | grep -q "fs.s3a.endpoint"
+juju status -m spark-integration-hub --format=json | jq -e '.applications."spark-integration-hub-k8s"."application-status".current == "active"'
+-->
+
 ## (Optional) Create a snapshot
 
 At this stage, you may want to create a [snapshot](https://documentation.ubuntu.com/multipass/en/latest/reference/command-line-interface/snapshot/#snapshot) of the current state, for which you need to stop the Multipass VM. Exit the VM by pressing `CTRL + D` and stop it:
