
Commit bb96875

docs: Home page remodeling (#181)

* Home page remodeling
* Replaced Multipass broken links

Signed-off-by: Vladimir Izmalkov <48120135+izmalk@users.noreply.github.com>
Co-authored-by: Bikalpa Dhakal <theoctober19th@gmail.com>
1 parent a81e1db commit bb96875

10 files changed: 44 additions & 24 deletions


docs/.custom_wordlist.txt

Lines changed: 9 additions & 0 deletions
````diff
@@ -81,6 +81,15 @@ kubeconfig
 CVEs
 Canonical's
 metastore
+DBeaver
+lakehouse
+serverless
+Kaggle
+GPUs
+Kubeconfig
+ConfigMaps
+plaintext
+databag
 serverless
 performant
 configmap
````

docs/explanation/security.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -177,7 +177,7 @@ Charmed Apache Spark K8s provides external authentication capabilities for:
 Authentication to the Kubernetes API follows standard implementations, as described in the [upstream Kubernetes documentation](https://kubernetes.io/docs/reference/access-authn-authz/authentication/).
 Please make sure that the distribution being used supports the authentication used by clients, and that the Kubernetes cluster has been correctly configured.
 
-Generally, client applications store credentials information locally in a `KUBECONFIG` file.
+Generally, client applications store credentials information locally in a `kubeconfig` file.
 On the other hand, pods created by the charms and the Spark Job workloads receive credentials via shared secrets, mounted to the default locations `/var/run/secrets/kubernetes.io/serviceaccount/`.
 See the [upstream documentation](https://kubernetes.io/docs/tasks/run-application/access-api-from-pod/#directly-accessing-the-rest-api) for more information.
````
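For context, the upstream page linked in this hunk shows how a pod can call the API server with those mounted credentials. A minimal sketch, following the upstream Kubernetes example and the mount path named in the diff:

```bash
# Run from inside a pod: read the mounted service account credentials
SERVICEACCOUNT=/var/run/secrets/kubernetes.io/serviceaccount
TOKEN=$(cat ${SERVICEACCOUNT}/token)
CACERT=${SERVICEACCOUNT}/ca.crt

# Query the API server directly, authenticating with the bearer token
curl --cacert ${CACERT} \
  --header "Authorization: Bearer ${TOKEN}" \
  https://kubernetes.default.svc/api
```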

docs/how-to/deploy/environment.md

Lines changed: 8 additions & 8 deletions
````diff
@@ -98,7 +98,7 @@ Make sure that the MicroK8s cluster is now up and running:
 microk8s status --wait-ready
 ```
 
-Export the Kubernetes config file associated with admin rights and store it in the $KUBECONFIG file, e.g. `~/.kube/config`:
+Export the Kubernetes config file associated with admin rights and store it in the `$KUBECONFIG` file, e.g. `~/.kube/config`:
 
 ```bash
 export KUBECONFIG=path/to/file # Usually ~/.kube/config
@@ -113,9 +113,9 @@ microk8s.enable dns rbac storage hostpath-storage
 
 The MicroK8s cluster is now ready to be used.
 
-#### External LoadBalancer
+#### External load balancer
 
-If you want to expose the Spark History Server UI via a Traefik ingress, we need to enable an external loadbalancer:
+If you want to expose the Spark History Server UI via a Traefik ingress, we need to enable an external load balancer:
 
 ```bash
 IPADDR=$(ip -4 -j route get 2.2.2.2 | jq -r '.[] | .prefsrc')
@@ -185,13 +185,13 @@ You can then create the EKS via CLI:
 eksctl create cluster -f cluster.yaml
 ```
 
-The EKS cluster creation process may take several minutes. The cluster creation process should already update the `KUBECONFIG` file with the new cluster information. By default, `eksctl` creates a user that generates a new access token on the fly via the `aws` CLI. However, this conflicts with the `spark-client` snap that is strictly confined and does not have access to the `aws` command. Therefore, we recommend you to manually retrieve a token:
+The EKS cluster creation process may take several minutes. The cluster creation process should already update the `kubeconfig` file with the new cluster information. By default, `eksctl` creates a user that generates a new access token on the fly via the `aws` CLI. However, this conflicts with the `spark-client` snap that is strictly confined and does not have access to the `aws` command. Therefore, we recommend you to manually retrieve a token:
 
 ```bash
 aws eks get-token --region <AWS_REGION_NAME> --cluster-name spark-cluster --output json
 ```
 
-and paste the token in the KUBECONFIG file:
+and paste the token in the `kubeconfig` file:
 
 ```yaml
 users:
````
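The hunk above ends where the `users:` block begins. As an illustration of where the token from `aws eks get-token` would go, a hypothetical `kubeconfig` `users` entry with a statically pasted token might look like this (the user name and token value are placeholders, not taken from the commit):

```yaml
users:
  - name: spark-cluster-user
    user:
      # Paste the "token" value returned by `aws eks get-token` here
      token: k8s-aws-v1.EXAMPLE_TOKEN_VALUE
```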
````diff
@@ -302,9 +302,9 @@ terraform output
 # resource_group_name = "TestSparkAKSRG"
 ```
 
-#### Generating Kubeconfig file
+#### Generating the Kubeconfig file
 
-To generate the Kubeconfig file for connecting the client to the newly created cluster:
+To generate the `kubeconfig` file for connecting the client to the newly created cluster:
 
 ```bash
 az aks get-credentials --resource-group <resource_group_name> --name <aks_cluster_name> --file ~/.kube/config
@@ -437,7 +437,7 @@ The RadosGW API can then be reached at `<hostname>:<port>`, where `hostname` is
 
 #### MicroK8s MinIO
 
-If you have already a MicroK8s cluster running, you can enable the MinIO storage with the dedicated addon
+If you have already a MicroK8s cluster running, you can enable the MinIO storage with the dedicated add-on
 
 ```shell
 microk8s.enable minio
````
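After generating a `kubeconfig` with `az aks get-credentials` (AKS) or editing one by hand (EKS), it is worth confirming that the client can actually reach the cluster. A quick check, assuming `kubectl` is installed:

```bash
# Confirm the kubeconfig grants access to the new cluster
kubectl --kubeconfig ~/.kube/config get nodes
```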

docs/how-to/manage-service-accounts/using-python.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -15,7 +15,7 @@ Furthermore, you need to make sure that `PYTHONPATH` contains the location where
 
 The following snippet allows you to import relevant environment variables
 into a confined object, among which there should be an auto-inference of your
-kubeconfig file location.
+`kubeconfig` file location.
 
 ```python
 import os
````
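The Python snippet in the file is truncated in this diff after `import os`. As a rough sketch of the kind of auto-inference the paragraph describes (the variable names are illustrative, not the actual library API):

```python
import os

# Prefer an explicitly exported KUBECONFIG; otherwise fall back to the
# default location that kubectl also uses.
kubeconfig = os.environ.get("KUBECONFIG", os.path.expanduser("~/.kube/config"))
print(f"Inferred kubeconfig location: {kubeconfig}")
```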

docs/how-to/spark-history-server/expose-web-gui.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -19,7 +19,7 @@ IP=$(kubectl get pod spark-history-server-k8s-0 -n spark --template '{{.status.p
 
 ## With Ingress
 
-The Spark History server can be exposed outside a K8s cluster by means of an ingress. This is the recommended way in production for any K8s distribution. Exposing Kubernetes services through an ingress generally requires the cloud provider/infrastructure to have an external load balancer integrated with the Kubernetes cluster. Most cloud providers (such as AWS, Google and Azure) provide this integration out-of-the-box. If you are running on MicroK8s, make sure that you have enabled `metallb`, as shown in the "How-To Setup K8s" user guide.
+The Spark History server can be exposed outside a K8s cluster by means of an ingress. This is the recommended way in production for any K8s distribution. Exposing Kubernetes services through an ingress generally requires the cloud provider/infrastructure to have an external load balancer integrated with the Kubernetes cluster. Most cloud providers (such as AWS, Google and Azure) provide this integration out-of-the-box. If you are running on MicroK8s, make sure that you have enabled `metallb`, as shown in the "How-To Setup K8s" user guide.
 
 Spark History server can be exposed outside of the K8s cluster using `traefik-k8s` charm.
 If COS is enabled, you can use the ingress already provided as part of the COS bundle. Otherwise, you can deploy one using
````
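The text above breaks off just before the deployment command. A plausible sketch of the flow it describes, assuming default charm and application names (the exact relation endpoints are not shown in this commit; check the charm documentation):

```bash
# Deploy the Traefik ingress charm into the same model
juju deploy traefik-k8s --trust

# Relate it to the Spark History Server so the UI gets an external URL
juju integrate traefik-k8s spark-history-server-k8s
```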

docs/how-to/use-gpu.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -5,7 +5,7 @@ The Charmed Apache Spark solution offers an OCI image that supports the [Apache
 
 ## Setup
 
-After installing [spark-client](https://snapcraft.io/spark-client) and [Microk8s](https://microk8s.io/) with the GPU addon enabled, now we can look into how to launch Spark jobs with GPU in Kubernetes.
+After installing [spark-client](https://snapcraft.io/spark-client) and [Microk8s](https://microk8s.io/) with the GPU add-on enabled, now we can look into how to launch Spark jobs with GPU in Kubernetes.
 
 First, we need to create a pod template to limit the amount of GPU per container.
````
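The pod template itself is not shown in this hunk. For orientation, a minimal sketch of the kind of template that sentence describes, capping each executor container at one NVIDIA GPU (file name and structure are illustrative; such a template is typically passed to Spark via `spark.kubernetes.executor.podTemplateFile`):

```yaml
# gpu-pod-template.yaml (hypothetical name)
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: executor
      resources:
        limits:
          # Allow at most one GPU per executor container
          nvidia.com/gpu: 1
```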
1111

docs/index.md

Lines changed: 13 additions & 2 deletions
````diff
@@ -25,8 +25,19 @@ easy-to-use application integration, and monitoring.
 
 | | |
 |--|--|
-| [Tutorial](tutorial-introduction)</br> Learn how to use Charmed Apache Spark with our step-by-step guidance. Get started from [step one](tutorial-1-environment-setup). </br> | [How-to guides](how-to-deploy-index) </br> Practical instructions for key tasks, like [deploy](how-to-deploy-index), [manage service accounts](how-to-manage-service-accounts-index), [monitor metrics](how-to-monitoring), [process streams](how-to-streaming-jobs), [use GPU](how-to-use-gpu). |
-| [Reference](reference-index) </br> Technical information, for example: [release notes](reference-releases-index), [system requirements](reference-requirements), and [contact information](reference-contacts). | [Explanation](explanation-index) </br> Explore and grow your understanding of key topics, such as: [security](explanation-security), [cryptography](explanation-cryptography), [solution components](explanation-component-overview), [configuration](explanation-configuration), and [monitoring](explanation-monitoring). |
+| **Tutorial** | [Introduction](tutorial-introduction) • [Step 1: Environment setup](tutorial-1-environment-setup) |
+| **Deployment** | [Environment setup](how-to-deploy-environment) • [Charmed Apache Spark](how-to-deploy-spark) • [Charmed Apache Kyuubi](how-to-deploy-kyuubi) • [Requirements](reference-requirements) |
+| **Service account management** | [Integration hub](how-to-service-accounts-integration-hub) • [Python](how-to-service-accounts-python) • [Spark-client](how-to-service-accounts-spark-client) |
+| **Operations** | [Monitoring](how-to-monitoring) • Spark History Server: [Auth](how-to-spark-history-server-auth) and [web GUI](how-to-spark-history-server-expose-web-gui) • [Use K8s pods](how-to-use-k8s-pods) • [Streaming jobs](how-to-streaming-jobs) • [Use GPUs](how-to-use-gpu) |
+| **Apache Kyuubi** | [External connections](how-to-apache-kyuubi-external-connections) • [Integrate](how-to-apache-kyuubi-integrate-with-applications) • [Metastore](how-to-apache-kyuubi-external-metastore) • [Backups](how-to-apache-kyuubi-back-up-and-restore) • [Upgrades](how-to-apache-kyuubi-upgrade) • [GPU support](how-to-apache-kyuubi-gpu) |
+| **Security** | [Overview](explanation-security) • [Enable encryption (Apache Kyuubi)](how-to-apache-kyuubi-encryption-and-passwords) • [Cryptography](explanation-cryptography) • [Self-signed certificates](how-to-self-signed-certificates) |
+
+## How the documentation is organised
+
+[Tutorial](tutorial-introduction): For new users needing to learn how to use Charmed Apache Spark <br>
+[How-to guides](how-to-index): For users needing step-by-step instructions to achieve a practical goal <br>
+[Reference](reference-index): For precise, theoretical, factual information to be used while working with the charm <br>
+[Explanation](explanation-index): For deeper understanding of key Charmed Apache Spark concepts <br>
 
 ## Project and community
````

docs/reference/contacts.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -18,4 +18,4 @@ Please do NOT file GitHub issues on security topics.
 * [Charmed Apache Kafka](https://charmhub.io/kafka)
 * [Git sources for Charmed Apache Spark](https://github.com/canonical/spark-k8s-bundle)
 * [Canonical Data on Launchpad](https://launchpad.net/~data-platform)
-* [Canonical Data on Matrix](https://matrix.to/#/#charmhub-data-platform:ubuntu.com)
+* [Canonical Data on Matrix](https://matrix.to/#/#charmhub-data-platform:ubuntu.com)
````

docs/tutorial/1-environment-setup.md

Lines changed: 6 additions & 6 deletions
````diff
@@ -31,8 +31,8 @@ multipass launch --cpus 4 --memory 8G --disk 50G --name spark-tutorial 24.04
 ```{note}
 See also:
 
-* [How to create an instance](https://canonical.com/multipass/docs/create-an-instance#create-an-instance-with-a-specific-image) guide from Multipass documentation
-* [`multipass launch` command reference](https://canonical.com/multipass/docs/launch-command)
+* [How to create an instance](https://documentation.ubuntu.com/multipass/latest/how-to-guides/manage-instances/create-an-instance/#create-an-instance-with-a-specific-image) guide from Multipass documentation
+* [`multipass launch` command reference](https://documentation.ubuntu.com/multipass/latest/reference/command-line-interface/launch/)
 ```
 
 Check the status of the provisioned virtual machine:
@@ -116,13 +116,13 @@ addons:
 ```
 
 Let's generate a Kubernetes configuration file using MicroK8s and write it to `~/.kube/config`.
-This is where `kubectl` looks for the Kubeconfig file by default.
+This is where `kubectl` looks for the `kubeconfig` file by default.
 
 ```bash
 microk8s config | tee ~/.kube/config
 ```
 
-Now let's enable a few addons for using features like role based access control, usage of local volume for storage, and load balancing.
+Now let's enable a few add-ons for using features like role based access control, usage of local volume for storage, and load balancing.
 
 ```bash
 sudo microk8s enable rbac
@@ -133,7 +133,7 @@ IPADDR=$(ip -4 -j route get 2.2.2.2 | jq -r '.[] | .prefsrc')
 sudo microk8s enable metallb:$IPADDR-$IPADDR
 ```
 
-Wait for the commands to finish running and check the list of enabled addons:
+Wait for the commands to finish running and check the list of enabled add-ons:
 
 ```bash
 microk8s status --wait-ready
@@ -293,7 +293,7 @@ Apache Spark can be configured to use S3 for object storage.
 However, for this tutorial, instead of AWS S3, we'll use [MinIO](https://min.io/): a lightweight S3-compatible object storage.
 It is available as a MicroK8s [add-on](https://microk8s.io/docs/addon-minio) by default, allowing us to create a local S3 bucket, which is more convenient for our local tests.
 
-Let's enable the MinIO addon for MicroK8s.
+Let's enable the MinIO add-on for MicroK8s.
 
 ```bash
 sudo microk8s enable minio
````
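Taken together, the MicroK8s steps touched by this commit amount to a short sequence. A recap sketch, assuming the same add-ons and the `jq` address trick from the hunks above:

```bash
# Enable role-based access control and local volume storage
sudo microk8s enable rbac
sudo microk8s enable hostpath-storage

# Enable MetalLB for load balancing, using the host's primary IP as the range
IPADDR=$(ip -4 -j route get 2.2.2.2 | jq -r '.[] | .prefsrc')
sudo microk8s enable metallb:$IPADDR-$IPADDR

# Enable the MinIO add-on for S3-compatible object storage
sudo microk8s enable minio

# Wait for everything to settle and list the enabled add-ons
microk8s status --wait-ready
```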

docs/tutorial/2-distributed-data-processing.md

Lines changed: 3 additions & 3 deletions
````diff
@@ -145,7 +145,7 @@ sudo apt install zip
 unzip twitter.zip
 ```
 
-This archive unpacks a directory called `twcs` with a single csv file of the same in it.
+This archive unpacks a directory called `twcs` with a single `CSV` file of the same name in it.
 Let's upload it to our S3 storage:
 
 ```bash
@@ -170,15 +170,15 @@ spark-client.pyspark --username spark --namespace spark
 
 For distributed and parallel data processing Apache Spark actively uses the concept of a [resilient distributed dataset (RDD)](https://spark.apache.org/docs/latest/rdd-programming-guide.html#resilient-distributed-datasets-rdds), which is a fault-tolerant collection of elements that can be operated on in parallel across the nodes of the cluster.
 
-Read CSV from S3 and create an RDD from our sample dataset:
+Read `CSV` from S3 and create an RDD from our sample dataset:
 
 ```python
 rdd = spark.read.csv("s3a://spark-tutorial/twitter.csv", header=True).rdd
 ```
 
 Now that RDD can be used for parallel processing by multiple Apache Spark executors.
 
-Count the number of tweets (lines in CSV) with "text" field containing "Ubuntu" in a case insensitive way:
+Count the number of tweets (lines in `CSV`) with "text" field containing "Ubuntu" in a case insensitive way:
 
 ```python
 from operator import add
````
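The counting snippet is cut off after the import in this hunk. A sketch of how the described count could be completed with the RDD defined above (illustrative, not necessarily the tutorial's exact code):

```python
from operator import add

# Keep rows whose "text" field contains "ubuntu", ignoring case,
# then count them by summing one per matching row.
ubuntu_tweets = (
    rdd.filter(lambda row: row["text"] is not None
               and "ubuntu" in row["text"].lower())
       .map(lambda row: 1)
       .reduce(add)
)
print(ubuntu_tweets)
```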
