Commit f856835

Preparing v1.5 release
1 parent ad25d56 commit f856835

3 files changed

Lines changed: 232 additions & 2 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ The following picture shows the set of opensource solutions used so far in the c
3030

3131
## Cluster architecture and hardware
3232

33-
Home lab architecture, shown in the picture below, consists of a Kubernetes cluster of 4 nodes (1 master and 3 workers) and a firewall, built with another Raspberry Pi, to isolate the cluster network from your home network.
33+
Home lab architecture, shown in the picture below, consists of a Kubernetes cluster of 5 nodes (1 master and 4 workers) and a firewall, built with another Raspberry Pi, to isolate the cluster network from your home network.
3434

3535

3636
<p align="center">

docs/_docs/certmanager.md

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ CertManager is configured to deploy in the cluster a private PKI (Public Key Inf
4040

4141
Such private PKI will be used internally by Linkerd to issue certificates to each pod to implement mTLS communications.
4242

43-
CertManager also is configured to deliver valid certificates, using your own DNS domain, through its integration with Let's Encrypt using ACM DNS challenges. Configuration is provided for using IONOS DNS provider, using developer API available to automate challenge resolution. Similar configuration can be implemented for other supported DNS providers.
43+
CertManager also is configured to deliver valid certificates, using your own DNS domain, through its integration with Let's Encrypt using ACME DNS challenges. Configuration is provided for using IONOS DNS provider, using developer API available to automate challenge resolution. Similar configuration can be implemented for other supported DNS providers.
4444

4545
Valid certificates signed by Let's Encrypt will be used for cluster exposed services.
4646

Lines changed: 230 additions & 0 deletions
@@ -0,0 +1,230 @@
1+
---
2+
layout: post
3+
title: Kubernetes Pi Cluster release v1.5
4+
date: 2022-10-12
5+
author: ricsanfre
6+
---
7+
8+
Today I am pleased to announce the fifth release of the Kubernetes Pi Cluster project (v1.5).
9+
10+
Main features/enhancements of this release are:
11+
12+
13+
## Let's Encrypt certificates integration
14+
15+
Let's Encrypt integration has been added to CertManager to automatically generate valid TLS certificates.
16+
17+
CertManager is configured to deliver valid certificates through its integration with Let's Encrypt using ACME DNS challenges. The ACME HTTP challenge (HTTP-01), also supported by CertManager and Let's Encrypt, is not configured since it requires exposing the cluster services to the public internet.
18+
19+
Configuration is provided for the IONOS DNS provider, using its developer API to automate challenge resolution together with the [IONOS cert-manager webhook](https://github.com/fabmade/cert-manager-webhook-ionos).
20+
21+
Similar configuration can be implemented for other supported DNS providers. See the list of supported providers and further documentation in [Certmanager documentation: "ACME DNS01"](https://cert-manager.io/docs/configuration/acme/dns01/).
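
As a rough illustration of what an ACME DNS01 setup looks like in cert-manager, the following `ClusterIssuer` sketch uses a webhook solver. The issuer name, e-mail address, secret name, and the `groupName`, `solverName` and `config` values are placeholders, not taken from this project. The real values depend on the webhook being deployed and should be taken from its documentation.

```yml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-issuer              # hypothetical name
spec:
  acme:
    # Let's Encrypt production ACME endpoint
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com            # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-account-key     # hypothetical secret storing the ACME account key
    solvers:
      - dns01:
          webhook:
            groupName: acme.example.com # placeholder: must match the deployed webhook
            solverName: example-solver  # placeholder: solver name defined by the webhook
            config: {}                  # provider-specific settings (API URL, credential secret references)
```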
22+
23+
Valid certificates signed by Let's Encrypt are used for cluster exposed services. For internal services, like Linkerd, self-signed certificates are used.
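
A certificate for an exposed service is typically requested with a cert-manager `Certificate` resource. The sketch below assumes a `ClusterIssuer` called `letsencrypt-issuer` already exists. That name, the namespace, the secret name and the DNS name are all hypothetical and only illustrate the shape of the resource.

```yml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: myservice-cert                  # hypothetical name
  namespace: default                    # assumption: namespace of the exposed service
spec:
  secretName: myservice-tls             # secret where the signed certificate and key are stored
  dnsNames:
    - myservice.example.com             # placeholder: a name within your own DNS domain
  issuerRef:
    name: letsencrypt-issuer            # hypothetical ClusterIssuer name
    kind: ClusterIssuer
```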
24+
25+
[Certbot](https://certbot.eff.org/) and [certbot-dns-ionos plugin](https://github.com/helgeerbe/certbot-dns-ionos) installation details are also provided to generate Let's Encrypt certificates outside the cluster, using the same ACME DNS challenge.
26+
27+
28+
## Adding CSI Snapshot support
29+
30+
The new Kubernetes CSI feature, [Volume Snapshots](https://kubernetes.io/docs/concepts/storage/volume-snapshots/), has been enabled within the K3S cluster so that backups can be created programmatically and consistent backups can be orchestrated with Velero.
31+
32+
CSI Snapshot feature is supported by Longhorn and Velero. See Longhorn documentation: [CSI Snapshot Support](https://longhorn.io/docs/1.2.2/snapshots-and-backups/csi-snapshot-support/create-a-backup-via-csi/) and [Velero CSI Snapshots documentation](https://velero.io/docs/v1.9/csi/).
33+
34+
K3S currently does not come with a pre-integrated Snapshot Controller, which is needed to enable CSI Snapshot functionality. An [external snapshot controller](https://github.com/kubernetes-csi/external-snapshotter) has been deployed.
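
Once the snapshot controller and its CRDs are installed, snapshots are requested through the standard `snapshot.storage.k8s.io/v1` API. The following is a minimal sketch, not this project's actual manifests: resource names and the PVC name are hypothetical, and Longhorn-specific `parameters` are omitted (see the Longhorn documentation linked above). The `velero.io/csi-volumesnapshot-class` label is the one Velero's CSI support uses to select a snapshot class.

```yml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-snapshot-class         # hypothetical name
  labels:
    # lets Velero's CSI plugin pick this class for the driver below
    velero.io/csi-volumesnapshot-class: "true"
driver: driver.longhorn.io              # Longhorn CSI driver name
deletionPolicy: Delete
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-pvc-snapshot                 # hypothetical name
  namespace: default
spec:
  volumeSnapshotClassName: longhorn-snapshot-class
  source:
    persistentVolumeClaimName: my-pvc   # hypothetical existing PVC backed by Longhorn
```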
35+
36+
## Prometheus memory footprint optimization
37+
38+
Memory footprint reduction is achieved by removing all duplicate metrics from K3S monitoring. See details in [issue #67](https://github.com/ricsanfre/pi-cluster/issues/67).
39+
40+
Before the optimization, K3S duplicates came from monitoring the kube-proxy, kubelet and apiserver components. Monitoring of kube-controller-manager and kube-scheduler had already been removed previously. See [issue #22](https://github.com/ricsanfre/pi-cluster/issues/22).
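
One possible way to stop scraping the same K3S metrics several times is to disable the per-component scrape targets that the kube-prometheus-stack chart creates by default. The fragment below is a hedged sketch of such Helm values, not necessarily the configuration used in this project (see issue #67 for the actual change). With all of these disabled, a single `ServiceMonitor` against one K3S endpoint would have to be defined separately to keep one copy of the metrics.

```yml
# kube-prometheus-stack Helm values (fragment, illustrative only)
kubeControllerManager:
  enabled: false    # no separate scrape target for kube-controller-manager
kubeScheduler:
  enabled: false    # no separate scrape target for kube-scheduler
kubeProxy:
  enabled: false    # no separate scrape target for kube-proxy
kubeApiServer:
  enabled: false    # no separate scrape target for the API server
kubelet:
  enabled: false    # assumption: kubelet metrics are scraped by a manually defined ServiceMonitor
```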
41+
42+
**Before removing K3S duplicates**:
43+
44+
| Active Series | Memory Usage |
|:---:|:---:|
| ![Prometheus_Active_series_before](https://user-images.githubusercontent.com/84853324/187235196-15aa874d-7ffe-434e-b14a-1c2a41364b79.png) | ![Prometheus_memory_before](https://user-images.githubusercontent.com/84853324/187235370-75064b56-ce58-4f4a-92a1-5d52d429d58c.png) |
47+
48+
49+
Number of active time series: 157k
50+
51+
Memory usage: 1GB
52+
53+
**After removing duplicates**
54+
55+
| Active Series | Memory Usage |
|:---:|:---:|
| ![Prometheus_Active_series_after](https://user-images.githubusercontent.com/84853324/187251837-6b49bc30-29a3-436f-9627-a86ecbb48f37.png) | ![Prometheus_memory_after](https://user-images.githubusercontent.com/84853324/187251961-7eae10e5-bc04-4375-94da-49680654e4c9.png) |
58+
59+
Number of active time series: 73k
60+
61+
Memory usage: 550 MB
62+
63+
The number of active time series has been reduced from 157k to 73k (a reduction of about 53%) and memory consumption has been reduced from 1GB to 550 MB (a reduction of about 45%).
64+
65+
66+
## Upgrade Linkerd to version 2.12
67+
68+
Linkerd has been upgraded to the latest stable version, 2.12, released in August. See this [Linkerd announcement](https://buoyant.io/blog/announcing-linkerd-2-12).
69+
70+
New features of release 2.12:
71+
- Per-route policies
72+
- [Kubernetes Gateway API](https://gateway-api.sigs.k8s.io/) support
73+
- Access logging
74+
75+
The installation procedure in this release is completely different from previous releases.
76+
77+
78+
## Ansible Playbooks Improvements
79+
80+
### Encrypt passwords and keys used in playbooks with Ansible Vault
81+
82+
All passwords and keys that were previously stored in plain text within Ansible variables are now encrypted using [Ansible Vault](https://docs.ansible.com/ansible/latest/user_guide/vault.html).
83+
84+
85+
Solution implemented:
86+
87+
- Include all secrets and keys in a specific variables YAML file, `vault.yml`, located in the `vars` directory.
88+
89+
```yml
---
# Encrypted variables - Ansible Vault
vault:
  # SAN
  san:
    iscsi:
      node_pass: s1cret0
      password_mutual: 0tr0s1cret0
  # K3s secrets
  k3s:
    k3s_token: s1cret0
  # traefik secrets
  traefik:
    basic_auth_passwd: s1cret0
  # Minio S3 secrets
  minio:
    root_password: supers1cret0
    longhorn_key: supers1cret0
    velero_key: supers1cret0
    restic_key: supers1cret0
  # elastic search
  elasticsearch:
    admin_password: s1cret0
  # Fluentd
  fluentd:
    shared_key: s1cret0
  # Grafana
  grafana:
    admin_password: s1cret0
```
120+
121+
- Encrypt the file with Ansible vault
122+
123+
```shell
ansible-vault encrypt vault.yml
```
126+
127+
Provide ansible vault password to encrypt the file.
128+
129+
The file can be decrypted using the following command
130+
131+
```shell
ansible-vault decrypt vault.yml
```
134+
135+
- Reference the vault variables in playbooks, group_vars, etc.
136+
137+
For example, in the `k3s_cluster` group variables:
138+
139+
```yml
# k3s shared token
k3s_token: "{{ vault.k3s.k3s_token }}"
```
143+
144+
All variables encrypted by Ansible Vault are referenced through the `vault` YAML dictionary, so they can be clearly identified and their values located in the `vault.yml` file.
145+
146+
- Include task to load vault variables file in each playbook's pre-task section:
147+
148+
```yml
- name: my_playbook
  hosts: my_server
  pre_tasks:
    - name: Include vault variables
      include_vars: "vars/vault.yml"
      tags: ["always"]
  roles:
    ....
```
158+
159+
- Execute ansible playbooks with `--ask-vault-pass` argument, so the password used to encrypt vault file can be provided when starting the playbook.
160+
161+
```shell
ansible-playbook my-playbook.yml --ask-vault-pass
```
164+
165+
### Automatic provision of Prometheus Rules from yaml files
166+
167+
Creation of the `PrometheusRule` resources, which Prometheus Operator uses to configure Prometheus rules, has been automated. Individual rules are defined as YAML files.
168+
169+
The existing functionality that automatically provisions Grafana dashboards from JSON files located in a directory (`dashboards`) has been replicated: Prometheus rules in YAML format, located in the `rules` directory, are used to create `PrometheusRule` objects.
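
For reference, a `PrometheusRule` object generated from one of those rule files would look roughly like the sketch below. The object name, namespace, `release` label and the rule itself are illustrative assumptions. In particular, the label has to match whatever rule selector the Prometheus instance is configured with.

```yml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-rules                   # hypothetical name
  namespace: monitoring                 # assumption: namespace where Prometheus Operator runs
  labels:
    release: kube-prometheus-stack      # assumption: label matched by the Prometheus rule selector
spec:
  groups:
    - name: example.rules
      rules:
        # fires when any scrape target has been unreachable for 5 minutes
        - alert: TargetDown
          expr: up == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "A scrape target is down"
```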
170+
171+
## Upgrade software components to latest stable version
172+
173+
174+
| Type | Software | Latest Version tested | Notes |
|-----------| ------- |-------|----|
| OS | Ubuntu | 20.04.3 | OS needs to be tweaked for Raspberry Pi when booting from external USB |
| Control | Ansible | 2.12.1 | |
| Control | cloud-init | 21.4 | version pre-integrated into Ubuntu 20.04 |
| Kubernetes | K3S | v1.24.6 | K3S version|
| Kubernetes | Helm | v3.6.3 ||
| Metrics | Kubernetes Metrics Server | v0.5.2 | version pre-integrated into K3S |
| Computing | containerd | v1.6.8-k3s1 | version pre-integrated into K3S |
| Networking | Flannel | v0.19.2 | version pre-integrated into K3S |
| Networking | CoreDNS | v1.9.1 | version pre-integrated into K3S |
| Networking | Metal LB | v0.13.5 | Helm chart version: metallb-0.13.5 |
| Service Mesh | Linkerd | v2.12.1 | Helm chart version: linkerd-control-plane-1.9.3 |
| Service Proxy | Traefik | v2.9.1 | Helm chart: traefik-13.0.0 |
| Storage | Longhorn | v1.3.1 | Helm chart version: longhorn-1.3.1 |
| SSL Certificates | Certmanager | v1.9.1 | Helm chart version: cert-manager-v1.9.1 |
| Logging | ECK Operator | 2.4.0 | Helm chart version: eck-operator-2.4.0 |
| Logging | Elastic Search | 8.1.2 | Deployed with ECK Operator |
| Logging | Kibana | 8.1.2 | Deployed with ECK Operator |
| Logging | Fluentbit | 1.9.9 | Helm chart version: fluent-bit-0.20.9 |
| Logging | Fluentd | 1.15.2 | Helm chart version: 0.3.9. [Custom docker image](https://github.com/ricsanfre/fluentd-aggregator) from official v1.15.2|
| Monitoring | Kube Prometheus Stack | 0.60.1 | Helm chart version: kube-prometheus-stack-41.0.0 |
| Monitoring | Prometheus Operator | 0.59.2 | Installed by Kube Prometheus Stack. Helm chart version: kube-prometheus-stack-41.0.0 |
| Monitoring | Prometheus | 2.39 | Installed by Kube Prometheus Stack. Helm chart version: kube-prometheus-stack-41.0.0 |
| Monitoring | AlertManager | 0.24 | Installed by Kube Prometheus Stack. Helm chart version: kube-prometheus-stack-41.0.0 |
| Monitoring | Grafana | 9.1.7 | Helm chart version grafana-6.32.10. Installed as dependency of Kube Prometheus Stack chart. Helm chart version: kube-prometheus-stack-41.0.0 |
| Monitoring | Prometheus Node Exporter | 1.3.1 | Helm chart version: prometheus-node-exporter-3.3.1. Installed as dependency of Kube Prometheus Stack chart. Helm chart version: kube-prometheus-stack-41.0.0 |
| Monitoring | Prometheus Elasticsearch Exporter | 1.5.0 | Helm chart version: prometheus-elasticsearch-exporter-4.15.0 |
| Backup | Minio | RELEASE.2022-09-22T18-57-27Z | |
| Backup | Restic | 0.12.1 | |
| Backup | Velero | 1.9.2 | Helm chart version: velero-2.31.9 |
{: .table }
206+
207+
208+
## Release v1.5.0 Notes
209+
210+
Upgrade backup service adding Kubernetes CSI Snapshot feature, Prometheus memory optimization removing K3S duplicate metrics, enabling Let's Encrypt TLS certificates, and upgrading Linkerd to release 2.12.
211+
212+
### Release Scope:
213+
214+
- Use of Let's Encrypt TLS certificates
  - Certmanager configuration of Let's Encrypt support. ACME DNS01 challenge provider
  - Certbot deployment
  - IONOS DNS provider integration
- Upgrade backup service adding CSI Snapshot support
  - Enable Kubernetes CSI Snapshot feature, installing external snapshot controller.
  - Configure Longhorn CSI Snapshots support
  - Configure Velero CSI Snapshot support
- Prometheus memory footprint optimization
  - Removal of duplicate metrics coming from K3S endpoints.
- Upgrade Linkerd to version 2.12
- Ansible Playbooks improvements
  - Encrypt passwords and keys used in playbooks with Ansible Vault
  - Automatic provision of Prometheus Rules from yaml files.
228+
229+
230+
