
Commit df64f6e

docs: Update Fast Track and IP reuse documentation
Update the Metal3 book with Fast Track provisioning and IP reuse documentation
from CAPM3 PR #3111. Changes:

- Update fast_track.md with behavior matrix, configuration details, PR #3106
  DisablePowerOff+AutomatedCleaningMode behavior, and corrected log message
- Update ip_reuse.md with correct flag names, variables: key in clusterctl
  config, accurate Metal3Data labels, and fixed troubleshooting section
- Add both pages to SUMMARY.md and features.md

Related: metal3-io/cluster-api-provider-metal3#3111

Signed-off-by: Maximilian Rink <maximilian.rink@telekom.de>
1 parent 79a8274 commit df64f6e

4 files changed

Lines changed: 342 additions & 0 deletions


docs/user-guide/src/SUMMARY.md

Lines changed: 2 additions & 0 deletions
@@ -59,6 +59,8 @@
 - [Annotation-based IPPool](capm3/annotation_based_ippool.md)
 - [Controller pod placement](capm3/pod_placement.md)
 - [ProviderID Workflow](capm3/providerid-workflow.md)
+- [Fast Track](capm3/fast_track.md)
+- [IP Reuse](capm3/ip_reuse.md)
 - [Ip-address-manager](ipam/introduction.md)
 - [Install Ip-address-manager](ipam/ipam_installation.md)
 - [Troubleshooting FAQ](troubleshooting.md)
docs/user-guide/src/capm3/fast_track.md

Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
# Fast Track Mode

## Overview

Fast Track mode is an optimization feature in CAPM3 that allows BareMetalHosts
to skip the deprovisioning power-off cycle during machine deletion when certain
conditions are met. This can significantly speed up cluster upgrades and node
replacements by keeping hosts powered on and ready for immediate re-use.

## How It Works

When a Metal3Machine is deleted, CAPM3 normally instructs the BareMetalHost to
power off and deprovision. With Fast Track mode enabled, CAPM3 can keep the host
powered on if automated cleaning is configured to only clean metadata.

The behavior is controlled by three factors:

1. **BareMetalHost.Spec.DisablePowerOff**: If `true`, the host always stays
   online (highest priority)
1. **CAPM3_FAST_TRACK environment variable**: Set to `true` or `false`
1. **BareMetalHost.Spec.AutomatedCleaningMode**: Set to `disabled` or `metadata`

### Behavior Matrix

| DisablePowerOff | CAPM3_FAST_TRACK | AutomatedCleaningMode | BMH Online Status        |
| --------------- | ---------------- | --------------------- | ------------------------ |
| true            | any              | any                   | **On** (DisablePowerOff) |
| false           | false            | disabled              | Off                      |
| false           | true             | disabled              | Off                      |
| false           | false            | metadata              | Off                      |
| false           | true             | metadata              | **On** (Fast Track)      |

Since PR #3106 (`Allow disable_power_off together with autoclean`, merged on
2026-03-05), `DisablePowerOff=true` can be combined with
`AutomatedCleaningMode=metadata`. In that case, `DisablePowerOff` still takes
priority and keeps the host online.

The host remains online when:

- `DisablePowerOff` is `true` (takes priority over all other settings), or
- `AutomatedCleaningMode` is set to `metadata` (not `disabled`) AND
  `CAPM3_FAST_TRACK` is set to `true`

When the Fast Track conditions are met, the BareMetalHost remains powered on
after the Metal3Machine is deleted, allowing it to be quickly re-claimed by a
new Metal3Machine without waiting for a full power cycle.
## When to Use Fast Track

Fast Track is beneficial in scenarios where:

- **Rolling upgrades**: Nodes are being replaced one at a time, and speed is
  important
- **Controlled environments**: You trust that metadata cleaning is sufficient
  (no need for full disk wipes)
- **Development/testing**: Quick iteration cycles where full deprovisioning
  adds unnecessary delay

Fast Track should be avoided when:

- **Security is paramount**: Full disk wiping is required between tenants or
  workloads
- **Disk cleaning is needed**: The `disabled` AutomatedCleaningMode is used
  intentionally
- **Troubleshooting**: You need hosts to fully deprovision to reset state
## Configuration

### Enabling Fast Track

Fast Track is configured via the `CAPM3_FAST_TRACK` environment variable on the
CAPM3 controller. It defaults to `false`.

#### Via clusterctl

Add to your clusterctl configuration file:

```yaml
variables:
  CAPM3_FAST_TRACK: "true"
```
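When set this way, the variable is substituted into the CAPM3 manifests at
provider installation or upgrade time. A minimal sketch, assuming a standard
clusterctl-managed installation:

```bash
# Install (or re-install) the Metal3 provider; clusterctl reads
# CAPM3_FAST_TRACK from its configuration file at this point.
clusterctl init --infrastructure metal3
```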
#### Via Environment Variable

If deploying the controller directly, set the environment variable:

```bash
export CAPM3_FAST_TRACK=true
```
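If the controller is already running, one way to flip the variable is to patch
its Deployment in place. A hedged sketch; the namespace and deployment name
(`capm3-system`, `capm3-controller-manager`) are the usual defaults and may
differ in your installation:

```bash
# Assumes default CAPM3 namespace/deployment names; adjust to your setup.
kubectl set env -n capm3-system deployment/capm3-controller-manager \
  CAPM3_FAST_TRACK=true
```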
### Configuring AutomatedCleaningMode

For Fast Track to work, your BareMetalHost resources must have
`AutomatedCleaningMode` set to `metadata`. This can also be combined with
`DisablePowerOff=true`; in that case `DisablePowerOff` remains the deciding
factor for keeping the host online. Refer to the
[Baremetal Operator Documentation](https://book.metal3.io/bmo/introduction)
for details on configuring this field.
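For illustration, a minimal BareMetalHost fragment with the relevant field set;
the name, namespace, and omitted fields (BMC credentials, image, and so on) are
placeholders:

```yaml
# Minimal sketch; only the fields relevant to Fast Track are shown.
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: node-0
  namespace: metal3
spec:
  automatedCleaningMode: metadata
  online: true
```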
## Behavior During Machine Deletion

When a Metal3Machine is being deleted:

1. CAPM3 clears the BareMetalHost's image, customDeploy, userData, metaData,
   and networkData references
1. Based on the AutomatedCleaningMode and CAPM3_FAST_TRACK values, CAPM3 sets
   the BMH's `Online` field:
   - **Fast Track active**: Host stays online (`Online: true`)
   - **Fast Track inactive**: Host is powered off (`Online: false`)
1. CAPM3 waits for the host to reach an available state before fully releasing
   it
1. The ConsumerRef and other association data are cleared
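To watch the resulting `Online` value and provisioning state while a deletion
is in flight, something like the following can be used (the `metal3` namespace
is an example):

```bash
# Columns: host name, spec.online, current provisioning state.
kubectl get bmh -n metal3 -o custom-columns=\
NAME:.metadata.name,ONLINE:.spec.online,STATE:.status.provisioning.state
```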
## Monitoring

When Fast Track is active, you'll see log messages like:

```text
Set host Online field based on DisablePowerOff, AutomatedCleaningMode, and Capm3FastTrack host=node-0 automatedCleaningMode=metadata hostSpecOnline=true
```

When Fast Track keeps a host online, the BareMetalHost will transition directly
from `Provisioned` to `Available` without going through `Deprovisioning` with a
power-off cycle.
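To pull this message from a running controller, a hedged example (default
deployment and namespace names assumed):

```bash
# Fetch CAPM3 controller logs and filter for the Fast Track decision line.
kubectl logs -n capm3-system deployment/capm3-controller-manager \
  | grep "Set host Online field"
```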
## Troubleshooting

### Host Not Staying Online

If hosts are being powered off despite Fast Track being enabled:

1. Verify `CAPM3_FAST_TRACK` is set to `true` (not `True` or `1`); see the
   checks below
1. Check that `AutomatedCleaningMode` is `metadata`, not `disabled`
1. Review controller logs for the decision logic
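Both settings can be checked from the cluster; the deployment, namespace, and
host names below are the usual defaults and may differ in your installation:

```bash
# Inspect the controller's environment for CAPM3_FAST_TRACK.
kubectl get deployment -n capm3-system capm3-controller-manager \
  -o jsonpath='{.spec.template.spec.containers[*].env}'

# Confirm the cleaning mode on a given host.
kubectl get bmh -n metal3 node-0 -o jsonpath='{.spec.automatedCleaningMode}'
```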
### Host Stuck in Deprovisioning

If a host appears stuck, it may be waiting for cleaning to complete. Check:

1. The Ironic conductor logs for cleaning status
1. The BareMetalHost status for any error messages
1. Whether the host's BMC is accessible

## Related Documentation

- [Automated Cleaning](./automated_cleaning.md) - Details on automated cleaning
  modes
- [IP Reuse](./ip_reuse.md) - Related feature for predictable IP allocation
  during upgrades
- [Baremetal Operator Automated Cleaning](../bmo/automated_cleaning.md) - BMH
  lifecycle and cleaning modes

docs/user-guide/src/capm3/features.md

Lines changed: 2 additions & 0 deletions
@@ -10,3 +10,5 @@
 - [Failure domain](./failure_domain.md)
 - [Annotation-based IPPool](./annotation_based_ippool.md)
 - [Controller pod placement](./pod_placement.md)
+- [Fast Track](./fast_track.md)
+- [IP Reuse](./ip_reuse.md)
docs/user-guide/src/capm3/ip_reuse.md

Lines changed: 187 additions & 0 deletions
@@ -0,0 +1,187 @@
# IP Reuse (BMH Name-Based Preallocation)

## Overview

IP reuse, also known as BMH Name-Based Preallocation, is a feature that enables
predictable IP address assignment to BareMetalHosts. This is particularly useful
during rolling upgrades where you want nodes to retain their IP addresses even
as the underlying Metal3Machine and Metal3Data objects are recreated.

## Default Behavior (Without BMH Name-Based Preallocation)

An IPPool is an object representing a set of IP address pools to be used for
IPAddress allocations, and an IPClaim is an object representing a request for
an IPAddress allocation. By default, the IPClaim object name is structured as
follows:

**IPClaimName** = **Metal3DataName** + **(-)** + **IPPoolName**

Example: metal3datatemplate-0-pool0

The `Metal3DataName` is derived from the `Metal3DataTemplateName` with an added
index (`Metal3DataTemplateName-index`), and the `IPPoolName` comes from the
IPPool object directly. (See the
[IP Address manager](../ipam/introduction.md)
for more details on these objects.) In the CAPM3 workflow, when a Metal3Machine
is created and a Metal3Data object is requested, the `index` appended to the
`Metal3DataTemplateName` is chosen randomly. For example, imagine two
Metal3Machines, `metal3machine-0` and `metal3machine-1`, which get the
Metal3Data objects `metal3datatemplate-0` and `metal3datatemplate-1`
respectively. If two nodes are upgraded at a time, there is no guarantee that
the same indices will be appended to the respective objects; in fact, the order
can be completely reversed (i.e. `metal3machine-0` gets `metal3datatemplate-1`
and `metal3machine-1` gets `metal3datatemplate-0`). To make the name
predictable, the IPClaim object name is instead structured using the
BareMetalHost name, as follows:

**IPClaimName** = **BareMetalHostName** + **(-)** + **IPPoolName**

Example: node-0-pool0

Now the first part, `BareMetalHostName`, is the name of the BareMetalHost
object, which always stays the same once created (predictable). The second
part is kept unchanged.
## What is the use of the PreAllocations field?

Once we have a predictable `IPClaimName`, we can make use of the
`PreAllocations map[string]IPAddressStr` field in the IPPool object to achieve
our goal.

We simply add the claim name(s) using the new format (BareMetalHost name
included) to the `preAllocations` field in the `IPPool`, i.e.:

```yaml
apiVersion: ipam.metal3.io/v1alpha1
kind: IPPool
metadata:
  name: baremetalv4-pool
  namespace: metal3
spec:
  clusterName: test1
  gateway: 192.168.111.1
  namePrefix: test1-bmv4
  pools:
    - end: 192.168.111.200
      start: 192.168.111.100
      prefix: 24
  preAllocations:
    node-0-pool0: 192.168.111.101
    node-1-pool0: 192.168.111.102
status:
  indexes:
    node-0-pool0: 192.168.111.101
    node-1-pool0: 192.168.111.102
```
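After applying such a pool, the reserved mappings can be confirmed on the live
object (names follow the example above):

```bash
# Verify the preAllocations map on the example pool.
kubectl get ippool baremetalv4-pool -n metal3 -o jsonpath='{.spec.preAllocations}'
```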
Since claim names include the BareMetalHost name, we are able to predict the
IPAddress assigned to a specific node.

## How to Enable BMH Name-Based Preallocation

To enable the feature, a boolean flag called `enable-bmh-name-based-preallocation`
was added. It is configurable via clusterctl and can be set in the clusterctl
configuration file by the user.

### Via clusterctl Configuration

Add to your `$XDG_CONFIG_HOME/cluster-api/clusterctl.yaml` (typically
`~/.config/cluster-api/clusterctl.yaml`):

```yaml
variables:
  ENABLE_BMH_NAME_BASED_PREALLOCATION: "true"
```

### Via Controller Flag

The CAPM3 controller accepts a flag:

```bash
--enable-bmh-name-based-preallocation=true
```

This flag enables the BMH name-based IPClaim naming scheme.
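For reference, a sketch of where the flag typically ends up in the controller
Deployment; the container name and surrounding fields are assumptions and will
vary between installations:

```yaml
# Illustrative Deployment fragment only; not a complete manifest.
spec:
  template:
    spec:
      containers:
        - name: manager
          args:
            - --enable-bmh-name-based-preallocation=true
```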
## Use Cases

### Rolling Upgrades with Stable IPs

When performing a rolling upgrade of your cluster:

1. Each BareMetalHost has a stable name (e.g., `node-0`, `node-1`)
1. With preallocation enabled, IPClaims are named using the BMH name
1. Pre-populate the IPPool's `preAllocations` field with the desired mappings
1. As nodes are upgraded, they automatically receive their pre-assigned IPs

### DNS and Certificate Management

Stable IP addresses simplify:

- DNS record management (no need to update records after upgrades)
- Certificate provisioning (certificates tied to specific IPs remain valid)
- Firewall rules (static IP-based rules don't need updates)

### Multi-Cluster Deployments

When managing multiple clusters, predictable IPs help with:

- Network segmentation and planning
- Monitoring and alerting configurations
- Load balancer backend configurations
## Interaction with Metal3Data Labels

When BMH name-based preallocation is enabled, additional labels are added to
Metal3Data objects to track the association:

- `infrastructure.cluster.x-k8s.io/data-name` (`DataLabelName`) stores the
  Metal3Data name
- `infrastructure.cluster.x-k8s.io/pool-name` (`PoolLabelName`) stores the
  referenced pool name

These labels make it possible to track which Metal3Data object and pool were
used for a given allocation.
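To inspect these labels on a live cluster (the namespace is an example):

```bash
# Show Metal3Data objects together with their labels.
kubectl get metal3data -n metal3 --show-labels
```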
## Considerations

### BareMetalHost Naming

For this feature to work effectively:

- BareMetalHost names must be stable and predictable
- Avoid using generated names that change between deployments
- Use meaningful names that reflect the physical hardware (e.g., rack position)

### IPPool Configuration

When setting up preAllocations:

- The claim name format is: `{bmh-name}-{pool-name}`
- Ensure all expected BMH names are covered in the preAllocations map
- IPs in preAllocations are reserved and won't be allocated to other claims

### Cleanup

When a Metal3Machine is deleted:

- The IPClaim is released
- The preallocated IP remains reserved in the pool
- When a new Metal3Machine claims the same BMH, it gets the same IP
## Troubleshooting

### IP Not Being Reused

1. Verify `--enable-bmh-name-based-preallocation` or
   `ENABLE_BMH_NAME_BASED_PREALLOCATION` is enabled
1. Check that the IPPool `preAllocations` field includes the correct mapping
1. Verify the claim name format matches: `{bmh-name}-{pool-name}`; the check
   below shows the live IPClaim names
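A quick way to see which claim names are actually in use (the namespace is an
example):

```bash
# With preallocation enabled, names should look like node-0-pool0.
kubectl get ipclaims -n metal3
```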
### IPClaim Name Mismatch

If IPClaims are not using BMH names:

1. Check controller logs for preallocation-related messages
1. Verify the `--enable-bmh-name-based-preallocation` flag is properly set on
   the controller deployment
1. Restart the controller after changing the configuration
