Sync PR #1306 (Launching Kubernetes on Windows Clusters documentation) from upstream #85
= Launching Kubernetes on Windows Clusters

When provisioning a xref:cluster-deployment/custom-clusters/custom-clusters.adoc[custom cluster], Rancher uses RKE2 to install Kubernetes on your existing nodes.

In a Windows cluster provisioned with Rancher, the cluster must contain both Linux and Windows nodes. The Kubernetes control plane can run only on Linux nodes; Windows nodes can have only the worker role and are used only for deploying workloads.

Some other requirements for Windows clusters include:

* You can only add Windows nodes to a cluster if Windows support is enabled when the cluster is created. Windows support cannot be enabled for existing clusters.
* A supported Kubernetes version is required. See the https://www.suse.com/suse-rke2/support-matrix/all-supported-versions/[support matrices for RKE2 versions].
* The Calico network provider must be used.
* Windows nodes must have 50 GB of disk space.

For the full list of requirements, see <<_requirements_for_windows_clusters,this section>>.

For a summary of Kubernetes features supported in Windows, see the Kubernetes documentation on https://kubernetes.io/docs/setup/production-environment/windows/intro-windows-in-kubernetes/#supported-functionality-and-limitations[supported functionality and limitations for using Kubernetes with Windows] or the https://kubernetes.io/docs/setup/production-environment/windows/user-guide-windows-containers/[guide for scheduling Windows containers in Kubernetes].

== {rke2-product-name} Features for Windows Clusters

Listed below are the primary RKE2 features for Windows cluster provisioning:

* Windows containers with RKE2, powered by containerd
* Provisioning of Windows RKE2 custom clusters directly from the Rancher UI
* Calico CNI for Windows RKE2 custom clusters
* SAC releases of Windows Server (2004 and 20H2) included in the technical preview

Windows Support for RKE2 Custom Clusters requires choosing Calico as the CNI.

[NOTE]
====

Rancher will allow Windows workload pods to deploy on both Windows and Linux worker nodes by default. When creating mixed clusters in RKE2, you must edit the `nodeSelector` in the chart to direct the pods to be placed onto a compatible Windows node. Refer to the https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector[Kubernetes documentation] for more information on how to use `nodeSelector` to assign pods to nodes.
====
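As a minimal sketch of the `nodeSelector` usage described in the note above (the deployment name and image are hypothetical; `kubernetes.io/os` is the standard well-known node label set by the kubelet), a Windows workload pinned to Windows nodes might look like:

[source,yaml]
----
apiVersion: apps/v1
kind: Deployment
metadata:
  name: win-webserver          # hypothetical workload name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: win-webserver
  template:
    metadata:
      labels:
        app: win-webserver
    spec:
      nodeSelector:
        kubernetes.io/os: windows   # direct pods to Windows worker nodes only
      containers:
      - name: iis
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2022
----

Without the `nodeSelector`, the scheduler may place the pod on a Linux worker, where a Windows image cannot run.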


* HostProcess containers in Windows RKE2 are supported in Kubernetes v1.24.1 and up. See https://kubernetes.io/docs/tasks/configure-pod-container/create-hostprocess-pod/[the upstream documentation] for more information.
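A hedged sketch of a HostProcess pod, following the upstream API (pod-level `securityContext.windowsOptions.hostProcess` plus `hostNetwork: true`, which HostProcess pods require; the pod name, image, and command are illustrative only):

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: hostprocess-example     # hypothetical name
spec:
  securityContext:
    windowsOptions:
      hostProcess: true                       # run directly on the Windows host
      runAsUserName: "NT AUTHORITY\\SYSTEM"   # host identity for the container
  hostNetwork: true                           # required for HostProcess pods
  nodeSelector:
    kubernetes.io/os: windows
  containers:
  - name: shell
    image: mcr.microsoft.com/windows/nanoserver:ltsc2022
    command: ["powershell.exe", "-Command", "Get-Service"]
----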

== General Requirements

The general networking and operating system requirements for Windows nodes are the same as for other xref:installation-and-upgrade/requirements/requirements.adoc[Rancher installations].

=== OS Requirements

Our support for Windows Server and Windows containers matches the official Microsoft lifecycle for LTSC (Long-Term Servicing Channel) and SAC (Semi-Annual Channel).

For the support lifecycle dates for Windows Server, see the https://docs.microsoft.com/en-us/windows-server/get-started/windows-server-release-info[Microsoft documentation].

=== Kubernetes Version

For more information regarding Kubernetes component versions, see the https://www.suse.com/suse-rke2/support-matrix/all-supported-versions/[support matrices for RKE2 versions].

=== Node Requirements

Rancher will not provision the node if the node does not meet these requirements.

Before provisioning a new cluster, be sure that you have already installed Rancher on a device that accepts inbound network traffic. This is required in order for the cluster nodes to communicate with Rancher. If you have not already installed Rancher, please refer to the xref:installation-and-upgrade/installation-and-upgrade.adoc[installation documentation] before proceeding with this guide.

Rancher supports Windows using Calico as the network provider.

If you are configuring DHCP options sets for an AWS virtual private cloud, note that in the `domain-name` option field, only one domain name can be specified. According to the https://docs.aws.amazon.com/vpc/latest/userguide/VPC_DHCP_Options.html[DHCP options documentation]:

[NOTE]
====

Some Linux operating systems accept multiple domain names separated by spaces. However, other Linux operating systems and Windows treat the value as a single domain, which results in unexpected behavior. If your DHCP options set is associated with a VPC that has instances with multiple operating systems, specify only one domain name.
====


=== Rancher on VMware vSphere with ESXi 6.7u2 and above

If you are using Rancher on VMware vSphere with ESXi 6.7u2 or later with Red Hat Enterprise Linux 8.3, CentOS 8.3, or SUSE Enterprise Linux 15 SP2 or later, you must disable the `vmxnet3` virtual network adapter hardware offloading feature. Failure to do so causes all network connections between pods on different cluster nodes to fail with timeout errors. All connections from Windows pods to critical services running on Linux nodes, such as CoreDNS, fail as well, and external connections may also fail.

This issue is the result of Linux distributions enabling the hardware offloading feature in `vmxnet3`, combined with a bug in that feature which discards packets for guest overlay traffic. The setting does not survive a reboot, so it must be disabled on every boot. The recommended course of action is to create a systemd unit file at `/etc/systemd/system/disable_hw_offloading.service`, which disables the `vmxnet3` hardware offloading feature on boot. A sample systemd unit file which disables the `vmxnet3` hardware offloading feature is as follows. Note that `<VM network interface>` must be customized to the host `vmxnet3` network interface, e.g., `ens192`:
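The sample unit file itself is collapsed out of this diff view. As an illustrative sketch only (the actual file in the docs may differ; the `ethtool` flags and the `<VM network interface>` placeholder here are assumptions based on the behavior described above), such a unit could look like:

[source,ini]
----
[Unit]
Description=Disable vmxnet3 hardware offloading
After=network.target

[Service]
Type=oneshot
# Replace <VM network interface> with the host vmxnet3 interface, e.g. ens192
ExecStart=/usr/sbin/ethtool -K <VM network interface> tx-udp_tnl-segmentation off tx-udp_tnl-csum-segmentation off

[Install]
WantedBy=multi-user.target
----

Enabling the unit (`systemctl enable disable_hw_offloading.service`) reapplies the setting on every boot, which the text above notes is required.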
Windows requires that containers must be built on the same Windows Server version that they are being deployed on.

=== Cloud Provider Specific Requirements

If you set a Kubernetes cloud provider in your cluster, some additional steps are required. You may wish to set up a cloud provider to automatically provision storage, load balancers, or other infrastructure for your cluster. Refer to xref:cluster-deployment/set-up-cloud-providers/set-up-cloud-providers.adoc[this page] for details on how to configure a cloud provider on clusters with nodes that meet the prerequisites.

If you are using the GCE (Google Compute Engine) cloud provider, you must do the following:


This tutorial describes how to create a Rancher-provisioned cluster with the three nodes in the <<_recommended_architecture,recommended architecture>>.

When you provision a cluster with Rancher on existing nodes, you add nodes to the cluster by installing the xref:cluster-deployment/custom-clusters/rancher-agent-options.adoc[Rancher agent] on each one. To create or edit your cluster from the Rancher UI, run the *Registration Command* on each server to add it to your cluster.

To set up a cluster with support for Windows nodes and containers, you will need to complete the tasks below.

The instructions for creating a Windows cluster on existing nodes are very similar to the instructions for creating a custom cluster.
. On the *Clusters* page, click *Create*.
. Click *Custom*.
. Enter a name for your cluster in the *Cluster Name* field.
. In the *Kubernetes Version* dropdown menu, select a supported Kubernetes version.
. In the *Container Network* field, select *Calico*.
. Click *Next*.

=== 3. Add Nodes to the Cluster

This section describes how to register your Linux and Windows nodes to your cluster. You will run a command on each node, which will install the Rancher agent and allow Rancher to manage each node.

==== Add a Linux Master Node

In this section, we fill out a form in the Rancher UI to get a custom command to install the Rancher agent on the Linux master node. Then we copy the command and run it on the Linux master node to register the node in the cluster.

The first node in your cluster should be a Linux host that has both the *Control Plane* and *etcd* roles. At a minimum, both of these roles must be enabled for this node, and this node must be added to your cluster before you can add Windows hosts.

. After cluster creation, navigate to the *Registration* tab.
. In *Step 1* under the *Node Role* section, select at least *etcd* and *Control Plane*. We recommend selecting all three roles.
. Optional: If you click *Show advanced options,* you can customize the settings for the xref:cluster-deployment/custom-clusters/rancher-agent-options.adoc[Rancher agent] and https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/[node labels.]
. In *Step 2*, under the *Registration* section, copy the command displayed on the screen to your clipboard.
. SSH into your Linux host and run the command that you copied to your clipboard.

*Result:*

It may take a few minutes for the node to be registered in your cluster.

==== Add a Linux Worker Node

In this section, we run a command to register the Linux worker node to the cluster.

After the initial provisioning of your cluster, the cluster has only a single Linux host. Add another Linux `worker` host to support the _Rancher cluster agent_, _Metrics server_, _DNS_, and _Ingress_ for your cluster.

. After cluster creation, navigate to the *Registration* tab.
. In *Step 1* under the *Node Role* section, select *Worker*.
. Optional: If you click *Show advanced options,* you can customize the settings for the xref:../../about-rancher-agents.adoc[Rancher agent] and https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/[node labels.]
. In *Step 2*, under the *Registration* section, copy the command displayed on the screen to your clipboard.
. SSH into your Linux host and run the command that you copied to your clipboard.

*Result:* The *Worker* role is installed on your Linux host, and the node registers with Rancher. It may take a few minutes for the node to be registered in your cluster.

[NOTE]
.Taints on Linux Worker Nodes
====
For each Linux worker node added to the cluster, the following taint is added. This taint causes workloads added to the Windows cluster to be scheduled onto Windows worker nodes by default. If you want to schedule workloads specifically onto a Linux worker node, you will need to add tolerations for this taint to those workloads.

|===
| Taint Key | Taint Value | Effect

| `cattle.io/os` | `linux` | `NoSchedule`
|===
====
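A workload intended for the Linux worker nodes would carry a toleration matching that taint. A minimal pod-spec fragment (illustrative; the field values follow the taint shown in the note, and `kubernetes.io/os` is the standard well-known node label):

[source,yaml]
----
# Fragment of a pod template spec: tolerate the Linux-node taint and
# select Linux nodes explicitly via the well-known OS label.
tolerations:
- key: "cattle.io/os"
  operator: "Equal"
  value: "linux"
  effect: "NoSchedule"
nodeSelector:
  kubernetes.io/os: linux
----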


==== Add a Windows Worker Node

In this section, we run a command to register the Windows worker node to the cluster.

[NOTE]
====
The registration command to add the Windows workers only appears after the cluster is running with Linux etcd, control plane, and worker nodes.
====

. After cluster creation, navigate to the *Registration* tab.
. In *Step 1* under the *Node Role* section, select *Worker*.
. Optional: If you click *Show advanced options,* you can customize the settings for the xref:../../about-rancher-agents.adoc[Rancher agent] and https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/[node labels.]
. In *Step 2*, under the *Registration* section, copy the command for Windows workers displayed on the screen to your clipboard.
. Log in to your Windows host using your preferred tool, such as https://docs.microsoft.com/en-us/windows-server/remote/remote-desktop-services/clients/remote-desktop-clients[Microsoft Remote Desktop], and run the command copied to your clipboard in the *Command Prompt (CMD)*.
. Optional: Repeat these instructions if you want to add more Windows nodes to your cluster.

*Result:* The *Worker* role is installed on your Windows host, and the node registers with Rancher. It may take a few minutes for the node to be registered in your cluster. You now have a Windows Kubernetes cluster.
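To sanity-check the finished cluster, you can list the registered nodes and their operating systems. This is a CLI fragment, assuming `kubectl` is configured against the new cluster (for example, via the kubeconfig downloaded from the Rancher UI):

[source,shell]
----
# Show each node with its OS label; expect the Linux etcd/control plane/worker
# nodes plus the Windows worker(s)
kubectl get nodes -L kubernetes.io/os

# Wide output also shows internal IPs, OS image, and container runtime
kubectl get nodes -o wide
----

All nodes should report `Ready` before you deploy Windows workloads.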