| 1 | +# VirtTest Cluster (`vt_cluster`) |
| 2 | + |
| 3 | +The `virttest.vt_cluster` module provides a comprehensive framework for managing distributed |
| 4 | +virtualization testing environments. It orchestrates tests across multiple remote machines |
| 5 | +from a central controller, enabling scalable test execution in multi-host scenarios. |
| 6 | + |
| 7 | +## Core Concepts |
| 8 | + |
| 9 | +The framework is built around several key concepts: |
| 10 | + |
| 11 | +* **Cluster:** The central management entity that maintains the state of the |
| 12 | + entire distributed environment. It is a singleton object that tracks all nodes |
| 13 | + and their configurations, manages partitions, and persists its state automatically. |
| 14 | +* **Node:** Represents a single machine within the cluster. Each node can be a |
| 15 | + remote machine that runs agent processes. Nodes handle SSH connections, file |
| 16 | + transfers, agent deployment, and environment setup/cleanup. |
| 17 | +* **Partition:** A logical group of nodes allocated for a specific job or test run. |
| 18 | + This enables resource isolation and execution of multiple tests on different |
| 19 | + node sets. |
| 20 | +* **Agent:** A daemon process running on remote nodes that exposes an XML-RPC API |
| 21 | + for the controller to execute commands, manage services, and coordinate test execution. |
| 22 | +* **Proxy:** The communication layer between the controller and the agents. It |
| 23 | + wraps XML-RPC calls so that methods on remote objects can be invoked as if |
| 24 | + they were local. |
| 25 | + |
| 26 | +## Architecture |
| 27 | + |
| 28 | +The `vt_cluster` module follows a controller-agent architecture: |
| 29 | + |
| 30 | +* **Controller:** The main process that orchestrates the tests. It holds the `Cluster` |
| 31 | + object, which knows about all registered nodes. |
| 32 | +* **Agents:** Remote nodes that execute the actual test commands. The `Node` class |
| 33 | + on the controller is responsible for setting up and managing the agent on the |
| 34 | + corresponding remote machine. |
| 35 | + |
| 36 | +### Communication |
| 37 | + |
| 38 | +* **RPC:** Commands are sent from the controller to agents using XML-RPC. The `proxy.py` |
| 39 | + module implements a client proxy. |
| 40 | +* **Session & File Management:** SSH and SCP are used for initial agent setup, |
| 41 | + file transfers (including copying necessary libraries and collecting logs), |
| 42 | + and managing the agent daemon's lifecycle. |
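
The dotted, transparent calls described above rely on ordinary XML-RPC behaviour. The following is a minimal sketch using only Python's standard library, not the actual `proxy.py` or agent code. The loopback server stands in for an agent, and the helper `demo_ping` and the registered name `examples.hello.ping` are assumptions chosen for illustration.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def demo_ping():
    """Start a throwaway XML-RPC server and call it through a client proxy."""
    # Stand-in for the agent: bind to an ephemeral port on the loopback interface.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    # Register a function under a dotted name, as an agent service might be exposed.
    server.register_function(lambda: "pong", "examples.hello.ping")
    port = server.server_address[1]

    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Stand-in for the controller-side proxy: attribute access builds the
        # dotted method name, and the call is sent over HTTP as XML-RPC.
        with xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}") as proxy:
            return proxy.examples.hello.ping()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo_ping())
```

In this sketch the caller never builds a request by hand: the chained attribute access on the proxy is what becomes the remote method name.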
| 43 | + |
| 44 | +### State Persistence |
| 45 | + |
| 46 | +The state of the cluster (including the list of nodes, their configuration, and |
| 47 | +active partitions) is persisted to a `cluster_env` file in the backend data directory. |
| 48 | +This state is serialized using `pickle`, allowing it to be restored across different processes. |
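
As a rough illustration of this persistence pattern (not the actual `_Cluster` code), the sketch below round-trips a state dictionary through `pickle`. Only the `cluster_env` file name and the use of `pickle` come from the description above. The helper names `save_state` and `load_state`, the dictionary layout, and the write-then-rename step are assumptions of this sketch.

```python
import os
import pickle
import tempfile


def save_state(path, state):
    """Write the state to a temporary file, then atomically move it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)


def load_state(path):
    """Return the saved state, or an empty state if nothing was saved yet."""
    if not os.path.exists(path):
        return {"nodes": {}, "partitions": []}
    with open(path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # A temporary directory stands in for the backend data directory.
    with tempfile.TemporaryDirectory() as data_dir:
        env_file = os.path.join(data_dir, "cluster_env")
        state = load_state(env_file)
        state["nodes"]["node1"] = {"address": "192.168.122.101"}
        save_state(env_file, state)
        print(load_state(env_file))
```

Because `pickle` can execute arbitrary code when loading, a file like this should only be read from a trusted location.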
| 49 | + |
| 50 | +## How It Works |
| 51 | + |
| 52 | +1. **Initialization:** The `_Cluster` object is initialized, loading any previously |
| 53 | + saved state from the `cluster_env` file. |
| 54 | +2. **Node Registration:** Test configurations define the available nodes, |
| 55 | + which are then registered with the cluster using `cluster.register_node()`. |
| 56 | +3. **Agent Setup:** For each remote node, the controller: |
| 57 | + a. Connects via SSH. |
| 58 | + b. Copies the required Python libraries to a directory on the agent. |
| 59 | + c. Starts the agent server daemon. |
| 60 | +4. **Running a Test:** |
| 61 | + a. A test requests a `partition` of one or more nodes from the cluster. |
| 62 | + b. The test interacts with the nodes in its partition through the `Node` |
| 63 | + object and its `proxy` attribute. |
| 64 | + c. All method calls on the proxy are transparently sent to the remote agent |
| 65 | + for execution (e.g., `node.proxy.foo.boo()`). |
| 66 | +5. **File Operations:** The controller can transfer files to/from remote nodes using |
| 67 | + SCP operations. This includes copying test data, collecting logs, and transferring |
| 68 | + results between nodes and the controller. |
| 69 | + |
| 70 | +## Module Structure |
| 71 | + |
| 72 | +* `__init__.py`: Core cluster management with `_Cluster` and `_Partition` classes. |
| 73 | + Provides cluster state persistence, node registration, and partition management. |
| 74 | + Exports the global `cluster` instance for application use. |
| 75 | +* `node.py`: Node management with the `Node` class and `NodeError` exception. |
| 76 | + Handles SSH connections, agent deployment, environment setup/cleanup, and |
| 77 | + file transfer operations. Includes comprehensive docstrings with parameter types. |
| 78 | +* `proxy.py`: XML-RPC communication layer with `_ClientProxy` and `ServerProxyError`. |
| 79 | + Implements transparent method calls for distributed operations. |
| 80 | + |
| 81 | +## Usage Example |
| 82 | + |
| 83 | +The following example demonstrates how to initialize the cluster, register nodes, |
| 84 | +create a partition, and interact with remote agents. |
| 85 | + |
| 86 | +```python |
| 87 | +""" |
| 88 | +Example of how to use the vt_cluster framework. |
| 89 | + |
| 90 | +This example demonstrates: |
| 91 | +1. Initializing the cluster. |
| 92 | +2. Defining and registering two remote nodes. |
| 93 | +3. Creating a partition and allocating nodes to it. |
| 94 | +4. Setting up the agent environment on each node. |
| 95 | +5. Starting the agent servers. |
| 96 | +6. Interacting with the agents via the proxy. |
| 97 | +7. Stopping the agents and cleaning up the environment. |
| 98 | +""" |
| 99 | +from virttest.vt_cluster import cluster |
| 100 | +from virttest.vt_cluster.node import Node |
| 101 | + |
| 102 | +# 1. Define node configurations |
| 103 | +# In a real scenario, this would come from a config file. |
| 104 | +node1_params = { |
| 105 | + "address": "192.168.122.101", |
| 106 | + "hostname": "localhost1", |
| 107 | + "username": "root", |
| 108 | + "password": "password", |
| 109 | + "proxy_port": "8000", |
| 110 | + "shell_port": "22", |
| 111 | +} |
| 112 | +node2_params = { |
| 113 | + "address": "192.168.122.102", |
| 114 | + "hostname": "localhost2", |
| 115 | + "username": "root", |
| 116 | + "password": "password", |
| 117 | + "proxy_port": "8000", |
| 118 | + "shell_port": "22", |
| 119 | +} |
| 120 | + |
| 121 | +# 2. Instantiate and register nodes |
| 122 | +node1 = Node(params=node1_params, name="node1") |
| 123 | +node2 = Node(params=node2_params, name="node2") |
| 124 | + |
| 125 | +cluster.register_node(name="node1", node=node1) |
| 126 | +cluster.register_node(name="node2", node=node2) |
| 127 | + |
| 128 | +# 3. Create a partition and add nodes to it |
| 129 | +partition = cluster.create_partition() |
| 130 | +partition.add_node(node1) |
| 131 | +partition.add_node(node2) |
| 132 | + |
| 133 | +# 4. Set up and manage nodes in the partition |
| 134 | +for node in partition.nodes: |
| 135 | + try: |
| 136 | + print(f"Setting up agent on {node.name}...") |
| 137 | + node.setup_agent_env() |
| 138 | + |
| 139 | + print(f"Starting agent server on {node.name}...") |
| 140 | + node.start_agent_server() |
| 141 | + |
| 142 | + # 5. Interact with the remote agent |
| 143 | + if node.proxy.core.is_alive(): |
| 144 | + print(f"Agent on {node.name} is alive.") |
| 145 | + # Example of a remote call |
| 146 | + greeting = node.proxy.examples.hello.ping() |
| 147 | + print(f"Service Response: {greeting}") |
| 148 | + else: |
| 149 | + print(f"Agent on {node.name} failed to start.") |
| 150 | + |
| 151 | + except Exception as e: |
| 152 | + print(f"An error occurred on {node.name}: {e}") |
| 153 | + |
| 154 | + finally: |
| 155 | + # 6. Clean up the node |
| 156 | + print(f"Stopping agent on {node.name}...") |
| 157 | + node.stop_agent_server() |
| 158 | + print(f"Cleaning up environment on {node.name}...") |
| 159 | + node.cleanup_agent_env() |
| 160 | + |
| 161 | +# 7. Remove the partition when done |
| 162 | +cluster.remove_partition(partition) |
| 163 | + |
| 164 | +# 8. Unregister the nodes when done |
| 165 | +cluster.unregister_node(name="node1") |
| 166 | +cluster.unregister_node(name="node2") |
| 167 | +``` |