Commit ebd003a

fabtests/README.md: enhance the README
Add more build instructions and high-level usage examples that cover different execution paths.

Signed-off-by: Shi Jin <sjina@amazon.com>
1 parent 7e2ad37 commit ebd003a

1 file changed: +151 -0 lines changed

fabtests/README.md

Lines changed: 151 additions & 0 deletions
@@ -65,6 +65,36 @@ Directory where valgrind is installed. If valgrind is found, then
valgrind annotations are enabled. This may incur a performance
penalty.

```
--with-cuda[=DIR]
```

Provide the path where the CUDA development and runtime libraries are installed. This enables CUDA memory support for heterogeneous memory (HMEM) testing.

```
--with-rocr[=DIR]
```

Provide the path where the ROCr development and runtime libraries are installed. This enables ROCr memory support for heterogeneous memory (HMEM) testing.

```
--with-neuron[=DIR]
```

Provide the path where the Neuron development and runtime libraries are installed. This enables Neuron memory support for heterogeneous memory (HMEM) testing.

```
--with-synapseai[=DIR]
```

Enable SynapseAI support; configure fails if SynapseAI is not found. `DIR` is optional and gives the path where the SynapseAI libraries and headers are installed. This enables SynapseAI memory support for heterogeneous memory (HMEM) testing.

```
--with-ze[=DIR]
```

Enable Level-Zero (Ze) support; configure fails if Level-Zero is not found. `DIR` is optional and gives the path where the Level-Zero libraries and headers are installed. This enables Intel GPU memory support for heterogeneous memory (HMEM) testing.
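
The options above can be combined on one configure line. The following is a minimal sketch, not part of the README: the helper name `fabtests_configure_cmd` and the CUDA path are assumptions for illustration, and the function only prints the command instead of running it.

```bash
# Print a configure command line for a given prefix and HMEM backends.
# Each extra argument is a NAME=DIR pair, e.g. cuda=/usr/local/cuda,
# which becomes --with-NAME=DIR. (Hypothetical helper, for illustration.)
fabtests_configure_cmd() {
    cmd="./configure --prefix=$1"
    shift
    for spec in "$@"; do
        cmd="$cmd --with-${spec%%=*}=${spec#*=}"
    done
    printf '%s\n' "$cmd"
}

# Assumed install locations; adjust to your system.
fabtests_configure_cmd /opt/fabtests cuda=/usr/local/cuda
```

Printing the command first makes it easy to review which HMEM backends are being requested before starting a build.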

### Examples

Consider the following example:
@@ -87,3 +117,124 @@ Tells the Fabtests that it should be able to find the Libfabric header
files and libraries in default compiler / linker search paths
(configure will abort if it is not able to find them), and to install
Fabtests in `/opt/fabtests`.

## Installation Location

After running `make install`, Fabtests is installed to:
- `<prefix>/bin/` - All test executables and scripts (e.g., `fi_rdm_pingpong`, `fi_msg_bw`, `runfabtests.sh`, `runfabtests.py`)
- `<prefix>/share/fabtests/` - Test configuration files and utilities

For detailed documentation of the individual test binaries and their options, see [fabtests/man/fabtests.7.md](man/fabtests.7.md).

## Running Fabtests

Fabtests can be run in three different ways. **Note**: Ensure `<prefix>/bin` is in your `$PATH`, or use absolute paths to the binaries and scripts.
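
One way to satisfy the `$PATH` note is sketched below. It assumes the `/opt/fabtests` prefix from the configure example; `FABTESTS_PREFIX` is a variable introduced here for illustration, not something Fabtests itself reads.

```bash
# Assumption: fabtests was configured with --prefix=/opt/fabtests.
FABTESTS_PREFIX="${FABTESTS_PREFIX:-/opt/fabtests}"

# Prepend <prefix>/bin only if it is not already on PATH.
case ":${PATH}:" in
    *":${FABTESTS_PREFIX}/bin:"*) ;;
    *) PATH="${FABTESTS_PREFIX}/bin:${PATH}" ;;
esac
export PATH

# Report whether a test binary now resolves.
if command -v fi_rdm_pingpong >/dev/null 2>&1; then
    echo "fi_rdm_pingpong found at $(command -v fi_rdm_pingpong)"
else
    echo "fi_rdm_pingpong not found; check the install prefix"
fi
```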

### 1. Direct Binary Execution

Run individual test binaries directly. Each binary supports `-h` for detailed options:
```bash
# Server side (starts server and waits for client)
$ fi_rdm_pingpong -p <provider_name>

# Client side (from another terminal/node)
$ fi_rdm_pingpong -p <provider_name> <server_ip>

# With specific options (-S: transfer size in bytes, -I: number of iterations);
# append <server_ip> when running the client side
$ fi_rdm_pingpong -p <provider_name> -S 1024 -I 1000

# For providers requiring out-of-band address exchange (e.g., efa)
$ fi_rdm_pingpong -p <provider_name> -E               # server
$ fi_rdm_pingpong -p <provider_name> -E <server_ip>   # client
```

Example provider names (available in libfabric/prov/): `tcp`, `shm`, `efa`, `verbs`, `psm3`, `opx`, `cxi`, `ucx`, etc.

Common test binaries include:
- `fi_msg_pingpong` - MSG endpoint ping-pong latency test
- `fi_rdm_pingpong` - RDM endpoint ping-pong test
- `fi_msg_bw` - MSG endpoint bandwidth measurement test
- `fi_rdm_tagged_pingpong` - RDM endpoint tagged message ping-pong test
- `fi_rma_pingpong` - RMA ping-pong test
- `fi_rma_bw` - RMA bandwidth test
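
The server and client steps can also be run on a single host over loopback. This is a minimal sketch under stated assumptions: the wrapper name `run_loopback_pingpong`, the one-second startup delay, and the use of `127.0.0.1` as the server address are all choices made here, not behavior documented by Fabtests.

```bash
# Run an RDM ping-pong over loopback: server in the background,
# client in the foreground. (Hypothetical wrapper, for illustration.)
run_loopback_pingpong() {
    provider="${1:-tcp}"
    if ! command -v fi_rdm_pingpong >/dev/null 2>&1; then
        echo "fi_rdm_pingpong not found in PATH; skipping"
        return 0
    fi
    fi_rdm_pingpong -p "$provider" &       # server: waits for a client
    server_pid=$!
    sleep 1                                # assumption: 1 s is enough for the server to listen
    fi_rdm_pingpong -p "$provider" 127.0.0.1
    status=$?
    wait "$server_pid" || status=$?        # also surface a server-side failure
    return "$status"
}

run_loopback_pingpong tcp
```

A fixed sleep is the simplest way to order the two processes; on a slow or loaded machine a longer delay may be needed.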

### 2. Bash Script (runfabtests.sh)

Use the comprehensive test script for automated testing:
```bash
# Run quick test suite with sockets provider in loopback
$ ./runfabtests.sh

# Run with specific provider
$ ./runfabtests.sh tcp

# Run tests between two nodes
$ ./runfabtests.sh tcp <server_ip> <client_ip>

# For providers requiring out-of-band address exchange (e.g., efa)
$ ./runfabtests.sh -b efa <server_ip> <client_ip>

# Run specific test sets
$ ./runfabtests.sh -t standard tcp
$ ./runfabtests.sh -t "quick,functional" tcp

# Exclude specific tests
$ ./runfabtests.sh -e "dgram,rma.*write" tcp

# Verbose output for debugging
$ ./runfabtests.sh -vvv tcp
```

Available test sets: `all`, `quick`, `unit`, `functional`, `standard`, `short`, `complex`, `threaded`
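
Before a long run, it can help to check a `-t` argument against the set names listed above. The sketch below is illustrative only: `valid_test_sets` is a hypothetical helper, and whether `runfabtests.sh` performs similar validation itself is not stated here.

```bash
# Return 0 if every name in a comma-separated list is one of the
# test sets listed above; otherwise report the first unknown name.
valid_test_sets() {
    old_ifs=$IFS
    IFS=,
    for name in $1; do
        case "$name" in
            all|quick|unit|functional|standard|short|complex|threaded) ;;
            *)
                IFS=$old_ifs
                echo "unknown test set: $name" >&2
                return 1
                ;;
        esac
    done
    IFS=$old_ifs
    return 0
}

# Usage: only launch the script when the list is valid, e.g.
#   valid_test_sets "quick,functional" && ./runfabtests.sh -t "quick,functional" tcp
valid_test_sets "quick,functional" && echo "test sets OK"
```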

### 3. Python Script (runfabtests.py)

Use the Python-based test runner, built on the pytest framework, for advanced testing configurations. Run it with `-h` for detailed help.

**Prerequisites**: Install Python dependencies first:
```bash
$ pip install -r fabtests/pytest/requirements.txt
```

**Usage**:
```bash
# Run quick test suite
$ python runfabtests.py <provider_name> <server_ip> <client_ip>

# For providers requiring out-of-band address exchange (e.g., efa)
$ python runfabtests.py -b <provider_name> <server_ip> <client_ip>

# Run specific test sets
$ python runfabtests.py -t standard <provider_name> <server_ip> <client_ip>
$ python runfabtests.py -t "quick,functional" <provider_name> <server_ip> <client_ip>

# Test with CUDA memory (HMEM)
$ python runfabtests.py -t cuda_memory <provider_name> <server_ip> <client_ip>

# Test with Neuron memory
$ python runfabtests.py -t neuron_memory <provider_name> <server_ip> <client_ip>

# Generate HTML and JUnit XML reports
$ python runfabtests.py --html=report.html --junit-xml=results.xml <provider_name> <server_ip> <client_ip>

# Run with multiple parallel workers
$ python runfabtests.py --nworkers=4 <provider_name> <server_ip> <client_ip>

# Filter tests by expression
$ python runfabtests.py --expression="pingpong" <provider_name> <server_ip> <client_ip>
```

The Python script is well tested with the `tcp`, `shm`, and `efa` providers. It includes:
- **Common test items**: Defined in the `fabtests/pytest/` directory and applied to any provider
- **Provider-specific tests**: Located in `fabtests/pytest/<provider_name>/` directories
  - Currently implemented for the `shm` and `efa` providers
  - We welcome more providers to join this framework!

Because it is built on pytest, the Python script supports:
- Heterogeneous memory (HMEM) types such as CUDA and Neuron
- Parallel test execution with a configurable worker count
- HTML and JUnit XML report generation
- Test filtering and exclusion
- Logging and verbosity controls
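
Several of these features can plausibly be requested in one invocation. The sketch below only prints such a command rather than running it. It assumes that `--nworkers`, `--html`, and `--junit-xml` may be combined (the usage examples show them separately), and `fabtests_report_cmd`, the report directory, and the example IP addresses are hypothetical.

```bash
# Print (not run) a runfabtests.py command that writes both report
# formats into one directory.
# Arguments: provider, server IP, client IP, report directory, worker count.
fabtests_report_cmd() {
    printf 'python runfabtests.py --nworkers=%s --html=%s/report.html --junit-xml=%s/results.xml %s %s %s\n' \
        "$5" "$4" "$4" "$1" "$2" "$3"
}

# Example addresses are from the documentation range 192.0.2.0/24.
fabtests_report_cmd tcp 192.0.2.10 192.0.2.11 ./reports 4
```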

Both scripts run broad test suites across multiple providers and configurations, while direct binary execution is better suited to focused testing of a specific scenario.
