- # Distributed Testing
+ # Testing

- Install Ray on at least two nodes.
+ * [Single Node Testing](#Single-Node-Testing)
+ * [Distributed Testing](#Distributed-Testing)
+ * [Ray Installation Docs](https://docs.ray.io/en/latest/ray-overview/installation.html)
+
+ ## Single Node Testing

- https://docs.ray.io/en/latest/ray-overview/installation.html
+ Install Ray on one (head) node.

```shell
sudo apt install -y python3-pip python3.12-venv
@@ -11,19 +15,23 @@ source venv/bin/activate
pip3 install -U "ray[default]"
```

- ## Start Ray Head Node
+ ### Start Ray Head Node

```shell
- ray start --head --node-ip-address=10.0.0.23 --port=6379 --dashboard-host=0.0.0.0
+ ray start --head --dashboard-host 0.0.0.0 --include-dashboard true
```

- ## Start Ray Worker Nodes(s)
+ ### Start Ray Worker Node(s) (Optional)
+
+ This step is optional. If you add Ray worker nodes, the setup becomes distributed.
+
+ Also note that [Ray doesn't support multi-node clusters on macOS](https://docs.ray.io/en/latest/cluster/getting-started.html#where-can-i-deploy-ray-clusters).

```shell
- ray start --address=10.0.0.23:6379 --redis-password='5241590000000000'
+ ray start --address=127.0.0.1:6379
```

- ## Install DataFusion Ray (on each node)
+ ### Install DataFusion Ray (on head node)

Clone the repo with the version that you want to test. Run `maturin develop --release` in the virtual env.

@@ -42,9 +50,80 @@ cd datafusion-ray
maturin develop --release
```

- ## Submit Job
+ ### Submit Job
+
+ 1. If you started the cluster manually, connect to the existing cluster instead of reinitializing it.
+ ```python
+ # Start a local cluster
+ # ray.init(resources={"worker": 1})
+
+ # Connect to a cluster
+ ray.init()
+ ```
+
+ 2. Submit the job to the Ray cluster
+ ```shell
+ RAY_ADDRESS='http://127.0.0.1:8265' ray job submit --working-dir examples -- python3 tips.py
+ ```
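
Step 1 comes down to choosing between two `ray.init` calls. The following is a minimal sketch of making that choice at runtime rather than by commenting lines in and out. It assumes `RAY_ADDRESS` is set in the script's environment whenever an existing cluster should be used (`ray.init()` reads that variable when no address is passed). The helper names `init_kwargs` and `connect` are hypothetical and are not part of DataFusion Ray.

```python
import os


def init_kwargs(env=None):
    """Return keyword arguments for ray.init().

    Assumption: RAY_ADDRESS is set when the script should attach to an
    existing cluster; otherwise a local cluster is started.
    """
    env = os.environ if env is None else env
    if env.get("RAY_ADDRESS"):
        # Connect to the existing cluster; ray.init() picks up RAY_ADDRESS.
        return {}
    # Start a local cluster with the custom "worker" resource.
    return {"resources": {"worker": 1}}


def connect():
    """Call ray.init() with the arguments chosen above (hypothetical helper)."""
    # Imported lazily so init_kwargs() can be used without Ray installed.
    import ray

    return ray.init(**init_kwargs())
```

With this in place, a script would call `connect()` once at startup instead of editing the `ray.init` line by hand.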
+
+ ## Distributed Testing
+
+ Install Ray on at least two nodes.
+
+ ```shell
+ sudo apt install -y python3-pip python3.12-venv
+ python3 -m venv venv
+ source venv/bin/activate
+ pip3 install -U "ray[default]"
+ ```
+
+ ### Start Ray Head Node
+
+ ```shell
+ ray start --head --dashboard-host 0.0.0.0 --include-dashboard true
+ ```
+
+ ### Start Ray Worker Node(s)
+
+ Replace `NODE_IP_ADDRESS` with the head node address that is reachable from the worker nodes; it is displayed in the output of the previous step.
+
+ ```shell
+ ray start --address={NODE_IP_ADDRESS}:6379
+ ```
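
If a worker fails to join, one common cause is that the head node's port is not reachable from the worker. The following is a minimal sketch of a reachability check to run on the worker before `ray start`, assuming the head listens on port 6379 as in the command above. The function name `can_reach` is hypothetical; it only tests that a TCP connection can be opened and says nothing about whether Ray itself is healthy.

```python
import socket


def can_reach(host, port=6379, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

For example, `can_reach("10.0.0.23")` would check the head address used in the earlier version of this guide; substitute your own `NODE_IP_ADDRESS`.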
+
+ ### Install DataFusion Ray (on each node)
+
+ Clone the repo with the version that you want to test. Run `maturin develop --release` in the virtual env.
+
+ ```shell
+ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+ . "$HOME/.cargo/env"
+ ```
+
+ ```shell
+ pip3 install maturin
+ ```
+
+ ```shell
+ git clone https://github.com/apache/datafusion-ray.git
+ cd datafusion-ray
+ maturin develop --release
+ ```
+
+ ### Submit Job
+
+ 1. If you started the cluster manually, connect to the existing cluster instead of reinitializing it.
+
+ ```python
+ # Start a local cluster
+ # ray.init(resources={"worker": 1})
+
+ # Connect to a cluster
+ ray.init()
+ ```
+
+ 2. Submit the job to the Ray cluster

```shell
- cd examples
- RAY_ADDRESS='http://10.0.0.23:8265' ray job submit --working-dir `pwd` -- python3 tips.py
+ RAY_ADDRESS='http://{NODE_IP_ADDRESS}:8265' ray job submit --working-dir examples -- python3 tips.py
```
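
The submit step differs between the single-node and distributed setups only in the address. The following is a minimal sketch of assembling that command from Python, assuming the dashboard listens on port 8265 and the job lives in `examples/tips.py` as shown above. The helper names `dashboard_url`, `submit_command`, and `submit_env` are hypothetical and exist only for illustration.

```python
import os


def dashboard_url(node_ip, port=8265):
    """Build the RAY_ADDRESS value that `ray job submit` should use."""
    return f"http://{node_ip}:{port}"


def submit_command(script="tips.py", working_dir="examples"):
    """Return the `ray job submit` argument list used in this guide."""
    return ["ray", "job", "submit", "--working-dir", working_dir,
            "--", "python3", script]


def submit_env(node_ip, base_env=None):
    """Return a copy of the environment with RAY_ADDRESS pointing at node_ip."""
    env = dict(os.environ if base_env is None else base_env)
    env["RAY_ADDRESS"] = dashboard_url(node_ip)
    return env
```

A caller could then run `subprocess.run(submit_command(), env=submit_env("127.0.0.1"), check=True)` for the single-node case, or pass the head node's address for the distributed case.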