Commit 328c99f

feat: add vllm rust frontend (#457)
1 parent c72d6c8 commit 328c99f

38 files changed

Lines changed: 2168 additions & 963 deletions

.github/workflows/ci.yml

Lines changed: 11 additions & 4 deletions
@@ -5,6 +5,7 @@ on:
 branches: [ main ]
 paths:
 - '.github/workflows/ci.yml'
+- 'install.sh'
 - 'src/**'
 - 'tests/**'
 - 'pyproject.toml'
@@ -72,14 +73,21 @@ jobs:
 fail_ci_if_error: false
 token: ${{ secrets.CODECOV_TOKEN }}

+- name: Install Parallax and vLLM Rust frontend
+if: matrix.os == 'macos-26' && matrix.python-version == '3.12'
+shell: bash
+run: |
+./install.sh --extras mac --python "${{ matrix.python-version }}"
+test -x .venv/bin/vllm-rs
+
 - name: Run E2E tests (macOS only)
 if: matrix.os == 'macos-26' && matrix.python-version == '3.12'
 shell: bash
 env:
 TERM: xterm-256color
 run: |
 # Start the server
-python src/parallax/launch.py \
+.venv/bin/python src/parallax/launch.py \
 --model-path Qwen/Qwen3-0.6B \
 --max-num-tokens-per-batch 16384 \
 --kv-block-size 32 \
@@ -123,6 +131,7 @@ jobs:
 'http://localhost:3000/v1/chat/completions' \
 --header 'Content-Type: application/json' \
 --data '{
+"model": "Qwen/Qwen3-0.6B",
 "messages": [
 {
 "role": "user",
@@ -132,9 +141,7 @@ jobs:
 "stream": false,
 "max_tokens": 1024,
 "chat_template_kwargs": {"enable_thinking": false},
-"sampling_params": {
-"top_k": 3
-}
+"top_k": 3
 }')

 echo "Response received:"
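
Note on the request change in the last two hunks: `top_k` moves out of the nested `sampling_params` object to the top level of the body, and a `model` field is added. A minimal sketch of that migration follows. The helper name `flatten_sampling_params` is hypothetical and not part of Parallax, and the message `content` is a placeholder because the diff elides it.

```python
import json


def flatten_sampling_params(payload: dict, model: str) -> dict:
    """Return a copy of `payload` in the shape the updated CI request uses.

    Hypothetical helper for illustration only; it is not part of Parallax.
    """
    # Copy every key except the nested block that the commit removes.
    migrated = {k: v for k, v in payload.items() if k != "sampling_params"}
    # Lift nested options (such as top_k) to the top level, keeping any
    # top-level value that is already present.
    for key, value in payload.get("sampling_params", {}).items():
        migrated.setdefault(key, value)
    # The updated request names the model explicitly.
    migrated.setdefault("model", model)
    return migrated


old_payload = {
    "messages": [{"role": "user", "content": "hello"}],  # placeholder content
    "stream": False,
    "max_tokens": 1024,
    "chat_template_kwargs": {"enable_thinking": False},
    "sampling_params": {"top_k": 3},
}
new_payload = flatten_sampling_params(old_payload, "Qwen/Qwen3-0.6B")
print(json.dumps(new_payload, indent=2))
```

The helper leaves the input dictionary untouched, so old and new request shapes can be compared side by side.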

AGENTS.md

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+mac env:
+python: .venv/bin/python
+
+gpu env:
+ssh H200 server, then use conda activate parallax
+
+
+start a local server:
+python src/parallax/launch.py --model-path <MODEL_NAME> --log-level DEBUG
+
+test the server:
+curl --location 'http://localhost:3000/v1/chat/completions' --header 'Content-Type: application/json' --data '{
+"max_tokens": 1024,
+"messages": [
+{
+"role": "user",
+"content": "hello"
+}
+],
+"stream": true
+}'
+
+final check:
+run when user ask to git commit
+pre-commit run --all-files
+pytest
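
Note on the "test the server" step in AGENTS.md: the same request can be issued from Python with only the standard library. This is a sketch rather than project code. The function names `build_chat_request` and `read_lines` are made up for illustration, and the shape of the streamed response is not shown in this commit, so the reader below only yields raw lines without parsing them.

```python
import json
import urllib.request

# Endpoint taken from the curl command in AGENTS.md.
URL = "http://localhost:3000/v1/chat/completions"


def build_chat_request(prompt: str, stream: bool = True) -> urllib.request.Request:
    """Build the POST request that the curl command sends (hypothetical helper)."""
    body = {
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def read_lines(request: urllib.request.Request, timeout: float = 60.0):
    """Yield raw response lines. Requires a server that is already running."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw_line in response:
            yield raw_line.decode("utf-8").rstrip("\n")


# Building the request does not touch the network.
request = build_chat_request("hello")
print(request.get_method(), request.full_url)
```

With a local server started as described above, `for line in read_lines(request): print(line)` prints whatever the server streams back.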

README.md

Lines changed: 15 additions & 0 deletions
@@ -54,6 +54,21 @@ The backend architecture:
 - [Getting Started](./docs/user_guide/quick_start.md)
 - [Working with OpenClaw 🦞](./docs/user_guide/work_with_openclaw.md)

+## Quick Install
+
+```sh
+git clone https://github.com/GradientHQ/parallax.git
+cd parallax
+./install.sh
+source .venv/bin/activate
+```
+
+The install script installs `uv` if needed, creates `.venv`, installs Parallax,
+and builds the `vllm-rs` frontend binary into `.venv/bin`. Use
+`./install.sh --extras gpu` on Linux/WSL GPU hosts or `./install.sh --extras mac`
+on Apple silicon macOS. For development dependencies, use `--extras gpu,dev` or
+`--extras mac,dev`.
+
 ## Contributing

 We warmly welcome contributions of all kinds! For guidelines on how to get involved, please refer to our [Contributing Guide](./docs/CONTRIBUTING.md).
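
Note on the Quick Install hunk: it states that the script builds `vllm-rs` into `.venv/bin`, and the CI step in this commit checks that with `test -x .venv/bin/vllm-rs`. A rough Python equivalent of that check is sketched below. The function name `has_frontend_binary` is an assumption for illustration, and the demo runs against a throwaway directory, not a real install.

```python
import os
import tempfile
from pathlib import Path


def has_frontend_binary(venv_dir: Path, name: str = "vllm-rs") -> bool:
    """Mirror `test -x <venv>/bin/<name>`: the file exists and is executable."""
    candidate = venv_dir / "bin" / name
    return candidate.is_file() and os.access(candidate, os.X_OK)


# Demonstrate against a temporary directory instead of a real .venv.
with tempfile.TemporaryDirectory() as tmp:
    venv = Path(tmp) / ".venv"
    (venv / "bin").mkdir(parents=True)
    missing = has_frontend_binary(venv)  # nothing installed yet

    binary = venv / "bin" / "vllm-rs"
    binary.write_text("#!/bin/sh\n")  # stand-in file, not the real frontend
    binary.chmod(0o755)
    present = has_frontend_binary(venv)

print(missing, present)
```

After a real install, `has_frontend_binary(Path(".venv"))` performs the same check from the repository root.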

docs/user_guide/install.md

Lines changed: 22 additions & 17 deletions
@@ -3,6 +3,7 @@

 ### Prerequisites
 - Python>=3.11.0,<3.14.0
+- Git and curl
 - Ubuntu-24.04 for Blackwell GPUs

 Below are installation methods for different operating systems.
@@ -14,36 +15,40 @@ Below are installation methods for different operating systems.
 |macOS | ❌️ | ✅️ | ❌️ |

 ### From Source
-#### For Linux/WSL (GPU):
-Note: If you are using DGX Spark, please refer to the Docker installation section
+
+The source install script installs `uv` if needed, creates `.venv`, installs
+Parallax, and builds the `vllm-rs` frontend binary into `.venv/bin`.
+
 ```sh
 git clone https://github.com/GradientHQ/parallax.git
 cd parallax
-pip install -e '.[gpu]'
+./install.sh
+source .venv/bin/activate
 ```

-#### For macOS (Apple silicon):
-
-We recommend macOS users to create an isolated Python virtual environment before installation.
+The script automatically installs `mac` extras on macOS and `gpu` extras on
+Linux. You can also choose explicitly:

 ```sh
-git clone https://github.com/GradientHQ/parallax.git
-cd parallax
+# Linux/WSL GPU
+./install.sh --extras gpu

-# Enter Python virtual environment
-python3 -m venv ./venv
-source ./venv/bin/activate
-
-pip install -e '.[mac]'
+# macOS Apple silicon
+./install.sh --extras mac
 ```

-Next time to re-activate this virtual environment, run ```source ./venv/bin/activate```.
-
-#### Extra step for development:
+For development dependencies:
 ```sh
-pip install -e '.[dev]'
+./install.sh --extras gpu,dev
+# or
+./install.sh --extras mac,dev
 ```

+To use a specific supported Python version, pass `--python`, for example
+`./install.sh --python 3.12`.
+
+Next time to re-activate this virtual environment, run ```source .venv/bin/activate```.
+
 ### Windows Application
 [Click here](https://github.com/GradientHQ/parallax_win_cli/releases/latest/download/Parallax_Win_Setup.exe) to get latest Windows installer.
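
Note on the install.md hunk: it documents that the script picks `mac` extras on macOS and `gpu` extras on Linux, and accepts `--extras` and `--python`. The sketch below models that documented behavior in Python. It is not the logic of `install.sh`, which this diff does not show; the function names are hypothetical, and the version bounds come from the Prerequisites list (`>=3.11.0,<3.14.0`). The `system` argument takes the strings returned by `platform.system()`.

```python
# Supported interpreter range from the Prerequisites list: >=3.11, <3.14.
MIN_PYTHON = (3, 11)
MAX_PYTHON_EXCLUSIVE = (3, 14)


def default_extras(system: str) -> str:
    """Map a platform.system() value to the extras the docs describe."""
    if system == "Darwin":
        return "mac"
    if system == "Linux":
        return "gpu"
    raise ValueError(f"no documented default extras for {system!r}")


def install_command(system: str, dev: bool = False, python_version=None) -> list:
    """Build the ./install.sh argument list (hypothetical helper)."""
    extras = default_extras(system)
    if dev:
        extras += ",dev"  # e.g. "gpu,dev" or "mac,dev"
    command = ["./install.sh", "--extras", extras]
    if python_version is not None:
        major, minor = (int(part) for part in python_version.split(".")[:2])
        if not (MIN_PYTHON <= (major, minor) < MAX_PYTHON_EXCLUSIVE):
            raise ValueError(f"Python {python_version} is outside >=3.11,<3.14")
        command += ["--python", python_version]
    return command


print(" ".join(install_command("Darwin", python_version="3.12")))
print(" ".join(install_command("Linux", dev=True)))
```

Returning a list keeps each flag and value as a separate argument, which is the form `subprocess.run` expects when no shell is involved.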
