Commit 6af1840

Merge pull request #10 from joonsoome/rerank-model-use-update
Documentation Update
2 parents 87eae49 + 7913818 commit 6af1840

4 files changed, +52 -16 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -221,3 +221,4 @@ tools/test-results/
 
 # Test
 .env
+pyproject.toml

README.md

Lines changed: 51 additions & 8 deletions
@@ -1,12 +1,14 @@
 # 🔥 Embeddings + Reranking on your Mac (MLX‑first)
 
-<p>
+<p align="center">
 <a href="docs/ENHANCED_OPENAI_API.md">
+<a href="https://github.com/joonsoo-me/embed-rerank/blob/main/LICENSE"><img src="https://img.shields.io/github/license/joonsoo-me/embed-rerank?logo=opensource&logoColor=white" /></a>
 <img src="https://img.shields.io/badge/OpenAI%20rerank-supported-2ea44f" alt="OpenAI rerank supported (/v1/openai/rerank)" />
 </a>
 <a href="docs/DEPLOYMENT_PROFILES.md">
 <img src="https://img.shields.io/badge/auto--sigmoid-default%20on-blue" alt="auto-sigmoid default on" />
-</a>
+</a><a href="https://ml-explore.github.io/mlx/"><img src="https://img.shields.io/badge/MLX-Optimized-green?logo=apple&logoColor=white" /></a>
+<a href="https://fastapi.tiangolo.com/"><img src="https://img.shields.io/badge/FastAPI-009688?logo=fastapi&logoColor=white" /></a>
 <a href="https://pypi.org/project/embed-rerank/">
 <img src="https://img.shields.io/pypi/v/embed-rerank?logo=pypi&logoColor=white" alt="PyPI Version" />
 </a>
@@ -16,6 +18,46 @@ Blazing‑fast local embeddings and true cross‑encoder reranking on Apple Sili
 
 This page is a beginner‑friendly quick start. Detailed guides live in docs/.
 
+## 🌐 Four APIs, One Service
+
+| API | Endpoint | Use Case |
+|-----|----------|----------|
+| **Native** | `/api/v1/embed`, `/api/v1/rerank` | New projects |
+| **OpenAI** | `/v1/embeddings`, `/v1/openai/rerank` (alias: `/v1/rerank_openai`) | Existing OpenAI code |
+| **TEI** | `/embed`, `/rerank`, `/info` | Hugging Face TEI replacement |
+| **Cohere** | `/v1/rerank`, `/v2/rerank` | Cohere API replacement |
+| | `/docs` `/health` | More info. |
+
+## 📈 Performance Visualization
+
+### Latency Comparison (Projected)
+
+```
+Single Text Embedding Latency (milliseconds)
+
+Apple MLX ████ 0.2ms
+PyTorch MPS ████████████████████████████████████████████████ 45ms
+PyTorch CPU ████████████████████████████████████████████████████████████████████████████████████████████████████████ 120ms
+CUDA (Est.) ████████████ 12ms
+Vulkan (Est.) ████████████████████████ 25ms
+
+0ms 25ms 50ms 75ms 100ms 125ms
+```
+
+### Throughput Comparison (texts/second)
+
+```
+Maximum Throughput (texts per second)
+
+Apple MLX ████████████████████████████████████████████████████████████████████████████████████████████████████████ 35,000
+CUDA (Est.) ████████████████████████████████ 8,000
+PyTorch MPS ██████ 1,500
+Vulkan (Est.) ████████████ 3,000
+PyTorch CPU ██ 500
+
+0 10k 20k 30k 40k
+```
+
 ## 🚀 Start here (60 seconds)
 
 1) Install and run (embeddings only)
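Reviewer sketch (not part of the commit): the endpoint table added to the README in this hunk can be read as a lookup from API family to path. The snippet below is a minimal illustration of that mapping. The paths are copied from the table; the `ENDPOINTS` dictionary, the `endpoint_url` helper, the operation keys, and the base URL `http://localhost:9000` are hypothetical names chosen for this example and do not appear in the project.

```python
# Paths copied from the "Four APIs, One Service" table in the README diff.
# The dictionary layout and operation keys are assumptions made for this sketch.
ENDPOINTS = {
    "native": {"embed": "/api/v1/embed", "rerank": "/api/v1/rerank"},
    "openai": {"embed": "/v1/embeddings", "rerank": "/v1/openai/rerank"},
    "tei": {"embed": "/embed", "rerank": "/rerank", "info": "/info"},
    "cohere": {"rerank": "/v1/rerank", "rerank_v2": "/v2/rerank"},
}


def endpoint_url(base_url: str, api: str, operation: str) -> str:
    """Return the full URL for one API family and operation.

    `base_url` is wherever the service is reachable; the value used in the
    usage line below is a hypothetical placeholder.
    """
    try:
        path = ENDPOINTS[api][operation]
    except KeyError:
        raise ValueError(f"no {operation!r} endpoint listed for API {api!r}") from None
    # Strip a trailing slash so the join never produces a double slash.
    return base_url.rstrip("/") + path


if __name__ == "__main__":
    # Hypothetical local address; substitute the host and port you actually run.
    print(endpoint_url("http://localhost:9000", "openai", "rerank"))
```

The table lists no embedding route for the Cohere-compatible API, so the helper raises `ValueError` for that combination rather than guessing one.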
@@ -96,12 +138,6 @@ Notes
 - Scores may be auto‑sigmoid‑normalized for OpenAI clients by default (disable via `OPENAI_RERANK_AUTO_SIGMOID=false`).
 - The root endpoint `/` shows both `embedding_dimension` (served) and `hidden_size` (model config) for clarity.
 
-Quick endpoints reference
-- Native: `/api/v1/embed`, `/api/v1/rerank`
-- OpenAI: `/v1/embeddings`, `/v1/openai/rerank` (alias: `/v1/rerank_openai`)
-- TEI: `/embed`, `/rerank`, `/info`
-- Cohere: `/v1/rerank`, `/v2/rerank`
-
 Run the full validation suite
 ```bash
 ./tools/server-tests.sh --full
@@ -142,6 +178,13 @@ rr = client._request(
 print(rr.get("results", rr))
 ```
 
+## Tested Frameworks
+| | Framework | Tests |
+|---|---|---|
+|| [**Open WebUI**](https://github.com/open-webui/open-webui) | `Embed` |
+|| [**LightRAG**](https://github.com/HKUDS/LightRAG) | `Embed` `Rerank` |
+###### We are waiting for your reports!
+
 ## 📄 License
 
 MIT License – build amazing things locally.
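
Reviewer sketch (not part of the commit): the README note in this diff says rerank scores may be auto-sigmoid-normalized for OpenAI clients unless `OPENAI_RERANK_AUTO_SIGMOID=false` is set. The snippet below illustrates what sigmoid normalization of raw cross-encoder scores generally looks like. It is an assumption about the behaviour the note describes, not the project's actual implementation; the function names and the way the environment variable is parsed are hypothetical.

```python
import math
import os


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping any real score into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so this cannot overflow
    return z / (1.0 + z)


def auto_sigmoid_enabled(env=None) -> bool:
    """Read the flag named in the README note.

    Assumption: any value other than "false" (case-insensitive) leaves the
    default-on behaviour in place. The real server may parse it differently.
    """
    env = os.environ if env is None else env
    return env.get("OPENAI_RERANK_AUTO_SIGMOID", "true").strip().lower() != "false"


def normalize_scores(raw_scores, auto_sigmoid=True):
    """Return scores squashed into (0, 1), or unchanged when the flag is off."""
    if not auto_sigmoid:
        return list(raw_scores)
    return [sigmoid(s) for s in raw_scores]


if __name__ == "__main__":
    raw = [-3.0, 0.0, 4.5]  # made-up raw scores for illustration
    print(normalize_scores(raw, auto_sigmoid=auto_sigmoid_enabled()))
```

Because the logistic function is strictly increasing, normalization of this kind changes the scale of the scores but not the ranking order of the documents.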

debug_test.py

Whitespace-only changes.

openai_quick_test.py

Lines changed: 0 additions & 8 deletions
This file was deleted.
