Commit 7a71334

Update docs for reranker
1 parent 97bbb79 commit 7a71334

3 files changed

Lines changed: 85 additions & 13 deletions

docs/docs/components/clients/consistentmi.md

Lines changed: 36 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ ConsistentMI simulates clients in motivational interviewing (MI) sessions with c
2424

2525
1. **Load Profile**: Reads the character JSON (personas, beliefs, acceptable plans, motivation topics) and initializes `stage` and `receptivity`.
2626
2. **Initialize Prompts**: Builds a system prompt that anchors the client’s behavior/goal and injects personas + beliefs for consistency.
27-
3. **Track Topic Engagement**: Matches the therapist’s latest utterance to a motivation topic, then uses the topic graph distance to update `engagement` and count repeated off-topic turns.
27+
3. **Track Topic Engagement**: Matches the therapist’s latest utterance to a motivation topic using a reranker-backed topic matcher, then uses the topic graph distance to update `engagement` and count repeated off-topic turns. If reranking is unavailable or returns no valid scores, ConsistentMI falls back to lexical matching.
2828
4. **Verify Motivation (Optional)**: If the therapist addresses the client’s core motivation, the client enters a short `Motivation` state for an acknowledging response.
2929
5. **Sample a Stage-Consistent Action**: An LLM predicts an action distribution conditioned on recent context and the current stage.
3030
6. **Select Grounding Detail**: For actions like `Inform/Downplay/Blame/Hesitate/Plan`, the client selects a relevant persona/belief/plan (only when the therapist asks a question) to ground the next reply.
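
The fallback described in step 3 can be sketched roughly as below. This is a minimal illustration, not the ConsistentMI implementation: `match_topic`, `lexical_score`, and the `rerank` callable are hypothetical names, and Jaccard word overlap stands in for whatever lexical matcher the project actually uses.

```python
from typing import Callable, List, Optional, Sequence


def lexical_score(utterance: str, topic: str) -> float:
    """Jaccard overlap between lowercase word sets (assumed stand-in matcher)."""
    a = set(utterance.lower().split())
    b = set(topic.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_topic(
    utterance: str,
    topics: Sequence[str],
    rerank: Optional[Callable[[str, Sequence[str]], List[float]]] = None,
) -> str:
    """Return the best-matching topic, preferring reranker scores when usable."""
    if not topics:
        raise ValueError("topics must not be empty")

    scores: Optional[List[float]] = None
    if rerank is not None:
        try:
            scores = rerank(utterance, topics)
        except Exception:
            # Reranker unavailable (server down, timeout, ...): fall back below.
            scores = None

    # Treat a missing or wrongly sized score list as "no valid scores".
    if not scores or len(scores) != len(topics):
        scores = [lexical_score(utterance, t) for t in topics]

    best = max(range(len(topics)), key=scores.__getitem__)
    return topics[best]
```

Passing `rerank=None`, a callable that raises, or one that returns an empty list all lead to the same lexical path, so the caller does not need to distinguish between those failure modes.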
@@ -58,16 +58,43 @@ response = client.generate_response(
5858
print(response)
5959
```
6060

61+
> ⚠️ **Hint:**
62+
>
63+
> - ConsistentMI uses a local reranker served through vLLM's OpenAI-compatible `/rerank` endpoint.
64+
> - Set `LOCAL_BASE_URL` and `LOCAL_API_KEY` in `.env`; PatientHub reuses them for the reranker.
65+
> - Use `reranker_model_type=LOCAL`.
66+
> - Set `reranker_model_name` to the LiteLLM vLLM route, e.g. `hosted_vllm/BAAI/bge-reranker-v2-m3`.
67+
> - If the reranker server runs on the same machine, prefer `127.0.0.1` over `0.0.0.0` in `LOCAL_BASE_URL`.
68+
6169
## Configuration
6270

63-
| Option | Description | Default |
64-
| ------------------ | -------------------------------- | ---------------------------------------------- |
65-
| `prompt_path` | Path to prompt file | `data/prompts/client/consistentMI.yaml` |
66-
| `data_path` | Path to character file | `data/characters/ConsistentMI.json` |
67-
| `data_idx` | Character index | `0` |
68-
| `topics_path` | Topics from Wiki | `data/resources/ConsistentMI/topics.json` |
69-
| `topic_graph_path` | Correlation between topics | `data/resources/ConsistentMI/topic_graph.json` |
70-
| `model_retriever` | retrieve the most relevant topic | None |
71+
| Option | Description | Default |
72+
| --------------------- | ------------------------------------------------------- | ---------------------------------------------- |
73+
| `prompt_path` | Path to prompt file | `data/prompts/client/consistentMI.yaml` |
74+
| `data_path` | Path to character file | `data/characters/ConsistentMI.json` |
75+
| `data_idx` | Character index | `0` |
76+
| `topics_path` | Topics from Wiki | `data/resources/ConsistentMI/topics.json` |
77+
| `topic_graph_path` | Correlation between topics | `data/resources/ConsistentMI/topic_graph.json` |
78+
| `reranker_model_type` | Provider key for topic reranking | `LOCAL` |
79+
| `reranker_model_name` | LiteLLM model route for the reranker | `hosted_vllm/BAAI/bge-reranker-v2-m3` |
80+
81+
### Local Reranker Example
82+
83+
```yaml
84+
client:
85+
agent_name: consistentMI
86+
model_type: OPENAI
87+
model_name: gpt-4o
88+
reranker_model_type: LOCAL
89+
reranker_model_name: hosted_vllm/BAAI/bge-reranker-v2-m3
90+
```
91+
92+
With a local vLLM reranker server, your `.env` should contain:
93+
94+
```bash
95+
LOCAL_BASE_URL=http://127.0.0.1:7891/v1
96+
LOCAL_API_KEY=EMPTY
97+
```
7198

7299
## Character Data Format
73100

docs/docs/getting-started/configuration.md

Lines changed: 11 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -22,11 +22,13 @@ For example,
2222
OPENAI_API_KEY=your_openai_key
2323
OPENAI_BASE_URL=https://api.openai.com
2424

25-
# For VLLM (n this case, model_type = VLLM)
26-
VLLM_BASE_URL=http://127.0.0.1
27-
VLLM_API_KEY=None
25+
# For local OpenAI-compatible servers (model_type = LOCAL)
26+
LOCAL_BASE_URL=http://127.0.0.1:8000/v1
27+
LOCAL_API_KEY=EMPTY
2828
```
2929

30+
`model_type` is used to select the environment-variable namespace. For example, `model_type=LOCAL` makes PatientHub read `LOCAL_BASE_URL` and `LOCAL_API_KEY`.
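
The namespace lookup can be pictured with a small sketch. The helper name `resolve_endpoint` is hypothetical and this is not PatientHub's actual code; it only assumes the `<MODEL_TYPE>_BASE_URL` / `<MODEL_TYPE>_API_KEY` naming shown in the `.env` example.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_endpoint(
    model_type: str,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[Optional[str], Optional[str]]:
    """Look up <MODEL_TYPE>_BASE_URL and <MODEL_TYPE>_API_KEY (assumed naming)."""
    env = os.environ if env is None else env
    prefix = model_type.upper()
    return env.get(f"{prefix}_BASE_URL"), env.get(f"{prefix}_API_KEY")


# Example with an explicit mapping instead of the real process environment.
base_url, api_key = resolve_endpoint(
    "LOCAL",
    {"LOCAL_BASE_URL": "http://127.0.0.1:8000/v1", "LOCAL_API_KEY": "EMPTY"},
)
```

Under this reading, a missing variable shows up as `None`, which is a quick way to check that the `.env` file was loaded for the namespace you expect.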
31+
3032
## Model Configuration
3133

3234
### Using OpenAI (Default)
@@ -65,9 +67,14 @@ config = {
6567
```yaml
6668
client:
6769
agent_name: consistentMI
68-
initial_stage: precontemplation # precontemplation, contemplation, preparation, action
70+
model_type: OPENAI
71+
model_name: gpt-4o
72+
reranker_model_type: LOCAL
73+
reranker_model_name: hosted_vllm/BAAI/bge-reranker-v2-m3
6974
```
7075
76+
`ConsistentMI` uses the main `model_type` / `model_name` pair for response generation and a separate `reranker_model_type` / `reranker_model_name` pair for topic matching. The reranker currently reuses `LOCAL_BASE_URL` and `LOCAL_API_KEY`.
77+
7178
#### SimPatient
7279

7380
```yaml

docs/docs/getting-started/installation.md

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -75,6 +75,44 @@ LOCAL_API_KEY=EMPTY
7575

7676
Then set your config to use `model_type=LOCAL` and `model_name` to the model name exposed by your vLLM server.
7777

78+
### Local Reranker Models via vLLM
79+
80+
`ConsistentMI` can also use a local reranker served by vLLM's OpenAI-compatible `/rerank` endpoint.
81+
82+
1) Start a reranker model with vLLM:
83+
84+
```bash
85+
vllm serve BAAI/bge-reranker-v2-m3 --host 0.0.0.0 --port 7891
86+
```
87+
88+
2) Point `LOCAL_BASE_URL` at the reranker server:
89+
90+
```bash
91+
LOCAL_BASE_URL=http://127.0.0.1:7891/v1
92+
LOCAL_API_KEY=EMPTY
93+
```
94+
95+
3) Use the LiteLLM vLLM route in `ConsistentMI`:
96+
97+
```yaml
98+
client:
99+
agent_name: consistentMI
100+
reranker_model_type: LOCAL
101+
reranker_model_name: hosted_vllm/BAAI/bge-reranker-v2-m3
102+
```
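
To check the server independently of PatientHub, a request to the rerank endpoint can be built by hand. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the payload and response fields (`query`, `documents`, `results[].index`, `results[].relevance_score`) follow the Jina/Cohere-style rerank format that vLLM is understood to expose, and the model name is given without the `hosted_vllm/` prefix, which is taken to be a LiteLLM routing detail. Verify all of these against your vLLM version.

```python
import json
import urllib.request
from typing import Any, Dict, List, Sequence


def build_rerank_request(
    base_url: str, api_key: str, model: str, query: str, documents: Sequence[str]
) -> urllib.request.Request:
    """Build (but do not send) a POST to <base_url>/rerank."""
    payload = {"model": model, "query": query, "documents": list(documents)}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def scores_from_response(body: Dict[str, Any], n_documents: int) -> List[float]:
    """Map results back to the original document order (assumed response shape)."""
    scores = [0.0] * n_documents
    for item in body.get("results", []):
        scores[item["index"]] = float(item["relevance_score"])
    return scores


def rerank(request: urllib.request.Request, n_documents: int) -> List[float]:
    """Send the request; requires the vLLM reranker server to be running."""
    with urllib.request.urlopen(request, timeout=10) as resp:
        return scores_from_response(json.load(resp), n_documents)
```

A typical manual check would build the request with `LOCAL_BASE_URL`, `LOCAL_API_KEY`, and `BAAI/bge-reranker-v2-m3`, then call `rerank(request, len(documents))` once the server from step 1 is up.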
103+
104+
:::tip Localhost vs 0.0.0.0
105+
Use `0.0.0.0` for the server listen address, but use `127.0.0.1` or the machine's real IP in `LOCAL_BASE_URL`.
106+
:::
107+
108+
:::tip Proxy settings
109+
If your shell exports `http_proxy` or `https_proxy`, local requests to the reranker can be sent to the proxy instead of your vLLM server. For local testing, either unset those variables or set:
110+
111+
```bash
112+
export NO_PROXY=127.0.0.1,localhost
113+
```
114+
:::
115+
78116
:::note vLLM fails to start
79117
It’s usually a CUDA/driver mismatch on the serving machine. Check your NVIDIA driver and CUDA runtime, and use a vLLM version compatible with your environment.
80118
:::
