New inference-time approach for Private MedHelm Tasks #3913
Open
sronaghi wants to merge 42 commits into stanford-crfm:main from
Conversation
MiguelAFH requested changes · Oct 22, 2025
@@ -0,0 +1,192 @@
# MedHELM RunSpecs for the private benchmarks from Stanford.
Collaborator
@yifanmai what are your thoughts on adding this file?
sronaghi (Author) commented · Oct 22, 2025
I've made edits based on @MiguelAFH's comments.
…es_medhelm_private_proxy_tuning.conf
yifanmai (Collaborator) requested changes · Oct 24, 2025
In general:
- The files need more documentation, which can be placed as a module-level docstring in proxy_tuning_client.py, in the comments in model_metadata.yaml and model_deployments.yaml, and in the comment at the top of run_entries_medhelm_private_proxy_tuning.conf.
- If this is experimental code, rather than intended for general use, your documentation should clearly say so.
- Please run the linter:
pip install black==24.3.0 mypy==1.16.0 flake8==5.0.4
./pre-commit.sh

I did not look at your model code too closely; let me know if there are any specific things you would like me to look at.
yifanmai reviewed · Oct 24, 2025
This addition allows the proxy-tuning class to run for MedHELM scenarios. After creating the conda environment, you only need to run pip install -U "crfm-helm[proxy_tuning]".
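A sketch of that setup as shell commands, assuming a fresh conda environment (the environment name and Python version below are assumptions, not from the PR):

```shell
# Hypothetical setup; the env name and Python version are assumptions.
conda create -n crfm-helm python=3.10 -y
conda activate crfm-helm
pip install -U "crfm-helm[proxy_tuning]"
```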
sronaghi (Author) commented:
@yifanmai @MiguelAFH @aunell @suhana13 @HennyJie I ran the formatting check and added documentation. Please let me know what else to do for this PR!
MiguelAFH approved these changes · Oct 31, 2025
I provide the code for testing a new inference-time approach that combines general and clinical-domain LMs for some private MedHELM tasks.
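For context on the approach: proxy tuning is typically formulated as shifting a large base model's next-token logits by the difference between a small tuned expert and its untuned counterpart. A minimal NumPy sketch of that logit arithmetic, not the code in this PR:

```python
import numpy as np

def proxy_tuned_logits(base, expert, anti_expert, alpha=1.0):
    """Shift the base model's next-token logits toward the expert's behavior.

    alpha scales how strongly the expert/anti-expert contrast is applied.
    """
    return np.asarray(base) + alpha * (np.asarray(expert) - np.asarray(anti_expert))

# Toy vocabulary of 3 tokens.
base = [2.0, 1.0, 0.5]         # general-domain base LM
expert = [1.0, 3.0, 0.5]       # clinical-domain expert
anti_expert = [1.5, 1.0, 0.5]  # untuned counterpart of the expert

combined = proxy_tuned_logits(base, expert, anti_expert)
# combined == [1.5, 3.0, 0.5]: tokens are boosted (or suppressed) exactly
# where the expert disagrees with its untuned counterpart.
```

With alpha=0 the base model is unchanged; larger alpha pushes decoding further toward the clinical expert.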
I want to test my method on CLEAR, PatientInstruct, and NoteExtract.
Running the models requires downloading the following models locally and changing the model paths at the top of proxy_tuning_client.py. I can provide a script to download them onto Carina as well. Here are the models and download locations:
Below are the model configurations and the number of A100 40GB GPUs each uses:
I have added each model configuration to the model_metadata.yaml, model_deployments.yaml, and tokenizer_config.yaml files in both prod_env and src/helm/config. run_entries_medhelm_private_proxy_tuning.conf contains the model run entries for each task. I can also create separate conf files based on the number of GPUs needed.
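For readers unfamiliar with HELM run-entries files: a .conf file is a list of entries, each pairing a run-spec description with a priority. A hypothetical sketch of what one entry in run_entries_medhelm_private_proxy_tuning.conf might look like (the scenario and model names are illustrative, not the PR's actual values):

```conf
# Hypothetical entry; actual scenario and model names in the PR may differ.
entries: [
  {description: "clear:model=stanford/proxy-tuned-llama-13b", priority: 1}
]
```

Such a file is then passed to helm-run via --conf-paths, e.g. helm-run --conf-paths run_entries_medhelm_private_proxy_tuning.conf --suite my-suite -n 1 (the suite name here is a placeholder).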
Each model takes me ~7-22 hours per task. I run the models with the -n 1 flag, as my code doesn't support multi-threading.
I ended up using basic_summarization_metrics because I couldn't configure what was needed in my helm_env while maintaining compatibility with my code. If there are conda environment issues, I can share my env file and the modified run_specs.