This script generates all possible permutations of the personalization context used in the planNudges.ts function and captures the LLM responses for analysis. It supports multiple model backends including OpenAI API, SecureGPT, and local models via a Python service (MLX and non-MLX Hugging Face models).
This is an overview of what can be tested:
- genderIdentity: 'male', 'female'
- ageGroup: '<35', '35-50', '51-65', '>65'
- disease: null, 'Heart failure', 'Pulmonary arterial hypertension', 'Diabetes', 'ACHD (simple)', 'ACHD (complex)'
- stageOfChange: null, 'Precontemplation', 'Contemplation', 'Preparation', 'Action', 'Maintenance'
- educationLevel: 'Highschool', 'college', 'collage'
- language: 'en', 'es'
- preferredWorkoutTypes: 'run,walk', 'HIIT,strength', 'swim,bicycle', 'yoga/pilates,walk', 'sport,run,strength', 'other', 'other,walk,run', 'other,HIIT,walk,swim,run,sport,strength,bicycle,yoga/pilates'
- preferredNotificationTime: '7:00 AM', '12:00 PM', '6:00 PM'
Total permutations: 2 × 4 × 6 × 6 × 3 × 2 × 8 × 3 = 41,472 combinations
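The full set of contexts is the Cartesian product of the option lists above. A minimal sketch of that enumeration follows; the `cartesian` helper and the `Context` type are illustrative names, not taken from the script itself.

```typescript
type Value = string | null;
type Context = Record<string, Value>;

// Option lists copied from the overview above.
const options: Record<string, Value[]> = {
  genderIdentity: ["male", "female"],
  ageGroup: ["<35", "35-50", "51-65", ">65"],
  disease: [null, "Heart failure", "Pulmonary arterial hypertension", "Diabetes", "ACHD (simple)", "ACHD (complex)"],
  stageOfChange: [null, "Precontemplation", "Contemplation", "Preparation", "Action", "Maintenance"],
  educationLevel: ["Highschool", "college", "collage"],
  language: ["en", "es"],
  preferredWorkoutTypes: [
    "run,walk", "HIIT,strength", "swim,bicycle", "yoga/pilates,walk",
    "sport,run,strength", "other", "other,walk,run",
    "other,HIIT,walk,swim,run,sport,strength,bicycle,yoga/pilates",
  ],
  preferredNotificationTime: ["7:00 AM", "12:00 PM", "6:00 PM"],
};

// Hypothetical helper: extend every partial context with every value of the next field.
function cartesian(opts: Record<string, Value[]>): Context[] {
  let result: Context[] = [{}];
  for (const key of Object.keys(opts)) {
    const values = opts[key] ?? [];
    const next: Context[] = [];
    for (const partial of result) {
      for (const value of values) {
        next.push({ ...partial, [key]: value });
      }
    }
    result = next;
  }
  return result;
}

// The number of permutations equals the product of the list lengths.
const expectedCount = Object.keys(options).reduce(
  (n, key) => n * (options[key] ?? []).length,
  1,
);
const permutations = cartesian(options);
```

Comparing `permutations.length` with `expectedCount` is a quick sanity check whenever a value is added to or removed from one of the lists.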
cd assets/scripts/nudge-testing
npm install
For OpenAI models:
export OPENAI_API_KEY="your-api-key-here"
For SecureGPT models:
export SECUREGPT_API_KEY="your-securegpt-api-key-here"
Note: The machine must be connected to the full Stanford VPN for SecureGPT API to work.
If you want to test local models, you need to set up the Python service:
- Install Python dependencies:
cd python_service
pip install -r requirements.txt
- Notes:
  - MLX models require Apple Silicon (M1/M2/M3) Mac.
  - Non-MLX Hugging Face models run through transformers and use GPU when available (cuda or mps), otherwise CPU.
  - Models will be automatically downloaded from Hugging Face on first use.
- Start the Python service:
npm run start:python-service
Or manually:
cd python_service
python huggingface_service.py
The service will run on http://localhost:8000 by default.
By default, the script uses OpenAI models (backward compatible):
npm run test # Full test with all permutations
npm run test:sample # Sample test with 10 permutations
npm run test:random # Random sample of 10 permutations
To test local Python models, first ensure the Python service is running, then:
# Test local Python models with sample
npm run test:huggingface
# Or manually specify provider
npm run build && node dist/generateNudgePermutations.js --provider huggingface --sample 5
To test SecureGPT models (GPT-5, Gemini 2.5 Pro, etc.):
# Test SecureGPT GPT-5
npm run build && node dist/generateNudgePermutations.js --model gpt-5 --sample 5
# Test SecureGPT Gemini 2.5 Pro
npm run build && node dist/generateNudgePermutations.js --model gemini-2.5-pro --sample 5
# Test all SecureGPT models
npm run build && node dist/generateNudgePermutations.js --provider securegpt --sample 10
To test all available models:
npm run build && node dist/generateNudgePermutations.js --provider all --sample 10
The script supports the following CLI arguments:
- --sample <number> - Test a specific number of permutations (default: 10 if not specified)
- --random - Randomly select permutations instead of sequential
- --model <model-id> - Test a single specific model (e.g., --model mlx-community/Llama-3.2-1B-Instruct-4bit)
- --models <id1,id2,...> - Test multiple specific models (comma-separated)
- --provider <provider-name|all> - Filter by provider:
  - openai - Only OpenAI models
  - huggingface - Local Hugging Face models (MLX + non-MLX) via the Python service
  - securegpt - Only SecureGPT models (GPT-5, Gemini 2.5 Pro, etc.)
  - all - All available models (default if provider is specified)
- --python-service-url <url> - Override Python service URL (default: http://localhost:8000)
- --output <dir> - Write results CSV to a custom output directory (the script still auto-generates the filename inside this directory)
- --require-stage-of-change - Only generate/test permutations where stageOfChange is provided (non-null)
- --require-comorbidity - Only generate/test permutations where a disease/comorbidity is provided (non-null)
- --contexts-json <path> - Load explicit patient contexts from a JSON array (uses provided contexts instead of generated permutations/sample mode)
- --timeout <seconds> - Override default generation timeout (default: 60s)
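To illustrate how the filtering and sampling flags could interact, here is a minimal sketch. The `selectContexts` function, the `SelectionOptions` shape, and the injectable `rng` parameter are assumptions made for this example; the actual script may order or combine these steps differently. The sketch also assumes that an absent sample size means "use everything".

```typescript
interface PatientContext {
  disease: string | null;
  stageOfChange: string | null;
}

// Hypothetical option bag mirroring the CLI flags listed above.
interface SelectionOptions {
  sample?: number;                // --sample <number>
  random?: boolean;               // --random
  requireStageOfChange?: boolean; // --require-stage-of-change
  requireComorbidity?: boolean;   // --require-comorbidity
}

function selectContexts<T extends PatientContext>(
  all: T[],
  opts: SelectionOptions,
  rng: () => number = Math.random, // injectable so runs can be made repeatable
): T[] {
  // 1. Apply the "require" filters first.
  const pool = all.filter(
    (c) =>
      (!opts.requireStageOfChange || c.stageOfChange !== null) &&
      (!opts.requireComorbidity || c.disease !== null),
  );

  // 2. No sample size: return the whole filtered pool (assumption).
  if (opts.sample === undefined) return pool;

  // 3. Sequential sample: take the first N.
  if (!opts.random) return pool.slice(0, opts.sample);

  // 4. Random sample without replacement.
  const remaining = pool.slice();
  const picked: T[] = [];
  while (picked.length < opts.sample && remaining.length > 0) {
    const j = Math.floor(rng() * remaining.length);
    picked.push(...remaining.splice(j, 1));
  }
  return picked;
}
```

Filtering before sampling means that `--sample 10 --require-comorbidity` would still yield 10 contexts whenever at least 10 qualify, rather than sampling 10 and then discarding some.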
Prompt-generation instruction text and context snippets are now centralized in:
- config/prompts/prompt_constants.v1.json
- config/prompts/prompt_constants.schema.json
Both the TypeScript generator (src/generateNudgePermutations.ts) and Python curation script (scripts/nudge-curation-v1/patient_context_curation_script_v1.py) load this same prompt specification to avoid string drift.
# Test specific MLX model with 5 permutations
npm run build && node dist/generateNudgePermutations.js --model mlx-community/SmolLM2-360M-Instruct --sample 5
# Test MHC-Coach (non-MLX Hugging Face model) with 1 permutation
npm run build && node dist/generateNudgePermutations.js --model SriyaM/MHC-Coach --sample 1
# Test multiple models
npm run build && node dist/generateNudgePermutations.js --models "gpt-5.2-2025-12-11,mlx-community/Llama-3.2-1B-Instruct-4bit" --sample 10
# Test all HuggingFace models with random sampling
npm run build && node dist/generateNudgePermutations.js --provider huggingface --sample 20 --random
# Custom Python service URL
npm run build && node dist/generateNudgePermutations.js --provider huggingface --python-service-url http://localhost:9000
# Write output CSV to a custom directory
npm run build && node dist/generateNudgePermutations.js --provider all --sample 10 --output ./data/custom-results
# Test SecureGPT GPT-5
npm run build && node dist/generateNudgePermutations.js --model gpt-5 --sample 5
# Test SecureGPT Gemini 2.5 Pro
npm run build && node dist/generateNudgePermutations.js --model gemini-2.5-pro --sample 5
# Require stage of change and comorbidity in all tested permutations
npm run build && node dist/generateNudgePermutations.js --provider all --sample 10 --require-stage-of-change --require-comorbidity
# Use curated contexts from JSON (bypasses --sample/--random generation)
npm run build && node dist/generateNudgePermutations.js --provider all --contexts-json scripts/nudge-curation-v1/patient_contexts_seed42.json
- gpt-5.2-2025-12-11 - GPT-5.2 (default)
- gpt-5 - SecureGPT GPT-5
- gpt-5-mini - SecureGPT GPT-5 Mini
- gpt-5-nano - SecureGPT GPT-5 Nano
- gemini-2.5-pro - SecureGPT Gemini 2.5 Pro
- mlx-community/Llama-3.2-1B-Instruct-4bit
- mlx-community/Llama-3.2-3B-Instruct-4bit
- mlx-community/Phi-4-mini-instruct-4bit
- mlx-community/gemma-3-270m-it-4bit
- mlx-community/gemma-3-1b-it-qat-4bit
- mlx-community/gemma-3-4b-it-qat-4bit
- mlx-community/Qwen2.5-0.5B-Instruct-4bit
- mlx-community/Qwen2.5-1.5B-Instruct-4bit
- mlx-community/Qwen2.5-3B-Instruct-4bit
- mlx-community/Qwen3-4B-Instruct-2507-4bit
- mlx-community/DeepSeek-R1-Distill-Qwen-1.5B-4bit
- mlx-community/Ministral-3-3B-Instruct-2512-4bit
- mlx-community/SmolLM2-360M-Instruct
- mlx-community/SmolLM2-1.7B-Instruct
- mlx-community/SmolLM3-3B-4bit
- SriyaM/MHC-Coach
The script saves results to CSV files with descriptive filenames. By default it writes to data/generated; use --output <dir> to override the directory.
- nudge_permutations_results_<model-ids>_sample_<number>.csv - Sample results
- nudge_permutations_results_<model-ids>_sample_<number>_random.csv - Random sample results
- nudge_permutations_results_<model-ids>_full.csv - Full permutation results
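The naming pattern above can be sketched as a small function. How the script actually joins several model IDs, or handles the `/` in IDs such as `SriyaM/MHC-Coach`, is not stated here, so the sanitizing and joining rules below are assumptions, as is the `buildResultsFilename` name.

```typescript
// Hypothetical helper reproducing the three filename patterns listed above.
function buildResultsFilename(
  modelIds: string[],
  mode: { sample?: number; random?: boolean },
): string {
  // Assumption: replace characters that are unsafe in filenames with "-"
  // and join multiple model IDs with "_".
  const ids = modelIds
    .map((id) => id.replace(/[^A-Za-z0-9._-]+/g, "-"))
    .join("_");
  const base = `nudge_permutations_results_${ids}`;

  if (mode.sample === undefined) return `${base}_full.csv`;
  const suffix = mode.random ? "_random" : "";
  return `${base}_sample_${mode.sample}${suffix}.csv`;
}
```

For example, under these assumptions `buildResultsFilename(["gpt-5"], { sample: 5 })` yields `nudge_permutations_results_gpt-5_sample_5.csv`.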
The CSV output includes the following columns:
- modelId - The model identifier used
- provider - The model provider (e.g., 'openai', 'huggingface')
- backendType - The backend type (same as provider)
- genderIdentity - The gender identity value used
- ageGroup - The age group tested
- disease - The disease condition (empty if null)
- stageOfChange - The stage of change (empty if null)
- educationLevel - The education level
- language - The language ('en' or 'es')
- preferredNotificationTime - The preferred notification time
- genderContext - The generated gender context text
- ageContext - The generated age context text
- diseaseContext - The generated disease context text
- stageContext - The generated stage of change context text
- educationContext - The generated education context text
- languageContext - The generated language context text
- notificationTimeContext - The generated notification time context text
- fullPrompt - The complete prompt sent to the LLM
- llmResponse - The raw JSON response from the LLM
- latencyMs - Generation latency in milliseconds
- error - Any error message if the API call failed
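Columns such as fullPrompt and llmResponse can contain commas, quotes, and line breaks, so each field has to be escaped before it is written. The sketch below shows standard RFC 4180 style quoting; `escapeCsvField` and `toCsvRow` are illustrative names, and the script may well rely on a CSV library instead.

```typescript
// Quote a field only when it contains a comma, quote, or line break;
// embedded quotes are doubled. Null values become empty fields.
function escapeCsvField(value: string | number | null): string {
  if (value === null) return "";
  const text = String(value);
  if (/[",\r\n]/.test(text)) {
    return `"${text.replace(/"/g, '""')}"`;
  }
  return text;
}

// Join already-ordered field values into one CSV line (without the line terminator).
function toCsvRow(fields: Array<string | number | null>): string {
  return fields.map((field) => escapeCsvField(field)).join(",");
}
```

Mapping null to an empty field matches the "(empty if null)" behavior described for the disease and stageOfChange columns.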
The script uses a backend abstraction layer that supports multiple model providers:
- OpenAIBackend - Handles OpenAI API calls
- SecureGPTBackend - Handles SecureGPT API calls
- HuggingFacePythonBackend - Communicates with Python local model service via HTTP (MLX + Transformers)
- BackendFactory - Creates appropriate backend instances
This architecture makes it easy to add new model providers in the future (e.g., Claude, Gemini).
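One way such an abstraction can be laid out is sketched below. Only the class names in the list above come from the script; the `ModelBackend` interface, its `generate` signature, the `StubBackend` stand-in, and the registry are assumptions chosen to show why adding a provider stays cheap.

```typescript
type Provider = "openai" | "securegpt" | "huggingface";

// Hypothetical common interface that every backend would implement.
interface ModelBackend {
  readonly provider: Provider;
  generate(modelId: string, prompt: string): Promise<string>;
}

// Stand-in for OpenAIBackend, SecureGPTBackend, and HuggingFacePythonBackend.
// A real backend would call its API or the Python service here.
class StubBackend implements ModelBackend {
  readonly provider: Provider;

  constructor(provider: Provider) {
    this.provider = provider;
  }

  generate(modelId: string, prompt: string): Promise<string> {
    return Promise.resolve(`[${this.provider}:${modelId}] received ${prompt.length} chars`);
  }
}

// Registry keyed by provider: supporting a new provider means adding one entry
// (and one member to the Provider union), not touching the callers.
const backendRegistry: Record<Provider, () => ModelBackend> = {
  openai: () => new StubBackend("openai"),
  securegpt: () => new StubBackend("securegpt"),
  huggingface: () => new StubBackend("huggingface"),
};

function createBackend(provider: Provider): ModelBackend {
  return backendRegistry[provider]();
}
```

Callers only ever see `ModelBackend`, so the permutation loop does not need to know which provider it is talking to.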
If you see a warning about the Python service not being available:
- Ensure the Python service is running: npm run start:python-service
- Check that the service URL is correct (default: http://localhost:8000)
- Verify Python dependencies are installed: pip install -r python_service/requirements.txt
If a specific model fails to load:
- The script will log the error and continue with other models
- Check the error message in the CSV output
- For MLX models, ensure you have an Apple Silicon Mac and sufficient memory
- For non-MLX local models, ensure torch and transformers are installed and you have enough memory/VRAM
If you encounter timeout errors:
- Increase the timeout using --timeout <seconds>
- Default timeout is 60s for models <1B, 120s for larger models
- Larger models may need more time, especially on first load
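The timeout rules above can be sketched as follows. Both function names are hypothetical, and the sketch assumes the model size in billions of parameters is already known; how the script determines that size is not covered here.

```typescript
// Pick the timeout: an explicit --timeout value wins, otherwise
// 60s for models under 1B parameters and 120s for larger ones.
function resolveTimeoutMs(paramsBillions: number, overrideSeconds?: number): number {
  if (overrideSeconds !== undefined) return overrideSeconds * 1000;
  return (paramsBillions < 1 ? 60 : 120) * 1000;
}

// Reject if the wrapped promise does not settle within `ms` milliseconds.
// The timer is always cleared so it cannot keep the process alive.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Generation timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

Note that this pattern only stops waiting; it does not cancel the underlying request, so a real implementation may also want to abort the HTTP call.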
- Models are tested sequentially (one at a time) to avoid resource contention
- Local Python models are loaded on-demand in the Python service (not cached)
- The script maintains backward compatibility - default behavior uses OpenAI only
- First request for each local model may be slower as the model needs to be downloaded/loaded