Commit 308a067

Merge pull request #36 from NatLabRockies/optimize
Measure authoring reliability + OS App compatibility + E2E test
2 parents f14c4f7 + c7274fd commit 308a067

47 files changed

Lines changed: 53441 additions & 735 deletions

.claude/skills/measure-authoring/SKILL.md

Lines changed: 3 additions & 0 deletions
@@ -34,6 +34,9 @@ create_measure(
 ```
 test_measure(measure_dir="/runs/custom_measures/set_lights_8w")
 ```
+Tests run against the currently loaded model (or SystemD_baseline.osm fallback),
+so measures that depend on HVAC, plant loops, or zones will work correctly.
+Use `model_path` to test against a specific model.

 ### 3. Apply to Model
 ```

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ jobs:
 ;;
 3)
 # controls, object mgmt, loads, building, doas, hvac, measures, measure_authoring, skill_qaqc, hvac_supply_wiring
-FILES="tests/test_component_controls.py tests/test_object_management.py tests/test_generic_access.py tests/test_create_loads.py tests/test_building.py tests/test_doas_system.py tests/test_hvac.py tests/test_measures.py tests/test_measure_authoring.py tests/test_skill_qaqc.py tests/test_hvac_supply_wiring.py"
+FILES="tests/test_component_controls.py tests/test_object_management.py tests/test_generic_access.py tests/test_create_loads.py tests/test_building.py tests/test_doas_system.py tests/test_hvac.py tests/test_measures.py tests/test_measure_authoring.py tests/test_skill_qaqc.py tests/test_hvac_supply_wiring.py tests/test_validate_model.py"
 EXTRA_ENV=""
 ;;
 4)

CLAUDE.md

Lines changed: 6 additions & 5 deletions
@@ -3,7 +3,7 @@
 ## Project: openstudio-mcp
 MCP server that gives AI agents full control of building energy modeling —
 create buildings, configure HVAC, run EnergyPlus simulations, and extract
-results — all through 134 MCP tools backed by the OpenStudio SDK.
+results — all through 138 MCP tools backed by the OpenStudio SDK.

 **Who it's for:** Building energy modelers who want to use AI assistants
 (Claude, GPT, etc.) to do real modeling work — not just chat about it.
@@ -49,14 +49,15 @@ Always use openstudio-mcp tools for BEM tasks:
 | Phase 7 | 📋 FUTURE | Advanced creation (geometry, space type wizard) |
 | Phase 8 | ✅ COMPLETE | Bundle common-measures-gem (20 measures, 11 tools: reporting, thermostat, envelope, PV, visualization) |
 | Phase 9 | ✅ COMPLETE | AI-assisted measure authoring (3 tools: create, test, edit custom measures) |
+| Phase 10 | ✅ COMPLETE | Results & error management (4 tools: errors, variables, compare, validate) |

 ## Current Skills
 | Skill | Tools | Phase |
 |-------|-------|-------|
 | `server_info` | `get_server_status`, `get_versions` | Phase 1 |
 | `model_management` | `create_example_osm`, `create_baseline_osm`, `inspect_osm_summary`, `load_osm_model`, `save_osm_model`, `list_files` | Phase 1 + 2 |
-| `simulation` | `validate_osw`, `run_osw`, `run_simulation`, `get_run_status`, `get_run_logs`, `get_run_artifacts`, `cancel_run` | Phase 1 |
-| `results` | `extract_summary_metrics`, `read_file`, `copy_file`, `extract_end_use_breakdown`, `extract_envelope_summary`, `extract_hvac_sizing`, `extract_zone_summary`, `extract_component_sizing`, `query_timeseries` | Phase 1 + 9 |
+| `simulation` | `validate_osw`, `run_osw`, `run_simulation`, `get_run_status`, `get_run_logs`, `get_run_artifacts`, `cancel_run`, `validate_model` | Phase 1 + 10 |
+| `results` | `extract_summary_metrics`, `read_file`, `copy_file`, `extract_end_use_breakdown`, `extract_envelope_summary`, `extract_hvac_sizing`, `extract_zone_summary`, `extract_component_sizing`, `query_timeseries`, `extract_simulation_errors`, `list_output_variables`, `compare_runs` | Phase 1 + 9 + 10 |
 | `building` | `get_building_info`, `get_model_summary` | Phase 2 |
 | `spaces` | `list_spaces`, `get_space_details`, `list_thermal_zones`, `get_thermal_zone_details`, `create_space`, `create_thermal_zone` | Phase 2 + 3 |
 | `geometry` | `list_surfaces`, `get_surface_details`, `list_subsurfaces`, `create_surface`, `create_subsurface`, `create_space_from_floor_print`, `match_surfaces`, `set_window_to_wall_ratio`, `import_floorspacejs` | Phase 2 + 7 |
@@ -77,7 +78,7 @@ Always use openstudio-mcp tools for BEM tasks:
 | `measure_authoring` | `list_custom_measures`, `create_measure`, `test_measure`, `edit_measure` | Phase 9 |
 | `skill_discovery` | `list_skills`, `get_skill` ||

-**Total: 23 skills, 134 MCP tools, ~390 integration tests**
+**Total: 23 skills, 138 MCP tools, ~400 integration tests**

 ## Model Query Pattern
 ```python
@@ -197,7 +198,7 @@ To add a new SPM type to `set_setpoint_manager_properties`:

 ## Rules
 1. Keep files small where practical — aim for under 250 lines, but don't split artificially just to hit a number
-2. Every MCP tool must have a test in `tests/skills/` (Phase 2+) or `tests/` (existing)
+2. Every MCP tool must have a test in `tests/skills/` (Phase 2+) or `tests/` (existing). New behavior, bug fixes, and security hardening must also have integration tests — not just the happy path.
 3. **Integration tests must be added to `.github/workflows/ci.yml`** — add a new step following the existing pattern
 4. Operations return dicts with `{"ok": True/False, ...}` — never raise through MCP
 5. Use `openstudio` Python bindings directly.
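
Rule 4 above (return `{"ok": True/False, ...}` and never raise through MCP) can be illustrated with a small wrapper. This is a minimal sketch under stated assumptions, not code from this repository: `ok_result` and `set_lighting_power` are hypothetical names, and the `error` key is an assumed field, since the rule itself only specifies `ok`.

```python
import functools
from typing import Any, Callable, Dict


def ok_result(func: Callable[..., Dict[str, Any]]) -> Callable[..., Dict[str, Any]]:
    """Hypothetical helper: turn exceptions into {"ok": False, ...} dicts."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        try:
            payload = func(*args, **kwargs)
        except Exception as exc:
            # Report the failure as data so nothing propagates to the MCP layer.
            return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
        return {"ok": True, **payload}

    return wrapper


@ok_result
def set_lighting_power(watts_per_m2: float) -> Dict[str, Any]:
    # Illustrative operation only; a real one would call the openstudio bindings.
    if watts_per_m2 < 0:
        raise ValueError("watts_per_m2 must be non-negative")
    return {"watts_per_m2": watts_per_m2}
```

Calling `set_lighting_power(8.0)` yields a dict with `"ok": True`, while `set_lighting_power(-1.0)` yields `"ok": False` plus an error string instead of raising.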

README.md

Lines changed: 3 additions & 2 deletions
@@ -342,7 +342,7 @@ List HVAC components via `list_model_objects("BoilerHotWater")`, loop detail too

 ### Measure Authoring (4 tools)

-Create custom OpenStudio measures with AI-generated code, test them, and apply to models. See [Example 1](docs/examples/01_custom_measure_lighting.md) and [Example 2](docs/examples/02_custom_measure_hvac.md).
+Create custom OpenStudio measures with AI-generated code, test them, and apply to models. See [Example 1](docs/examples/01_custom_measure_lighting.md), [Example 2](docs/examples/02_custom_measure_hvac.md), and [Example 19](docs/examples/19_systemd_fourpipebeam_retrofit.md) (full E2E retrofit).

 | Tool | Description |
 |------|-------------|
@@ -424,7 +424,7 @@ The component properties tools can query and modify these 15 HVAC component type

 ## Examples

-18 worked examples with full tool-call sequences — click to expand:
+19 worked examples with full tool-call sequences — click to expand:

 | # | Example | Description |
 |---|---------|-------------|
@@ -446,6 +446,7 @@ The component properties tools can query and modify these 15 HVAC component type
 | 16 | [`/new-building`](docs/examples/16_new_building.md) | Full model creation from scratch |
 | 17 | [`/retrofit`](docs/examples/17_retrofit.md) | Before/after ECM analysis |
 | 18 | [`/view`](docs/examples/18_view.md) | Interactive 3D model visualization |
+| 19 | [SystemD Four-Pipe Beam Retrofit](docs/examples/19_systemd_fourpipebeam_retrofit.md) | End-to-end: load 44-zone model, baseline sim, author measure, retrofit sim, compare |

 ---

docs/examples/02_custom_measure_hvac.md

Lines changed: 3 additions & 2 deletions
@@ -84,6 +84,7 @@ The AI compares:
 - `edit_measure` can fix issues without recreating from scratch
 - Chilled beams are AIR TERMINALS (connected via `addBranchForZone`), NOT zone equipment

-## Integration Test
+## Integration Tests

-See `tests/llm/test_04_workflows.py::test_workflow[measure_replace_terminals_full_chain]`
+- `tests/llm/test_04_workflows.py::test_workflow[measure_replace_terminals_full_chain]` — explicit prompt with API hints
+- `tests/llm/test_04_workflows.py::test_workflow[systemd_fourpipebeam_e2e]` — natural-language prompt on 44-zone model (see [Example 19](19_systemd_fourpipebeam_retrofit.md))

docs/examples/19_systemd_fourpipebeam_retrofit.md

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
+# Example 19: End-to-End Retrofit — Four-Pipe Chilled Beams on Complex Model
+
+Real-world workflow: load an existing 44-zone model, set weather, run baseline, author a custom measure, apply it, re-simulate, and compare results.
+
+## Scenario
+
+A modeler has an existing SystemD building (44 zones, 2 DOAS air loops, CHW/HW/SWH plant loops) and wants to evaluate replacing the constant-volume no-reheat terminals with 4-pipe active chilled beams. They provide a natural-language request with no tool names or API hints.
+
+## Prompt
+
+> I have a model at /inputs/SystemD_baseline.osm. I want you to run the model with Boston-Logan weather file — files are in the same directory. After that, create a measure for me that changes the air terminals to 4-pipe chilled beams, apply that measure to the model, run the model, and compare the results for me. Save the measure in the same location as the model so I have a copy.
+
+## Tool Call Sequence
+
+### Setup and Baseline
+
+```
+1. list_skills() # discover available workflows
+2. get_skill("measure-authoring") # load measure authoring guide
+3. load_osm_model(osm_path="/inputs/SystemD_baseline.osm")
+4. get_building_info() # 44 zones, 54K ft2
+5. list_air_loops() # 2 DOAS loops (21-zone + 23-zone)
+6. change_building_location(weather_file="/inputs/USA_MA_Boston-Logan.Intl.AP.725090_TMY3.epw")
+7. save_osm_model(save_path="/runs/SystemD_baseline.osm")
+8. run_simulation(osm_path="/runs/SystemD_baseline.osm")
+9. get_run_status(run_id=<id>) # poll until complete
+10. extract_summary_metrics(run_id=<id>) # baseline EUI: 28.21 kBtu/ft2
+```
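
The "poll until complete" step can be sketched as a loop with a fixed wait between status checks. This is a hedged illustration only: `poll_until_done` is a hypothetical helper, `get_status` stands in for whatever client function issues the `get_run_status` call, and the `status` key and its terminal values are assumptions rather than the documented response shape. The 60-second default follows this example's own guidance to wait 1-2 minutes between calls.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    get_status: Callable[[str], Dict[str, Any]],
    run_id: str,
    interval_s: float = 60.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call get_status(run_id) until a terminal state is reported or time runs out."""
    terminal = {"success", "failed", "cancelled"}  # assumed status names
    waited = 0.0
    while True:
        result = get_status(run_id)
        if result.get("status") in terminal:
            return result
        if waited >= timeout_s:
            # Follow the ok-dict convention instead of raising.
            return {"ok": False, "error": f"timed out after {waited:.0f}s", "run_id": run_id}
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays, and the timeout bounds the number of status calls if a run never finishes.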
+
+### Author Measure
+
+```
+11. list_plant_loops() # identify CHW + HW loop names
+12. create_measure(
+      name="replace_terminals_with_four_pipe_beams",
+      description="Replace all air terminals with 4-pipe chilled beams",
+      language="Ruby",
+      run_body=<generated by LLM>)
+13. test_measure(measure_dir="/runs/custom_measures/replace_terminals_with_four_pipe_beams")
+      # 2 runs, 2 assertions, 0 failures — tested against loaded model
+```
+
+### Apply and Re-simulate
+
+```
+14. load_osm_model(osm_path="/inputs/SystemD_baseline.osm")
+15. change_building_location(weather_file=...)
+16. apply_measure(measure_dir="/runs/custom_measures/replace_terminals_with_four_pipe_beams")
+17. list_air_loops() # verify FourPipeBeam terminals
+18. save_osm_model(save_path="/runs/SystemD_fourpipebeam.osm")
+19. run_simulation(osm_path="/runs/SystemD_fourpipebeam.osm")
+20. get_run_status(run_id=<id>) # poll until complete
+```
+
+### Compare
+
+```
+21. compare_runs(baseline_run_id=<id1>, retrofit_run_id=<id2>)
+22. copy_file(file_path="/runs/custom_measures/replace_terminals_with_four_pipe_beams",
+      destination="/inputs")
+```
+
+## Expected Results
+
+| Metric | Baseline | Retrofit | Delta |
+|--------|----------|----------|-------|
+| EUI (kBtu/ft2) | 28.21 | 28.44 | +0.8% |
+| Unmet Hours | 58.5 | 34.5 | -41% |
+| Cooling ||| +5.6% |
+| Heating ||| +0.9% |
+| Pumps ||| -6.1% |
+
+34 zones had CV NoReheat terminals replaced with FourPipeBeam. 0 fatal/severe errors in both simulations. Results are deterministic across reruns.
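
The Delta column for the two rows that list both values can be reproduced with a plain percent-change calculation. This is a small worked check, assuming the deltas are computed relative to the baseline; `percent_change` is a hypothetical helper and not part of the `compare_runs` tool.

```python
def percent_change(baseline: float, retrofit: float) -> float:
    """Return the change from baseline to retrofit as a percentage of baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (retrofit - baseline) / baseline * 100.0


# Values taken from the Expected Results table.
eui_delta = percent_change(28.21, 28.44)
unmet_delta = percent_change(58.5, 34.5)

print(f"EUI: {eui_delta:+.1f}%")            # EUI: +0.8%
print(f"Unmet hours: {unmet_delta:+.0f}%")  # Unmet hours: -41%
```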
+
+## Key Differences from Example 2
+
+| Aspect | Example 2 | This Example |
+|--------|-----------|--------------|
+| Model | 5-zone baseline | 44-zone SystemD (real complexity) |
+| Prompt style | Semi-explicit (mentions tool patterns) | Natural language (no tool names) |
+| Weather | Pre-configured | Must be set (Boston-Logan EPW) |
+| Test model | Generic | Tested against loaded model with HVAC |
+| Comparison | Manual extraction | `compare_runs` tool |
+| Measure export | Not saved | Copied to /inputs for reuse |
+| Tool calls | ~16 | ~35 |
+
+## Common Pitfalls
+
+- **Weather file not found**: The model may reference a different EPW (e.g. Baltimore). `change_building_location` must be called before `run_simulation` and again after `load_osm_model` reload.
+- **Plant loop names**: The measure must find loops by name substring ("Chilled Water", "Hot Water") — exact names vary by model.
+- **Polling frequency**: `get_run_status` should wait 1-2 minutes between calls. Complex models take 60-140s to simulate.
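
The plant-loop pitfall above comes down to matching on a name fragment rather than an exact name. The sketch below shows that lookup in Python for illustration only; the measure in this example is written in Ruby, `find_loop_name` is a hypothetical helper, and the loop names are invented sample data.

```python
from typing import Iterable, Optional


def find_loop_name(loop_names: Iterable[str], fragment: str) -> Optional[str]:
    """Return the first loop name containing fragment (case-insensitive), else None."""
    needle = fragment.lower()
    for name in loop_names:
        if needle in name.lower():
            return name
    return None


# Hypothetical loop names, for illustration only.
loops = ["Chilled Water Loop", "Hot Water Loop", "Service Hot Water Loop"]
chw = find_loop_name(loops, "Chilled Water")
hw = find_loop_name(loops, "Hot Water")
```

One caveat of this approach: the fragment "Hot Water" also matches a service hot water loop, so which loop is returned depends on the order of the list. A measure for a model with a SWH loop may need an extra exclusion check.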
+
+## Integration Test
+
+See `tests/llm/test_04_workflows.py::test_workflow[systemd_fourpipebeam_e2e]`

docs/plans/cooled-beam-zone-priority.md

Lines changed: 0 additions & 101 deletions
This file was deleted.
