always be brutally honest
MCP server giving AI agents full control of building energy modeling — create buildings, author measures, configure HVAC, run EnergyPlus sims, extract results — all through 142 MCP tools backed by the OpenStudio SDK.
Always use openstudio-mcp tools for BEM tasks:
- Never generate raw IDF files
- OSM files are created/modified only through MCP tools (`create_typical_building`, `create_new_building`, etc.)
- Never write Python/Ruby/other scripts to parse SQL results, create visualizations, build HVAC wiring, or extract data — equivalent MCP tools already exist (`extract_*`, `query_timeseries`, `view_model`, `view_simulation_data`, `add_baseline_system`, etc.)
- If a task genuinely cannot be done with existing tools, ASK THE USER before writing any code or scripts
- For workflow guidance, run `list_skills()` or `get_skill("new-building")`
- Keep files under ~250 lines — don't split artificially just to hit a number
- Every MCP tool must have an integration test. New behavior, bug fixes, and security hardening need tests too — not just the happy path
- Integration tests must be added to `.github/workflows/ci.yml` — append to the lightest shard's `FILES=` list (5 shards, keep them balanced at ~200 s each)
- Follow the testing rules in `.claude/rules/testing.md`. Critical: every test needs a `# Regression:` or `# Validates:` comment; never delete failing tests or weaken assertions; assert exact values, not mere existence; integration tests mock nothing; unit tests never import `openstudio`
- Operations return `{"ok": True/False, ...}` — never raise through MCP
- Use the `openstudio` Python bindings directly
- All OpenStudio attribute access must handle `is_initialized()` checks
- `_extract_*` functions return dicts with `snake_case` keys matching OpenStudio attribute names
- Tool functions keep the `_tool` suffix internally; MCP-visible names strip it via `@mcp.tool(name="...")`
- Never commit generated/temp files — `.gitignore` covers `__pycache__/`, `*.pyc`, `runs/`, `.claude/`, `.pytest_cache/`. Test artifacts go to `runs/`; only permanent reference models go in `tests/assets/`
- Bundled measures get wrapper tools with typed args — don't expose raw `apply_measure` as the primary interface
- No `getattr()` or string-based dispatch — every OpenStudio API method is called directly (grepable, lintable, visible in stack traces)
- MCP clients may send `list[str]` arguments as JSON strings — use the `list[str] | str` type annotation plus `parse_str_list()` from `osm_helpers.py`
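The list-coercion convention above could look roughly like this (a sketch only — the real `parse_str_list()` in `osm_helpers.py` may behave differently):

```python
import json

def parse_str_list(value: "list[str] | str") -> list[str]:
    """Coerce an MCP client argument to a list of strings.

    Some MCP clients serialize list[str] arguments as a JSON string,
    so accept both forms. (Sketch of the helper named in the rules;
    the actual osm_helpers implementation may differ.)
    """
    if isinstance(value, list):
        return [str(item) for item in value]
    try:
        parsed = json.loads(value)
    except (TypeError, json.JSONDecodeError):
        return [value]  # plain non-JSON string -> single-element list
    if isinstance(parsed, list):
        return [str(item) for item in parsed]
    return [str(parsed)]
```

Paired with the `list[str] | str` annotation, a tool then accepts both `["a", "b"]` and `'["a", "b"]'` without the client knowing the difference.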
- Each skill lives in `mcp_server/skills/<name>/`:
  - `tools.py` — exports `register(mcp)`; MCP tool definitions only
  - `operations.py` — business logic; returns plain dicts, no MCP awareness
  - `SKILL.md` — skill definition for LLM context
- Key modules: `model_manager.py` (load/get/save/clear model), `osm_helpers.py` (`fetch_object`, `optional_name`, `list_all_as_dicts`), `skills/__init__.py` (auto-discovers all skills)
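A minimal sketch of the skill layout above. The model data, operation name, and registration shape are illustrative stand-ins, not the real openstudio-mcp code:

```python
# operations.py — business logic only: plain dicts in and out, no MCP imports.
# _MODEL is an illustrative stand-in; real operations go through model_manager.
_MODEL = {"spaces": {"Space 1": {"area_m2": 25.0}}}

def rename_space_op(old: str, new: str) -> dict:
    """Failures come back as {"ok": False, ...} — never raised through MCP."""
    if old not in _MODEL["spaces"]:
        return {"ok": False, "error": f"space '{old}' not found"}
    _MODEL["spaces"][new] = _MODEL["spaces"].pop(old)
    return {"ok": True, "renamed_from": old, "renamed_to": new}

# tools.py — MCP wiring only; the internal _tool suffix is stripped from the
# MCP-visible name via @mcp.tool(name=...).
def register(mcp):
    @mcp.tool(name="rename_space")
    def rename_space_tool(old: str, new: str) -> dict:
        return rename_space_op(old, new)
```

The split keeps `operations.py` unit-testable without any MCP machinery, while `tools.py` stays thin enough that `skills/__init__.py` can discover and register every skill the same way.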
Two real classes of stdout pollution corrupt MCP JSON-RPC — a two-layer defense runs at startup in `server.py::main()` before `mcp.run()`.
- Class A — SWIG memleak warnings (at interpreter shutdown): `swig/python detected a memory leak of type 'boost::optional< ... > *'`. The PyPI `openstudio==3.11.0` wheel was built WITHOUT `SWIG_PYTHON_SILENT_MEMLEAK`. Upstream SWIG #2638 / OpenStudio #5421; fix #5422 was applied to the .deb only, not the wheel (filed as NatLabRockies/OpenStudio#5608).
- Class B — OpenStudio Logger Polyhedron/Space warnings (during operations): `[utilities.Polyhedron]` / `[openstudio.model.Space]` warnings reach stdout from `Space::volume()` / `floorArea()` on imperfect geometry; the default `standardOutLogger` sink runs at Warn level → C stdout.
- `stdout_suppression.py::silence_openstudio_stdout_logger()` — primary fix for Class B. Calls `openstudio.Logger.instance().standardOutLogger().setLogLevel(openstudio.Fatal)`; uses the intended Logger API, no fd manipulation.
- `stdout_suppression.py::redirect_c_stdout_to_stderr()` — backstop for Class A plus unknowns. Permanently dups fd 1 → stderr; Python's `sys.stdout` gets a private fd to the real MCP client pipe.
- cbe6399-style claims that FourPipeBeam / `add_baseline_system` emit stdout do NOT reproduce — per-call wrappers are no-ops now.
- `suppress_openstudio_warnings()` is retained as a no-op for import compatibility.
- No action needed for new skills.
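The fd-level backstop could be sketched as follows — an illustrative sketch assuming POSIX-style fd semantics; the real `stdout_suppression.py` may differ in detail:

```python
import os
import sys

def redirect_c_stdout_to_stderr() -> int:
    """Backstop against C-level stdout pollution (sketch of the approach
    described above, not the actual implementation).

    After this call, native code writing to fd 1 lands on stderr, while
    Python's sys.stdout keeps a private duplicate of the original stdout,
    so JSON-RPC responses still reach the MCP client.
    """
    real_stdout_fd = os.dup(1)   # private handle to the original stdout pipe
    os.dup2(2, 1)                # fd 1 now aliases stderr: C writes are diverted
    sys.stdout = os.fdopen(real_stdout_fd, "w", buffering=1)
    return real_stdout_fd
```

Because the redirection happens at the file-descriptor level rather than by swapping `sys.stdout`, it also catches writes from C/C++ code (SWIG, the OpenStudio runtime) that never touch the Python I/O layer.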
Build:

```bash
docker build -f docker/Dockerfile -t openstudio-mcp:dev .
```

Run all tests (single container, fastest, matches CI):

```bash
docker run --rm \
  -v "C:/projects/openstudio-mcp:/repo" \
  -v "C:/projects/openstudio-mcp/runs:/runs" \
  -e RUN_OPENSTUDIO_INTEGRATION=1 \
  -e MCP_SERVER_CMD=openstudio-mcp \
  openstudio-mcp:dev bash -lc "cd /repo && pytest -vv tests/test_*.py"
```

Run a specific test file:

```bash
docker run --rm \
  -v "C:/projects/openstudio-mcp:/repo" \
  -v "C:/projects/openstudio-mcp/runs:/runs" \
  -e RUN_OPENSTUDIO_INTEGRATION=1 \
  -e MCP_SERVER_CMD=openstudio-mcp \
  openstudio-mcp:dev bash -lc "cd /repo && pytest -vv tests/test_load_save_model.py"
```

- Targeted: `LLM_TESTS_ENABLED=1 pytest tests/llm/test_06_progressive.py -k "thermostat_L1" -v`
- Full suite only for final validation
- Markers: `-m smoke` (12), `-m generic` (10), `-m progressive` (102)
- Benchmark results go in `docs/testing/llm-test-benchmark.md`
- Lint: `ruff check mcp_server/`
- Unit tests (no Docker): `pytest tests/test_skill_registration.py -v`
- Integration tests require Docker and OpenStudio
- Use `C:/`-style Windows paths for Docker volume mounts (MSYS `/c/` paths don't resolve dotfile dirs)
- Tests create temporary models in `runs/` (mounted as `/runs` in the container)
- After builds, prune dangling images: `docker image prune -f`