This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
gdxpds translates between GDX (GAMS Data eXchange) files and pandas DataFrames. GDX is the binary file format used by GAMS, a mathematical optimization modeling system. Two entry points:
- High-level functions: to_dataframes(), to_dataframe(), list_symbols(), get_data_types(), to_gdx(), all exposed at package top level.
- Object-oriented API: GdxFile and GdxSymbol in src/gdxpds/gdx.py for programmatic, lazy access.
This package cannot function without a GAMS installation — there is no mock layer. The SWIG-bound GDX bindings talk to the GAMS shared library found at runtime, and are imported lazily (on the first GDX operation), so import gdxpds itself does not need a binding. Two equivalent binding sources are supported via try/except imports (inside the engine modules and the lazy-load helpers, not at package import):
- Modern (recommended): from gams.core import gdx as gdxcc, shipped inside gamsapi, which the user installs version-matched to their GAMS install (pip install gamsapi[transfer]==xx.y.z). Not a base dependency of gdxpds.
- Legacy: import gdxcc, the standalone PyPI package. Available via the [legacy] extra (pip install gdxpds[legacy]). Older, but the SWIG C ABI is stable enough that it still works.
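The two-source fallback described above might look roughly like the sketch below. The helper name load_gdx_binding and its return labels are hypothetical, not the names used in gdxpds; only the two module paths come from the notes above.

```python
import importlib


def load_gdx_binding():
    """Hypothetical helper: return (module, source_label), or (None, None).

    Tries the modern gamsapi-provided binding first, then the legacy
    standalone gdxcc package, mirroring the order described above.
    """
    candidates = (
        ("gamsapi", "gams.core.gdx"),  # modern: shipped inside gamsapi
        ("gdxcc", "gdxcc"),            # legacy: standalone PyPI package
    )
    for label, module_name in candidates:
        try:
            return importlib.import_module(module_name), label
        except ImportError:
            continue  # this source is not installed; try the next one
    return None, None


binding, source = load_gdx_binding()
print("binding source:", source)
```

Returning a label alongside the module lets a diagnostic report say which source was bound, or that none was found, without raising at import time.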
Other runtime notes:
- GAMS lookup order is implemented by GamsDirFinder in src/gdxpds/tools.py: GAMS_DIR env var → GAMSDIR env var → where gams / which gams → walk default install location (C:\GAMS on Windows; picks highest version). The Windows walk handles both the modern C:\GAMS\<version>\ layout and the legacy C:\GAMS\win64\<version>\ layout by looking for gams.exe to identify a GAMS root.
- GAMS_DIR remains mandatory at runtime even with pip-installed bindings, because the GDX shared library lives in the GAMS install directory, not in the wheel. The recommended pattern is one venv per GAMS install with $Env:GAMS_DIR pinned via Activate.ps1 (see dev/README.md).
- import gdxpds works with no binding installed. The gdxcc.GMS_* type codes are hardcoded in the GamsDataType / GamsVariableType / GamsEquationType / GamsValueType enums (with tests/test_imports.py::test_gms_constants_match_gdxcc verifying them against the live binding when present), and the bindings load on the first GDX op. So gdxpds info / gdxpds test can diagnose the "no bindings installed" environment.
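As a rough illustration of the lookup order in the first note, the sketch below checks the two environment variables and then the PATH. The function find_gams_dir and its source labels are hypothetical stand-ins for GamsDirFinder, and the final step (walking the default install location) is omitted for brevity.

```python
import os
import shutil
from pathlib import Path


def find_gams_dir(environ=None, which=shutil.which):
    """Hypothetical helper: return (directory, source) or (None, None).

    environ and which are injectable so the order can be exercised
    without touching the real environment.
    """
    environ = os.environ if environ is None else environ

    # 1 and 2: explicit environment variables, GAMS_DIR first.
    for var in ("GAMS_DIR", "GAMSDIR"):
        value = environ.get(var)
        if value:
            return value, f"{var} env var"

    # 3: the gams executable on PATH (where gams / which gams).
    exe = which("gams")
    if exe:
        return str(Path(exe).parent), "gams on PATH"

    # 4: the default-install walk would go here (not sketched).
    return None, None


print(find_gams_dir())
```

Reporting the source next to the directory matches the kind of "GAMS_DIR + source" line that the environment report is described as printing.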
If tests fail with "cannot load gdxcc" or "no _gdxcc module," it's a GAMS environment problem (missing GAMS_DIR, missing bindings, or version skew between gamsapi and the GAMS install), not a code bug.
PowerShell on Windows. Always activate the venv first:

    .venv\Scripts\Activate.ps1

Install for development:

    pip install -e .[test]

Run the full test suite:

    pytest tests

Run a single test file or test:

    pytest tests/test_read.py
    pytest tests/test_read.py::test_name

Keep test output files after a run (useful when debugging round-trip failures):

    pytest tests --no-clean-up

The custom --no-clean-up flag and the shared test fixtures (base_dir, run_dir, manage_rundir, roundtrip_one_gdx) are defined in tests/conftest.py.
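For orientation, a custom pytest flag of this kind is usually registered through the pytest_addoption hook. The sketch below is a guess at the general shape, not a copy of tests/conftest.py: the help text and the clean_up_run_dir helper are hypothetical, while pytest_addoption and parser.addoption are standard pytest API.

```python
import shutil
from pathlib import Path


def pytest_addoption(parser):
    # pytest calls this hook in conftest.py to register extra CLI flags.
    parser.addoption(
        "--no-clean-up",
        action="store_true",
        default=False,
        help="keep test output files after the run",
    )


def clean_up_run_dir(run_dir, keep_output):
    """Hypothetical teardown helper: remove run_dir unless the flag was set.

    Returns True when the directory was deleted, False otherwise.
    """
    run_dir = Path(run_dir)
    if keep_output or not run_dir.exists():
        return False
    shutil.rmtree(run_dir)
    return True
```

A fixture would typically read the flag with request.config.getoption("--no-clean-up") and pass the result to the teardown step.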
The installed gdxpds CLI exposes three subcommands:

    gdxpds --version   # terse version line
    gdxpds info        # environment report (Python, bindings, GAMS_DIR + source, load status)
    gdxpds test        # end-to-end install verification against the local GAMS

gdxpds info is also the Python function gdxpds.info(): it returns the report as a str and is contracted to never raise.
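One common way to honor a "never raise" contract for a diagnostic report is to wrap each probe separately, so a failure becomes a line in the report rather than an exception. The sketch below illustrates that pattern only; safe_probe, build_report, and the probe labels are hypothetical and are not taken from the gdxpds source.

```python
import platform


def safe_probe(label, probe):
    """Run one probe and always return a printable line."""
    try:
        return f"{label}: {probe()}"
    except Exception as exc:  # deliberately broad: diagnostics must not raise
        return f"{label}: unavailable ({type(exc).__name__}: {exc})"


def build_report(probes):
    """Join the probe lines into a single str report."""
    return "\n".join(safe_probe(label, probe) for label, probe in probes)


def missing_binding():
    # Stand-in for a probe that fails, e.g. when no bindings are installed.
    raise ImportError("no bindings installed")


report = build_report([
    ("Python", platform.python_version),
    ("bindings", missing_binding),
])
print(report)
```

Because each probe is isolated, one broken part of the environment does not hide the information gathered by the others.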
Verify a fresh install end-to-end (intended for end users; ships with the base package, no [test] extra needed):

    gdxpds test

Source lives in src/gdxpds/cli/main.py; the embedded sample GDX is at src/gdxpds/_verify_install/sample.gdx, regenerable via dev/build_verify_install_sample.py.
Build the docs (Sphinx, MyST-flavored markdown sources):

    pip install -e .[docs]   # or .[dev] for tests + docs
    cd doc
    .\make.bat html

Output is in doc/build/html/. Hand-authored docs are .md (parsed by MyST). The API page is generated automatically by sphinx.ext.autosummary with :recursive: (see doc/source/api.md and the templates in doc/source/_templates/autosummary/). The full release and docs publish workflow (GitHub Actions on Release-published events) is in dev/README.md.
Things that aren't obvious from one file:
- Lazy loading. GdxFile (a MutableSequence of GdxSymbol) defaults to lazy_load=True. Symbol data is only pulled from the GDX file when .dataframe is accessed. Iterating symbol metadata is cheap; touching dataframes is not.
- Symbol kinds drive column shape. GamsDataType (src/gdxpds/gdx.py): Set, Parameter, Variable, Equation, Alias. Variables and Equations get five value columns (Level, Marginal, Lower, Upper, Scale); Parameters and Sets get a single Value column. Write code in src/gdxpds/write_gdx.py infers the type from DataFrame shape and naming.
- Special values. GAMS encodes NA/EPS/+Inf/-Inf/UNDEF as fixed magic floats (e.g. 1E300, 2E300, 3E300). src/gdxpds/special.py converts these to/from numpy equivalents (np.nan, np.inf, and None for UNDEF) on read/write. Parameters get this conversion; Sets/Aliases do not (their value column is text, see below). UNDEF (None) is preserved on write by both engines; the gams.transfer write passes eps_to_zero=False so EPS survives too.
- Set value = element text; membership = row presence. A Set/Alias has one Value column holding the GAMS element text (a string; "" = a member with no text). Every row is a member; there is no boolean. _fixup_set_value (src/gdxpds/gdx.py) normalizes the column to text on assignment (a bool / c_bool / missing value → ""), so a Set can be built from dims alone, from booleans, or from text. The read path always fetches text (gdxGetElemText() on gdxcc; the records frame on gams.transfer); there is no load_set_text flag.
- Aliases. An Alias reads like the Set it aliases and records the parent in GdxSymbol.alias_of (a parent ref, or None). It carries no records of its own: alias.dataframe returns the parent's dataframe directly (a live view; mutating the parent shows through the alias), and direct assignment to alias.dataframe raises (it's not a mutable slot). Read paths just flip _loaded; write paths emit only the alias header (gdxAddAlias / gt.Alias). The parent must precede it (no relaxed fallback; DomainError otherwise). to_gdx(aliases={alias: parent}) and gdxpds.gdx.append_alias() build them; ordering follows reorder_for_strict_domains(), which now adds the alias→parent edge. The parent is typically a Set, but an Alias is accepted too: GDX supports chained aliases, and the gdxcc engine preserves the chain on disk (aat -> at -> t) while gams.transfer flattens it to the root (aat -> t). On read both engines produce a same-file alias_of ref. Universe aliases (alias of *) are a documented edge: they read without error (alias_of resolves to universal_set) and round-trip within one engine, but the engines disagree on membership (gdxcc includes the * element, gams.transfer doesn't), so universe_alias_fixture.gdx is excluded from the cross-engine parity glob and covered by tests/test_alias.py.
- Lazy + idempotent GAMS bind. load_gdxcc() in src/gdxpds/tools.py binds the GAMS library and populates gdxpds.special dicts on the first GDX op (called by the gdxcc engine's __init__ before it creates a handle, and by info() inside try/except for diagnostics). Process state: tools._bindings_source, tools._loaded_gams_dir. gams_dir= on the first GDX op selects the bound install. Once loaded, subsequent gdxCreateD(H, <dir>, ...) calls are no-ops against the bound library regardless of <dir>; load_gdxcc() warns when a caller passes a gams_dir that differs from _loaded_gams_dir. One GAMS library per process: multi-version testing is one-venv-per-GAMS. In-process swap is feasible via gdxLibraryUnload() but unimplemented (design notes tracked in a GitHub issue).
- GDX handle lifecycle (SWIG-bound gdxcc; gdxcc engine only, since the gams.transfer engine holds no handle). The full new_gdxHandle_tp → gdxCreateD → gdxFree → delete_gdxHandle_tp sequence lives in one place: the _GdxHandle RAII class in src/gdxpds/tools.py, used by all three create sites (load_gdxcc and load_specials via with; GdxccEngine.__init__ keeps the instance). It encodes two SWIG hazards so callers don't have to. (1) gdxFree(H) is unsafe on a failed-create handle: it dispatches through XFree, bound only on a successful gdxCreateD, so freeing after failure segfaults. _GdxHandle frees+deletes on success but on failure deletes only (the wrapper is a plain calloc/free, always safe) and never calls gdxFree; the create is validated by _check_gdx_create_rc (raises GamsLoadError). (2) gdxFree is also unsafe to call twice (double XFree + objectCount underflow), so _GdxHandle.close() is run-once/idempotent and every new_gdxHandle_tp() is paired with exactly one delete_gdxHandle_tp(). The gdxcc engine owns its handle for its lifetime; GdxFile schedules weakref.finalize(self, self._engine_impl.close), fired at the first of cleanup() / __exit__, garbage collection (it sits in a cycle via universal_set, so cyclic GC reclaims it), or interpreter exit. No class frees from __del__ (which would run at teardown after module state is partially gone); the engine's close() binds its gdxcc callables at construction so it stays valid at shutdown. The legacy to_dataframes / to_gdx Translators call GdxFile.cleanup() (not the removed __del__). Regression coverage: tests/test_handle_lifecycle.py.
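The handle-lifecycle rules in the last note (free only after a successful create, close at most once, finalize via weakref rather than __del__) can be illustrated without GAMS. The sketch below uses a fake API object in place of the SWIG functions; FakeGdxApi, HandleGuard, and Owner are hypothetical names, and the real _GdxHandle may differ in detail.

```python
import weakref


class FakeGdxApi:
    """Stand-in for the SWIG calls so the sketch runs without GAMS."""

    def __init__(self, create_ok=True):
        self.create_ok = create_ok
        self.calls = []

    def new_handle(self):
        self.calls.append("new")
        return object()

    def create(self, handle):
        self.calls.append("create")
        return self.create_ok

    def free(self, handle):
        self.calls.append("free")

    def delete(self, handle):
        self.calls.append("delete")


class HandleGuard:
    """RAII-style wrapper: free only if create succeeded, close only once."""

    def __init__(self, api):
        self._api = api
        self._handle = api.new_handle()
        self._created = False
        self._closed = False
        if not api.create(self._handle):
            self.close()  # failed create: delete the wrapper, never free
            raise RuntimeError("GDX create failed")
        self._created = True

    def close(self):
        if self._closed:
            return  # run-once: a second call is a no-op
        self._closed = True
        if self._created:
            self._api.free(self._handle)
        self._api.delete(self._handle)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


class Owner:
    """Long-lived owner that schedules cleanup without defining __del__."""

    def __init__(self, api):
        self._guard = HandleGuard(api)
        # The callback is bound to the guard, not to self, so the finalizer
        # does not keep the owner alive.
        self._finalizer = weakref.finalize(self, self._guard.close)

    def cleanup(self):
        self._finalizer()  # runs the callback at most once
```

Used as a context manager, the guard closes on exit; held by an owner, it is closed by whichever comes first: an explicit cleanup(), garbage collection, or interpreter exit.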
- Code style & typing: ruff (lint + format; [tool.ruff] in pyproject.toml) and pyright (basic mode; [tool.pyright]). Run ruff check --fix, ruff format, and pyright before pushing, or install the local hooks with pre-commit install (.pre-commit-config.yaml). Conventions: only the public API is annotated (the SWIG-bound internals stay untyped, so pyright's None-safety categories are downgraded to warnings; see [tool.pyright]); E501 is delegated to the formatter; the ruff ruleset is intentionally conservative (no bugbear/docstring rules yet). The one-time bulk reformat is recorded in .git-blame-ignore-revs (git config blame.ignoreRevsFile .git-blame-ignore-revs).
- CI (.github/workflows/): lint.yml runs ruff + pyright on PRs and main (GAMS-free; the only automated code check). The rest are docs/release: build + deploy Sphinx (PR check + main → /latest/ + per-tag /vX.Y.Z/) and publish to PyPI on Release. Tests still run locally before release, per dev/README.md; there is no test CI.
- Test fixtures include real .gdx and .csv files in tests/. These live outside the package and are not shipped in the wheel, so pytest tests runs against a clone of the repo. Don't delete them.
- The two CLI scripts live in src/gdxpds/cli/ and are installed via [project.scripts] in pyproject.toml as csv_to_gdx and gdx_to_csv. Tests invoke these names directly via subprocess (e.g. subprocess.run(["csv_to_gdx", ...], check=True)), which exercises the installed entry points and keeps each round-trip in its own process.
- Laboratory rename: this project moved from NREL to NLR in late 2025. New docs/copyright should say NLR / Alliance for Energy Innovation; historical attributions stay as NREL. The repo's GitHub org is NatLabRockies.