v3.0.0: default to gams.transfer, element-text Sets, full alias support (breaking)#112
Conversation
…ses (breaking)
Coordinated breaking release completing the modernization arc:
- Default I/O engine flips to gams.transfer when usable, falling back to gdxcc;
pin with backend="gdxcc" / GDXPDS_BACKEND.
- Set/Alias value column is always GAMS element text ("" = member, no text);
membership is row presence. load_set_text is removed (text always read/written).
- Full alias support: GdxSymbol.aliased_with, round-trip on both backends,
to_gdx(aliases=) and gdxpds.gdx.append_alias(); no relaxed fallback (DomainError).
- GDX UNDEF (None) preserved on write by both backends (was collapsing to 0.0);
gams.transfer write now passes eps_to_zero=False so EPS round-trips too.
- GdxFile.H removed; raw handle via gdx_file._backend_impl.handle.
- import gdxpds no longer requires a GAMS binding (GMS_* constants hardcoded,
bindings deferred to first GDX op); enables no-bindings diagnostics.
- New TransferError for gams.transfer I/O failures (Error subclass, like GdxError).
Docs/tests updated; dev/ROADMAP.md removed (arc complete, see CHANGES.txt).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…w findings - README, both CLI --backend help strings, and the dev fixture docstrings no longer claim gdxcc is the default or that aliases/set-text can't be written. - overview.md: document that gams.transfer doesn't expose num_records / version / producer before records are loaded (gdxcc does). - Drop the now no-op _normalize hooks in the backend tests; compare DataFrames directly (Set/Alias values are strings, others floats). - gdxcc write_symbol: aliases return early, so the Set-text/fixup branches are Set-only (was a misleading (Set, Alias) check). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- gams.transfer read no longer crashes on a universe alias (alias of '*'): its records have no element_text column, so members get empty text. Both engines now read it as an Alias with aliased_with resolved to universal_set, and it round-trips within a single engine. - The engines legitimately disagree on a universe alias's *membership* (gdxcc includes the '*' element, gams.transfer doesn't), so the new universe_alias_fixture.gdx is excluded from the cross-backend parity glob and covered by a dedicated test instead. - docs: expand the migration section to cover the 2.0.0 breaking changes too (most upgraders come from 1.5.x); add notes on the universe-alias divergence and on mixed strict+wildcard domains (gdxcc writes regular, gams.transfer relaxes). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Document how GAMS NA/UNDEF/Inf/EPS map to numpy/None and that both engines preserve them on write, with a tip showing how to drop EPS explicitly. Placed under Backend Classes alongside "Set membership and element text" (value representation), not in the engine/config section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR delivers the coordinated v3.0.0 breaking release of gdxpds, completing the backend modernization: defaulting I/O to gams.transfer when usable, standardizing Set/Alias value semantics to element text, adding full alias support, preserving UNDEF on write, and making import gdxpds succeed without GAMS bindings (bindings load on first GDX operation).
Changes:
- Flip default backend selection to prefer
gams.transferwith a transparent fallback togdxcc, and introduceTransferErrorfor transfer I/O failures. - Redefine Set/Alias “Value” to be element text strings (membership by row presence), remove
load_set_text, and preserve GDX UNDEF (None) on write across both engines. - Add full Alias support end-to-end (
aliased_with,append_alias,to_gdx(aliases=...)), plus expanded parity/edge-case tests and documentation updates.
Reviewed changes
Copilot reviewed 28 out of 29 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_write.py | Updates Set write/read assertions to element-text string semantics; removes c_bool handling. |
| tests/test_specials.py | Pins gdxcc for raw-handle special-value fixture writing; updates for GdxFile.H removal. |
| tests/test_read.py | Removes load_set_text coverage; updates Set/Alias expectations to element text and broadens error type assertion. |
| tests/test_imports.py | New tests verifying import gdxpds doesn’t pull bindings and enum constants match live gdxcc when available. |
| tests/test_handle_lifecycle.py | Pins gdxcc where handle lifecycle is asserted; adds explicit test ensuring GdxFile.H is removed. |
| tests/test_domain.py | Pins gdxcc to lock in mixed strict+wildcard domain behavior for that backend. |
| tests/test_backend_selection.py | Updates backend default expectations; adds TransferError/DomainError assertions and transfer bad-read test. |
| tests/test_backend_parity.py | Expands parity to include aliases, set text, and UNDEF preservation; excludes universe-alias fixture from strict parity. |
| tests/test_alias.py | New tests for aliased_with, append_alias, to_gdx(aliases=...), and universe-alias round-trips/edge cases. |
| tests/conftest.py | Pins gdxcc in roundtrip fixture where pre-load metadata is required. |
| src/gdxpds/write_gdx.py | Adds aliases= support to to_gdx via late materialization of alias symbols. |
| src/gdxpds/special.py | Moves gdxcc import into load_specials() to enable binding-free package import. |
| src/gdxpds/read_gdx.py | Removes load_set_text parameter plumbing; documents Set/Alias Value semantics. |
| src/gdxpds/gdx.py | Major model/API updates: hardcoded GMS constants for binding-free import, Set/Alias value=text, alias support (aliased_with), GdxFile.H removal. |
| src/gdxpds/cli/gdx_to_csv.py | Updates CLI help text to reflect new backend default behavior. |
| src/gdxpds/cli/csv_to_gdx.py | Updates CLI help text to reflect new backend default behavior. |
| src/gdxpds/_transfer_backend.py | Implements alias read/write + element text support; preserves UNDEF and prevents EPS-to-zero on write; raises TransferError. |
| src/gdxpds/_gdxcc_backend.py | Adds alias metadata handling and alias write; writes Set element text and preserves UNDEF. |
| src/gdxpds/_backend.py | Sets default backend preference to transfer with fallback resolution; simplifies load APIs after load_set_text removal. |
| src/gdxpds/init.py | Bumps version to 3.0.0 and exports new public exceptions. |
| README.md | Updates backend-default documentation/snippet. |
| doc/source/overview.md | Adds/updates docs for aliases, set membership & element text, special values, and migration guidance. |
| dev/ROADMAP.md | Removes roadmap (now superseded by CHANGES). |
| dev/build_universe_alias_fixture.py | New script to generate universe-alias fixture via raw gdxcc. |
| dev/build_set_text_fixture.py | Updates fixture builder to use backend handle escape hatch (no GdxFile.H). |
| dev/build_alias_fixture.py | Updates alias fixture builder to use backend handle escape hatch (no GdxFile.H). |
| CLAUDE.md | Updates contributor guidance for lazy binding import, alias semantics, and new defaults. |
| CHANGES.txt | Adds v3.0.0 release notes documenting breaking changes and new features. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
UNDEF (None) was collapsing back to NaN under pandas 2.x because the
default `replace` silently downcasts None into a float column. Wrap the
value-column replace in `pd.option_context('future.no_silent_downcasting',
True)` so object dtype is preserved when None remains and untouched
columns stay float. Restores test_read_parity_undef / test_write_parity_undef
on GAMS 49/51 (pandas 2.2.x/2.3.x); GAMS 34 (pandas 3.x) was already OK.
PR review follow-ups:
- `GdxSymbol.aliased_with` setter now fails fast: rejects assignment on a
non-Alias symbol and a parent whose data_type is not Set or Alias.
- `append_alias` validates that `alias_name` is a str; `Translator.__add_aliases`
validates the alias-name key type too.
- `GdxSymbol.clone()` no longer carries the live `aliased_with` ref into the
clone (which would still point at the original file's parent); only the
parent name is preserved, and the destination file resolves it at write time.
- Document that chained aliases (alias-of-alias) are accepted: gdxcc preserves
the chain on disk, gams_transfer flattens it to the root Set; both read back
to a same-file ref. CLAUDE.md and overview.md updated; tests cover both backends.
`run_test_matrix.sh` now also asserts `gdxpds info` succeeds in every venv,
including `.venv-no-gams` (since v3.0.0 made the import binding-free).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BREAKING (within v3.0.0 — engine selection is new this week): rename the
'backend' concept to 'engine' everywhere — the CLI flag `--backend` is now
`--engine`, the env var `GDXPDS_BACKEND` is `GDXPDS_ENGINE`, the `backend=`
kwarg on every read/write entry point is `engine=`, the `Backend` enum is
`Engine`, `BackendError` is `EngineError`, and the implementation files
`_backend.py` / `_gdxcc_backend.py` / `_transfer_backend.py` are
`_engine.py` / `_gdxcc_engine.py` / `_transfer_engine.py` (tests likewise).
Read-side API: new `gdxpds.get_aliases(gdx_file)` returns the alias
relationships as `{alias_name: parent_name}`, parallel to
`get_subset_relationships()` and shape-compatible with the `aliases=`
argument of `to_gdx()`. Tests added.
Code cleanups (from PR review):
- `resolve_engine()` calls `_probe_gams_transfer` exactly once, after
the engine is resolved.
- `_gdxcc_engine` Set/Alias read comment now distinguishes gdxpds's
string-valued surface from gdxcc's on-disk index-into-element-text
storage (which `gdxGetElemText` resolves before assignment).
- `load_specials` docstring spells out why the function uses the gdxcc
bindings regardless of which engine the caller selected.
Strengthened alias-chain coverage: `test_alias_of_alias_roundtrip_both_engines`
now asserts the *specific* per-engine `aliased_with_name` (gdxcc: 'at';
gams_transfer: 't'), and a new `test_alias_chain_disk_shape_differs_between_engines`
reads the raw on-disk GDX through gdxcc to prove the difference at the byte
level.
overview.md overhaul (from PR review):
- Direct Conversion preamble updated for v3.0.0: Set values are element
text strings (with `""`-empty membership-by-row-presence), not booleans;
Aliases are mentioned and round-tripped via `aliases=` / `get_aliases()`.
- "Backend Classes" section renamed to "Object-Oriented API" (the old
name now collides with the engine concept) and reorganized into
hierarchical sub-sections: `### Set details` (membership/text, subset
relationships, aliases) and `### Parameter, Variable, and Equation
details` (variable/equation types with enum links, special values).
- Extended the OO example to cover element text, subsets, aliases, and
GDX UNDEF in one walk-through.
- Migration section rewritten: explicit `### v2.0.0` and `### v3.0.0`
headings, v2.0.0 listed first.
- Configuration: OS-tabbed examples (POSIX / Windows CMD / PowerShell)
via the new `sphinx-design` dep; cross-link to Install -> Preliminaries;
engine-paragraph rewrite with the parity caveat made explicit; the
`version`/`producer` and per-symbol `num_records` differences split
into two clearly-labeled bullets.
`myst_heading_anchors` bumped from 3 to 4 so the new h4 anchors resolve.
Test matrix on this branch: PASS / PASS / PASS / OK across
.venv-gams-{34,49,51} and .venv-no-gams. Ruff and pyright clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ispatch
BREAKING (within v3.0.0 — alias support is new in this PR): rename the alias
relationship attribute on GdxSymbol from `aliased_with` to `alias_of`. The
old name was a gdxpds invention; gams.transfer uses `alias_with` (active
voice; not `aliased_with`) and gdxcc has no Python-attribute analog. `alias_of`
reads naturally ("at is `alias_of` t"), matches the active voice of the
gams.transfer naming, and isn't confused with the unrelated "aliased with"
phrasing that suggested help-from a parent. Renames propagated:
`GdxSymbol.alias_of` / `alias_of_name`, `resolve_alias_of()`, the
`alias_of=` constructor kwarg, and the corresponding private attrs.
Also adds a comment in `_gdxcc_engine.GdxccEngine.write_symbol` cross-
referencing the gams.transfer engine's `gt.Alias` / `gt.UniverseAlias`
dispatch: `gdxAddAlias` accepts `"*"` as a parent without any special-case
call, so the gdxcc engine has a single uniform write path for both
named-Set aliases and universe aliases.
Test matrix on this branch: PASS / PASS / PASS / OK across
.venv-gams-{34,49,51} and .venv-no-gams. Ruff and pyright clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI pre-commit caught one line in dev/ that I missed locally -- my matrix script's lint pass was scoped to src/ and tests/, but pre-commit covers the whole repo. No semantic change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…le tree dev/run_test_matrix.sh previously only ran ruff (manually, not even via the script) against src/ and tests/, so the dev/ formatting issue from bcca44f slipped through to CI. Add an up-front `run_lint` phase that runs `ruff check .` and `ruff format --check .` across the whole repo -- matching the scope of the .pre-commit-config.yaml hooks that CI invokes -- and surface its verdict in the summary alongside the per-venv lines. Implementation notes: - Resolves a ruff binary from .venv-no-gams/bin/ruff or PATH; if none found, the lint phase is SKIPPED with a clear message rather than failing (so a fresh checkout without lint deps still runs the GAMS matrix). - Uses --check modes so the script asserts the tree is already clean rather than mutating files. - Writes its own log at dev/test_matrix_logs/lint.log. - Runs once up front, not per-venv (lint is environment-independent). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ne` in tests - GdxSymbol.clone docstring now states the result is independent of any GdxFile (no file, no index), explains the append + resolve_alias_of() wiring for aliases, and points at GdxFile.clone for the wired version. - Renamed leftover `be` loop variables in tests/test_alias.py to `engine` (residue from the backend -> engine rename).
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 40 out of 41 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
tests/test_engine_timing.py:8
- Typo in the module docstring: "a engine" should be "an engine".
An Alias has no records of its own -- GDX itself stores only the alias->parent pointer. Previously, both engines populated each alias's own ``GdxSymbol.dataframe`` on read (a copy of the parent's records), and write paths silently ignored any divergent mutations. The "silent" part was the footgun: a caller who set ``alias.dataframe = X`` saw the parent's records on disk, not X. New contract: - ``alias.dataframe`` returns the parent's ``dataframe`` directly (a live view, not a copy). Mutating the parent shows through the alias. - Direct assignment to ``alias.dataframe`` raises ``Error``. - ``unload()`` / ``clone()`` skip the dataframe slot for aliases. - ``load()`` on an alias also loads its parent; engines collaborate via a new ``GdxEngine._expand_alias_targets`` helper that walks chained aliases (replaces the duplicated immediate-parent pull that the transfer engine already had). - Both engines' read paths now short-circuit aliases to ``_loaded=True`` without touching records. Docs in CLAUDE.md and overview.md spell out the view semantics. New tests in test_alias.py cover: view identity in memory and after read (both engines), mutation propagation, direct-assign raise, unload, num_records tracking, lazy-load parent pull, and clone+resolve into a destination file picking up the destination's parent. 138->155 passed; ruff + pyright clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two PR-review findings (PR #112): 1. `[build-system].build-backend` in pyproject.toml was renamed to `build-engine` by the backend->engine rename in bcca44f. That's a PEP 517 key, not a project term -- with `build-engine`, pip/`python -m build`/uv fall back to legacy semantics and silently ignore the pin. Reverted to `build-backend`. The rename audit shows pyproject was the only collateral; nothing else affected. 2. `resolve_engine()` probed gams.transfer usability with no `gams_dir`, so the cached default-discovered install drove engine selection regardless of an explicit `gams_dir=` on `GdxFile` / `to_dataframes` / `to_gdx` / `info()`. That can pick `gams_transfer` when the caller's dir doesn't support it (then fail later), or silently fall back to `gdxcc` (or raise on an explicit transfer request) when the caller's dir actually does support it. Fix: `resolve_engine` now accepts `gams_dir`; `GdxFile.__init__` passes `self.gams_dir` (already resolved by NeedsGamsDir) and `info()` passes its `gams_dir` arg. `_probe_gams_transfer(gams_dir)` already runs a fresh probe and skips the cache for explicit dirs. Coverage: - `dev/check_resolve_engine_cross_gams.py` -- behavioral check: for each `gams_dir` arg (and a `/tmp` control that's guaranteed non-GAMS), assert `resolve_engine(None, gams_dir=X)` matches `_probe_gams_transfer(X)`. Without the fix the control case would return `gams_transfer` against `/tmp` (using the cached default probe); with it, returns `gdxcc`. - `dev/run_test_matrix.sh` -- discovers each `.venv-gams-*`'s GAMS_DIR by sourcing its activate in a subshell, then per venv invokes the check with the other venvs' real GAMS dirs (folded into the verdict). 155 passed; ruff + pyright clean; `pip wheel` builds cleanly with the restored `build-backend`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 41 out of 42 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
tests/test_engine_timing.py:8
- Minor grammar in the module docstring: "a engine" should be "an engine".
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The final, coordinated breaking release of the modernization arc. Closes out the work tracked in (the now-removed)
dev/ROADMAP.md; everything shipped is recorded inCHANGES.txt.Breaking changes
gams.transferwhen usable, falling back togdxcc. Pin the old engine withbackend="gdxcc"/GDXPDS_BACKEND=gdxcc.""= member with no text); membership is row presence.load_set_textis removed — text is always read, and any non-empty text is always written.None) is preserved on write by both engines (was collapsing to0.0).GdxFile.Hremoved — usegdx_file._backend_impl.handlefor raw gdxcc access.New
GdxSymbol.aliased_with, round-trip on both engines,to_gdx(aliases={alias: parent}), andgdxpds.gdx.append_alias(). No relaxed fallback — an unknown/non-Set parent raisesDomainError.import gdxpdsno longer requires a GAMS binding: theGMS_*constants are hardcoded (a test verifies them against the live binding) and bindings load on the first GDX op, sogdxpds info/gdxpds testcan diagnose a no-bindings environment.TransferErrorfor gams.transfer I/O failures (anErrorsubclass, parallel toGdxError).Bugs found & fixed along the way
Container.writedefaultseps_to_zero=True) — masked by the parity tests' float tolerance; now writes witheps_to_zero=False.element_textcolumn) — both fixed.Edge cases documented (not cross-engine identical)
alias of '*'): read without error and round-trip within one engine, but the engines disagree on membership (gdxcc includes the*element). Excluded from the parity glob; covered by a dedicated test.[Set, '*']): written regular by gdxcc, relaxed by gams.transfer.Docs & tests
--backendhelp updated;dev/ROADMAP.mddeleted.test_imports.pyandtest_alias.py; rewritten set/alias/UNDEF/parity coverage. 138 passed; ruff + pyright clean.Note
This is a GAMS-dependent package with no test CI; the full suite was run locally against GAMS 53. One verification — that
gdxpds testruns in a binding-free environment — is best confirmed on Linux and is left to the maintainer.🤖 Generated with Claude Code