Changes from Recent Experiments#98

Merged
martintb merged 11 commits into main from 2603_ORNL
Mar 22, 2026

@martintb
Collaborator

This pull request introduces several improvements and bug fixes across the Double Agent codebase, focusing on more robust handling of sample dimensions in pipelines, improved compatibility with Tiled entry IDs, and enhanced data type safety for xarray datasets. The changes also include new and expanded tests to ensure correct behavior, especially for edge cases and legacy data.

Key changes include:

Pipeline and Data Handling Improvements

  • Refactored TreePipeline operations to robustly resolve and transpose sample dimensions, ensuring backward compatibility with legacy pipelines and consistent axis handling. Helper functions _resolve_sample_dim, _transpose_sample_first, and _sample_coords were introduced to encapsulate this logic, and all pipeline operations now use these helpers for dimension handling. (AFL/double_agent/TreePipeline.py) [1] [2] [3] [4] [5] [6]
  • Added _materialize_input_dataset and _sanitize_object_dtypes_for_chunking methods to eagerly load Tiled-backed lazy arrays and convert object-typed variables/coordinates to fixed-width strings, preventing downstream issues with boolean indexing and chunking in xarray. (AFL/double_agent/AgentWebAppMixin.py, AFL/double_agent/AgentDriver.py) [1] [2]
  • Updated the input assembly and prediction pipeline to use these new data sanitation methods, ensuring all datasets are in a safe, predictable state before processing. (AFL/double_agent/AgentWebAppMixin.py, AFL/double_agent/AgentDriver.py) [1] [2]
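
The sample-dimension handling described in the first bullet can be pictured with a small, library-free sketch. This is not the code from TreePipeline.py: the function names only echo the helpers named above, and their signatures, the `"sample"` legacy default, and the fallback order are all assumptions made for illustration. Tuples of dimension names stand in for xarray objects.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical legacy default; the real pipelines may use a different name.
LEGACY_SAMPLE_DIM = "sample"


def resolve_sample_dim(dims: Sequence[str], requested: Optional[str] = None) -> str:
    """Pick the sample dimension from a variable's dimension names.

    Assumed order: explicit request, then the legacy default, then the first dim.
    """
    if requested is not None:
        if requested not in dims:
            raise ValueError(f"sample dim {requested!r} not in {tuple(dims)}")
        return requested
    if LEGACY_SAMPLE_DIM in dims:
        return LEGACY_SAMPLE_DIM
    if not dims:
        raise ValueError("cannot resolve a sample dimension for a scalar")
    return dims[0]


def sample_first_order(
    dims: Sequence[str], requested: Optional[str] = None
) -> Tuple[str, ...]:
    """Return the dimension order with the sample dimension moved to the front."""
    sample_dim = resolve_sample_dim(dims, requested)
    # Keep the relative order of the remaining dimensions unchanged.
    return (sample_dim,) + tuple(d for d in dims if d != sample_dim)
```

With an xarray `DataArray` named `da`, the resulting order could be applied as `da.transpose(*sample_first_order(da.dims))`, so every downstream operation sees samples on the leading axis.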

Tiled Entry ID and Web App Enhancements

  • Improved the test_fetch_entry method to correctly handle both plain and run_documents/-prefixed Tiled entry IDs, using a new _get_tiled_run_document_item hook for normalization and lookup. The returned entry_id is now normalized. (AFL/double_agent/AgentWebAppMixin.py) [1] [2]
  • Updated the input builder web app UI to clarify that both QD-... and run_documents/QD-... entry IDs are accepted. (AFL/double_agent/apps/input_builder/js/main.js)
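
A minimal sketch of the entry ID handling described above, under stated assumptions: the PR does not say which form counts as normalized, so this sketch treats the plain `QD-...` form as canonical. The function names are hypothetical, a plain mapping stands in for the Tiled client, and the IDs in the usage lines are made up.

```python
from typing import Any, List, Mapping, Tuple

RUN_DOCUMENTS_PREFIX = "run_documents/"


def normalize_entry_id(entry_id: str) -> str:
    """Strip whitespace and an optional run_documents/ prefix (assumed canonical form)."""
    cleaned = entry_id.strip()
    if cleaned.startswith(RUN_DOCUMENTS_PREFIX):
        cleaned = cleaned[len(RUN_DOCUMENTS_PREFIX):]
    if not cleaned:
        raise ValueError("entry ID is empty after normalization")
    return cleaned


def lookup_candidates(entry_id: str) -> List[str]:
    """Keys to try, in order: the plain ID first, then the prefixed fallback."""
    normalized = normalize_entry_id(entry_id)
    return [normalized, RUN_DOCUMENTS_PREFIX + normalized]


def fetch_entry(catalog: Mapping[str, Any], entry_id: str) -> Tuple[str, Any]:
    """Return (normalized_id, item) from a mapping standing in for a Tiled catalog."""
    for key in lookup_candidates(entry_id):
        if key in catalog:
            return normalize_entry_id(entry_id), catalog[key]
    raise KeyError(f"no entry found for {entry_id!r}")


# Both spellings resolve to the same item and the same normalized ID.
catalog = {"run_documents/QD-123": {"status": "ok"}}
plain_result = fetch_entry(catalog, "QD-123")
prefixed_result = fetch_entry(catalog, "run_documents/QD-123")
```

Trying the plain key before the prefixed one is an arbitrary choice in this sketch; the actual hook may look entries up in a different order or through a nested container.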

Testing Improvements

  • Added comprehensive tests for entry ID normalization, fallback lookup logic, and object dtype sanitation in the prediction pipeline, ensuring correct handling of legacy and new data patterns. (tests/test_agentdriver_pipeline_ops.py)
  • Added missing import for IntEncoding in tree pipeline tests. (tests/test_tree_pipeline.py)

Miscellaneous Fixes

  • Fixed a bug in TensorFlowExtrapolator where the output key was incorrectly named entropy instead of variance. (AFL/double_agent/TensorFlowExtrapolator.py)
  • Minor regex fix in Pipeline.py for node label parsing. (AFL/double_agent/Pipeline.py)
  • Improved Preprocessor pipeline op to handle scalar and symbol-less transforms robustly. (AFL/double_agent/Preprocessor.py)

These changes collectively improve robustness, backward compatibility, and user experience across the Double Agent pipeline and web interface.

@martintb martintb merged commit 55d236b into main Mar 22, 2026
6 checks passed