feat: add DataFrame support as RLM context payload #134

Draft
kmad wants to merge 7 commits into alexzhang13:main from kmad:feature/rlm-dataframe-support

kmad commented Mar 10, 2026

Summary

  • Adds native DataFrame support (pandas/polars) as RLM context payloads, enabling structured tabular data to be passed through the recursive decomposition pipeline
  • Lazy-imports dataframe utilities to avoid hard dependencies on pandas/pyarrow
  • Makes pandas/pyarrow install conditional in DockerREPL
  • Handles wide DataFrames by merging the duplicated column-count checks and truncating columns in the head/tail sample rows

kmad and others added 7 commits March 8, 2026 14:55
Adds pandas DataFrame as a first-class context type for RLM. DataFrames
are serialized via Parquet for type-preserving transfer to all
environments (local, Docker, Modal, Prime, Daytona). Includes metadata
generation for LLM prompts with shape, dtypes, nulls, and sample rows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
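
The metadata described in this commit (shape, dtypes, nulls, sample rows) could be assembled roughly as follows. This is a minimal sketch, not the PR's code: `describe_dataframe` and its output layout are hypothetical, and it assumes a pandas DataFrame is passed in.

```python
def describe_dataframe(df, sample_rows: int = 3) -> str:
    """Build a plain-text summary of a pandas DataFrame for an LLM prompt.

    Hypothetical helper; the name and output format are assumptions.
    """
    n_rows, n_cols = df.shape
    lines = [f"shape: {n_rows} rows x {n_cols} columns", "columns:"]
    # Iterate positionally so duplicate column names do not break the lookup.
    for name, dtype, nulls in zip(df.columns, df.dtypes, df.isna().sum()):
        lines.append(f"  {name}: dtype={dtype}, nulls={int(nulls)}")
    lines.append(f"head (up to {sample_rows} rows):")
    lines.append(df.head(sample_rows).to_string())
    return "\n".join(lines)
```

The function never imports pandas itself; it only calls methods on the object it receives, which keeps the module importable when pandas is absent.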
…Frames

Consolidates the two identical `if cols > 20` blocks into one and
applies the same column truncation to head/tail sample rows, preventing
huge metadata strings for wide DataFrames.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
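
A single truncation helper shared by the column listing and the sample rows might look like the sketch below. The threshold of 20 comes from the commit message; the helper name and the choice to keep the first `max_cols` columns are assumptions.

```python
from typing import List, Sequence, Tuple

MAX_COLS = 20  # threshold taken from the `if cols > 20` check in the commit message


def truncate_columns(
    columns: Sequence[str], max_cols: int = MAX_COLS
) -> Tuple[List[str], int]:
    """Return the columns to display and how many were hidden.

    Hypothetical helper: keeping the *first* `max_cols` columns is an assumption.
    """
    cols = list(columns)
    if len(cols) <= max_cols:
        return cols, 0
    return cols[:max_cols], len(cols) - max_cols
```

With a pandas DataFrame `df`, the same cut can then be applied to the sample rows, for example `df.head(3).iloc[:, :len(shown)]` where `shown, hidden = truncate_columns(df.columns)`.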
Only install pandas and pyarrow in the Docker container when a DataFrame
context is actually passed, avoiding the install overhead for string/dict
contexts. Uses a lazy _ensure_pandas() method called from load_context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
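
The install-on-first-use behavior can be sketched as below. Only `_ensure_pandas` and `load_context` are names from the commit message; the class, the injected `run_in_container` callable, the duck-typed DataFrame check, and the exact pip command are all assumptions made for illustration.

```python
from typing import Callable, List


def _looks_like_dataframe(obj) -> bool:
    # Duck-typed check (an assumption) so this sketch never imports pandas.
    return hasattr(obj, "to_parquet") and hasattr(obj, "dtypes")


class DockerREPLSketch:
    """Hypothetical stand-in, not the project's DockerREPL."""

    def __init__(self, run_in_container: Callable[[List[str]], None]):
        # `run_in_container` would wrap something like `docker exec`.
        self._run = run_in_container
        self._pandas_ready = False
        self.context = None

    def _ensure_pandas(self) -> None:
        # Install at most once per container.
        if self._pandas_ready:
            return
        self._run(["pip", "install", "pandas", "pyarrow"])
        self._pandas_ready = True

    def load_context(self, context) -> None:
        # String and dict contexts skip the install entirely.
        if _looks_like_dataframe(context):
            self._ensure_pandas()
        self.context = context
```

Injecting the command runner keeps the sketch testable without Docker; a recording list's `append` is enough to observe whether the install was triggered.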
Move the top-level import of dataframe_utils in core/types.py into
QueryMetadata.__init__ so the module is only loaded when actually
constructing query metadata, not on every import of rlm.core.types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
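
The deferred-import pattern this commit describes looks roughly like the following. The class is a hypothetical stand-in for `QueryMetadata`, and the stdlib `csv` module stands in for the project's dataframe utilities so the sketch runs anywhere.

```python
import importlib


class QueryMetadataSketch:
    """Hypothetical stand-in for QueryMetadata; attribute names are assumptions."""

    # Stand-in module name; the real code would reference dataframe_utils.
    _HELPER_MODULE = "csv"

    def __init__(self, context):
        # The import happens here, at construction time, so importing the
        # module that defines this class does not load the helper.
        helpers = importlib.import_module(self._HELPER_MODULE)
        self.context_type = type(context).__name__
        self.helper_name = helpers.__name__
```

Python caches modules in `sys.modules`, so repeated construction pays the import cost only once.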
…egen

Remove the unified build_context_code() which handled strings, dicts,
and DataFrames with fragile escaping. Keep only build_dataframe_context_code()
for DataFrame-specific Parquet-over-base64 codegen. Restore original
per-environment string/dict handling in Modal, Prime, and Daytona REPLs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
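
Parquet-over-base64 codegen can be illustrated as below. This is a sketch under stated assumptions, not the PR's `build_dataframe_context_code()`: the function names carry a `_sketch` suffix or are hypothetical, the target variable name `context` is assumed, and the remote environment is assumed to have pandas and pyarrow installed.

```python
import base64
import io


def dataframe_to_parquet_bytes(df) -> bytes:
    """Serialize a pandas DataFrame to Parquet bytes (assumes pyarrow is installed)."""
    buffer = io.BytesIO()
    df.to_parquet(buffer)
    return buffer.getvalue()


def build_dataframe_context_code_sketch(
    parquet_bytes: bytes, var_name: str = "context"
) -> str:
    """Return Python source that rebuilds the DataFrame inside a remote REPL."""
    # Base64 output uses only A-Z, a-z, 0-9, '+', '/' and '=', so it can sit
    # inside a quoted literal without any escaping.
    encoded = base64.b64encode(parquet_bytes).decode("ascii")
    return (
        "import base64, io\n"
        "import pandas as pd\n"
        f"{var_name} = pd.read_parquet(io.BytesIO(base64.b64decode('{encoded}')))\n"
    )
```

Because the payload alphabet contains no quotes or backslashes, this path avoids the escaping problems that string and dict contexts can run into, at the cost of roughly a one-third size increase from base64.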
kmad marked this pull request as draft March 10, 2026 01:56