feat: add options for returning record data and raw responses/requests in execute_stream_test_read #12

aaronsteers · 2025-08-02T02:03:49Z

Simplify execute_stream_test_read to minimal implementation

Summary

This PR dramatically simplifies the execute_stream_test_read function from 172 lines to ~60 lines (66% reduction) by removing complex conditional logic and helper functions. The function now always returns both record data and raw request/response metadata with secrets filtered, eliminating the need for include_records and include_raw_response_data parameters.

Key Changes:

Removed parameters: include_records and include_raw_response_data
Deleted helper functions: _without_records and _without_raw_requests_and_responses
Simplified logic: Always extract and return both records and slices data
Updated tests: Removed mocked unit tests, kept integration tests
Maintained security: Still applies filter_config_secrets to slices

Review & Testing Checklist for Human

⚠️ High Priority Items (5 items):

Verify backward compatibility: Check if any existing code calls execute_stream_test_read with the removed parameters (include_records, include_raw_response_data)
Test edge cases: Try various manifest types, error scenarios, and malformed inputs beyond the Rick and Morty example
Verify security: Confirm that filter_config_secrets properly sanitizes all sensitive data in the slices output
Performance testing: Check if always extracting both records and slices impacts memory usage or response times for large datasets
Real-world validation: Test with actual production connector manifests to ensure the simplified logic handles all real-world scenarios

Recommended Test Plan:

Run the function with various connector manifests (not just Rick and Morty)
Test error scenarios (invalid manifests, network failures, malformed responses)
Verify no secrets appear in the returned slices data
Check performance with large record sets (test with max_records=1000)

Diagram

%%{ init : { "theme" : "default" }}%%
graph TD
    Caller["Function Callers"]
    TestRead["execute_stream_test_read<br/>(connector_builder.py)"]:::major-edit
    HelperFuncs["_without_records<br/>_without_raw_requests_and_responses<br/>(DELETED)"]:::major-edit
    StreamResult["StreamTestResult<br/>(model)"]:::minor-edit
    CDK["Airbyte CDK<br/>(read_stream)"]:::context
    Tests["tests/__init__.py<br/>(mocked tests DELETED)"]:::major-edit
    IntegrationTests["tests/test_integration.py<br/>(updated assertions)"]:::minor-edit
    
    Caller -->|"calls (params removed)"| TestRead
    TestRead -->|"DELETED"| HelperFuncs
    TestRead -->|"always returns both<br/>records + slices"| StreamResult
    TestRead -->|"calls"| CDK
    TestRead -->|"applies filter_config_secrets"| StreamResult
    Tests -->|"validates"| TestRead
    IntegrationTests -->|"validates"| TestRead
    
    subgraph Legend
        L1["Major Edit"]:::major-edit
        L2["Minor Edit"]:::minor-edit  
        L3["Context/No Edit"]:::context
    end
    
    classDef major-edit fill:#90EE90
    classDef minor-edit fill:#87CEEB
    classDef context fill:#FFFFFF

Notes

Breaking change: This removes function parameters, so any existing callers will need updates
Simplified approach: The "all or nothing" approach eliminates complex conditional logic but changes the function contract
CI passing: All 63 tests pass including pytest on Python 3.10/3.11, but this doesn't guarantee backward compatibility
Requested by: AJ Steers (@aaronsteers)
Devin session: https://app.devin.ai/sessions/e31e9b3aeee54c569beec1aafa4e25cd

The main risk is that this is a breaking change disguised as a simplification. While the code is cleaner and the integration test passes, we need to verify no existing code depends on the removed functionality.

- Add optional records field to StreamTestResult model - Add summary_only parameter (default True) to maintain backward compatibility - Extract and return actual record data when summary_only=False - Add comprehensive error handling for record data extraction - Add tests for both summary_only modes and error scenarios Co-Authored-By: AJ Steers <[email protected]>

devin-ai-integration · 2025-08-02T02:03:50Z

Original prompt from AJ Steers

@Devin our connector builder mcp server doesn't return record data as of now in the test read tool. Can you see if that can be added? Maybe we can add an option to that tool called "summary only". If that is turned on then we won't return data otherwise we will.

devin-ai-integration · 2025-08-02T02:03:52Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

github-actions · 2025-08-02T02:03:59Z

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This Branch via MCP

To test the changes in this specific branch with an MCP client like Claude Desktop, use the following configuration:

{
  "mcpServers": {
    "connector-builder-mcp-dev": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1754099079-add-summary-only-option", "connector-builder-mcp"]
    }
  }
}

Testing This Branch via CLI

You can test this version of the MCP Server using the following CLI snippet:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1754099079-add-summary-only-option#egg=airbyte-connector-builder-mcp' --help

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

/autofix - Fixes most formatting and linting issues
/poe <command> - Runs any poe command in the uv virtual environment

📝 Edit this welcome message.

Copilot

Pull Request Overview

This PR adds functionality to optionally return actual record data from the execute_stream_test_read MCP tool. Previously, the tool only returned summary metadata. With the new summary_only parameter, users can now choose to receive the actual record data when summary_only=False, while maintaining backward compatibility with the default behavior.

Added optional records field to StreamTestResult model to hold actual record data
Added summary_only parameter to execute_stream_test_read function with default True for backward compatibility
Implemented record extraction logic with fallback handling for different data structures

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
connector_builder_mcp/_connector_builder.py	Added `records` field to `StreamTestResult` model and `summary_only` parameter with record extraction logic
tests/init.py	Added comprehensive test cases for the new functionality including fallback scenarios

connector_builder_mcp/_connector_builder.py

github-actions · 2025-08-02T02:04:48Z

PyTest Results (Fast)

0 tests ±0 0 ✅ ±0 0s ⏱️ ±0s
0 suites ±0 0 💤 ±0
0 files ±0 0 ❌ ±0

Results for commit 083a63c. ± Comparison against base commit 53ee5ff.

♻️ This comment has been updated with latest results.

github-actions · 2025-08-02T02:04:49Z

PyTest Results (Full)

0 tests ±0 0 ✅ ±0 0s ⏱️ ±0s
0 suites ±0 0 💤 ±0
0 files ±0 0 ❌ ±0

Results for commit 083a63c. ± Comparison against base commit 53ee5ff.

♻️ This comment has been updated with latest results.

connector_builder_mcp/_connector_builder.py

@aaronsteers

- Change summary_only parameter default from True to False as requested - Update test to use new default behavior (no explicit parameter) - Rename test function to reflect it's testing the default behavior - All tests pass with new default behavior Addresses GitHub comments from @aaronsteers on PR #12 Co-Authored-By: AJ Steers <[email protected]>

…, include_raw_responses - Replace single summary_only parameter with three separate parameters - include_records: bool (defaults to False) - include_raw_requests, include_raw_responses: bool | None (defaults to None) - On success: None behaves like False - On failure: None behaves like True for debugging - Extract raw HTTP request/response data from CDK slices structure - Apply filter_config_secrets() to sanitize sensitive data in raw requests/responses - Update all tests to use new parameter structure - Add comprehensive tests for new functionality including nullable behavior Co-Authored-By: AJ Steers <[email protected]>

- Fix airbyte_cdk.models.airbyte_protocol imports to use airbyte_cdk.models - Add explicit type annotations for raw_requests_data and raw_responses_data - All local checks now pass: MyPy, pytest, and ruff Co-Authored-By: AJ Steers <[email protected]>

…onse_data - Replace include_raw_requests and include_raw_responses with single include_raw_response_data parameter - Update StreamTestResult model to use slices field instead of separate raw_requests/raw_responses - Attach entire slices array directly from CDK result - Update all tests to use new parameter structure - Maintain secret sanitization and hierarchical data structure Co-Authored-By: AJ Steers <[email protected]>

…ucture - Replace complex nested loops with single filter_config_secrets() calls - Update filter_config_secrets() to recursively handle lists and dictionaries - Eliminate 84 lines of complex iteration code - Maintain same level of secret sanitization security - All 49 tests continue to pass Co-Authored-By: AJ Steers <[email protected]>

connector_builder_mcp/_connector_builder.py

- Change remaining 'if include_raw_response_data:' to 'if include_raw_response_data is not False:' - Ensures consistent nullable boolean behavior throughout the function - Completes implementation of GitHub comment suggestion Co-Authored-By: AJ Steers <[email protected]>

connector_builder_mcp/_connector_builder.py

…le boolean behavior - Add comments explaining CDK TestReader can return data in two formats - Update parameter description to document None behavior on success vs failure - Address GitHub comment about conditional logic rationale Co-Authored-By: AJ Steers <[email protected]>

connector_builder_mcp/_connector_builder.py

Co-Authored-By: AJ Steers <[email protected]>

…ring Co-Authored-By: AJ Steers <[email protected]>

Co-Authored-By: AJ Steers <[email protected]>

connector_builder_mcp/_connector_builder.py

…ub feedback Co-Authored-By: AJ Steers <[email protected]>

connector_builder_mcp/_connector_builder.py

… feedback Co-Authored-By: AJ Steers <[email protected]>

…slices directly - Add _without_records() helper to remove records from slices structure - Add _without_raw_requests_and_responses() helper to remove raw HTTP data - Modify execute_stream_test_read to use helpers and return slices directly - Replace problematic conditional logic with proper record extraction from nested structure - Fix whitespace formatting issues for ruff compliance - All 49 tests passing Co-Authored-By: AJ Steers <[email protected]>

connector_builder_mcp/_connector_builder.py

- Remove helper functions _without_records and _without_raw_requests_and_responses - Remove include_records and include_raw_response_data parameters - Always return both records and slices data with secrets filtered - Reduce implementation from 172 lines to ~60 lines - Remove mocked unit tests, keep integration tests - Verified functionality with Rick and Morty API test Co-Authored-By: AJ Steers <[email protected]>

github-actions · 2025-08-06T18:17:10Z

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This Branch via MCP

To test the changes in this specific branch with an MCP client like Claude Desktop, use the following configuration:

{
  "mcpServers": {
    "connector-builder-mcp-dev": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1754099079-add-summary-only-option", "connector-builder-mcp"]
    }
  }
}

Testing This Branch via CLI

You can test this version of the MCP Server using the following CLI snippet:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1754099079-add-summary-only-option#egg=airbyte-connector-builder-mcp' --help

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

/autofix - Fixes most formatting and linting issues
/poe <command> - Runs any poe command in the uv virtual environment

📝 Edit this welcome message.

Copilot AI review requested due to automatic review settings August 2, 2025 02:03

github-actions bot added the enhancement New feature or request label Aug 2, 2025

Copilot AI reviewed Aug 2, 2025

View reviewed changes

connector_builder_mcp/_connector_builder.py Outdated Show resolved Hide resolved

connector_builder_mcp/_connector_builder.py Outdated Show resolved Hide resolved

aaronsteers commented Aug 2, 2025

View reviewed changes

connector_builder_mcp/_connector_builder.py Outdated Show resolved Hide resolved

connector_builder_mcp/_connector_builder.py Outdated Show resolved Hide resolved

devin-ai-integration bot and others added 3 commits August 2, 2025 02:09

aaronsteers changed the title ~~feat: add summary_only parameter to execute_stream_test_read~~ feat: add options for returning record data and raw responses/requests in execute_stream_test_read Aug 4, 2025

devin-ai-integration bot and others added 2 commits August 4, 2025 16:24

aaronsteers commented Aug 4, 2025

View reviewed changes