Skip to content

Conversation

aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Aug 2, 2025

Simplify execute_stream_test_read to minimal implementation

Summary

This PR dramatically simplifies the execute_stream_test_read function from 172 lines to ~60 lines (66% reduction) by removing complex conditional logic and helper functions. The function now always returns both record data and raw request/response metadata with secrets filtered, eliminating the need for include_records and include_raw_response_data parameters.

Key Changes:

  • Removed parameters: include_records and include_raw_response_data
  • Deleted helper functions: _without_records and _without_raw_requests_and_responses
  • Simplified logic: Always extract and return both records and slices data
  • Updated tests: Removed mocked unit tests, kept integration tests
  • Maintained security: Still applies filter_config_secrets to slices

Review & Testing Checklist for Human

⚠️ High Priority Items (5 items):

  • Verify backward compatibility: Check if any existing code calls execute_stream_test_read with the removed parameters (include_records, include_raw_response_data)
  • Test edge cases: Try various manifest types, error scenarios, and malformed inputs beyond the Rick and Morty example
  • Verify security: Confirm that filter_config_secrets properly sanitizes all sensitive data in the slices output
  • Performance testing: Check if always extracting both records and slices impacts memory usage or response times for large datasets
  • Real-world validation: Test with actual production connector manifests to ensure the simplified logic handles all real-world scenarios

Recommended Test Plan:

  1. Run the function with various connector manifests (not just Rick and Morty)
  2. Test error scenarios (invalid manifests, network failures, malformed responses)
  3. Verify no secrets appear in the returned slices data
  4. Check performance with large record sets (test with max_records=1000)

Diagram

%%{ init : { "theme" : "default" }}%%
graph TD
    Caller["Function Callers"]
    TestRead["execute_stream_test_read<br/>(connector_builder.py)"]:::major-edit
    HelperFuncs["_without_records<br/>_without_raw_requests_and_responses<br/>(DELETED)"]:::major-edit
    StreamResult["StreamTestResult<br/>(model)"]:::minor-edit
    CDK["Airbyte CDK<br/>(read_stream)"]:::context
    Tests["tests/__init__.py<br/>(mocked tests DELETED)"]:::major-edit
    IntegrationTests["tests/test_integration.py<br/>(updated assertions)"]:::minor-edit
    
    Caller -->|"calls (params removed)"| TestRead
    TestRead -->|"DELETED"| HelperFuncs
    TestRead -->|"always returns both<br/>records + slices"| StreamResult
    TestRead -->|"calls"| CDK
    TestRead -->|"applies filter_config_secrets"| StreamResult
    Tests -->|"validates"| TestRead
    IntegrationTests -->|"validates"| TestRead
    
    subgraph Legend
        L1["Major Edit"]:::major-edit
        L2["Minor Edit"]:::minor-edit  
        L3["Context/No Edit"]:::context
    end
    
    classDef major-edit fill:#90EE90
    classDef minor-edit fill:#87CEEB
    classDef context fill:#FFFFFF
Loading

Notes

  • Breaking change: This removes function parameters, so any existing callers will need updates
  • Simplified approach: The "all or nothing" approach eliminates complex conditional logic but changes the function contract
  • CI passing: All 63 tests pass including pytest on Python 3.10/3.11, but this doesn't guarantee backward compatibility
  • Requested by: AJ Steers (@aaronsteers)
  • Devin session: https://app.devin.ai/sessions/e31e9b3aeee54c569beec1aafa4e25cd

The main risk is that this is a breaking change disguised as a simplification. While the code is cleaner and the integration test passes, we need to verify no existing code depends on the removed functionality.

- Add optional records field to StreamTestResult model
- Add summary_only parameter (default True) to maintain backward compatibility
- Extract and return actual record data when summary_only=False
- Add comprehensive error handling for record data extraction
- Add tests for both summary_only modes and error scenarios

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

Original prompt from AJ Steers
@Devin our connector builder mcp server doesn't return record data as of now in the test read tool. Can you see if that can be added? Maybe we can add an option to that tool called "summary only". If that is turned on then we won't return data otherwise we will.

@Copilot Copilot AI review requested due to automatic review settings August 2, 2025 02:03
Copy link
Contributor

devin-ai-integration bot commented Aug 2, 2025

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Copy link

github-actions bot commented Aug 2, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This Branch via MCP

To test the changes in this specific branch with an MCP client like Claude Desktop, use the following configuration:

{
  "mcpServers": {
    "connector-builder-mcp-dev": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1754099079-add-summary-only-option", "connector-builder-mcp"]
    }
  }
}

Testing This Branch via CLI

You can test this version of the MCP Server using the following CLI snippet:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1754099079-add-summary-only-option#egg=airbyte-connector-builder-mcp' --help

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poe <command> - Runs any poe command in the uv virtual environment

📝 Edit this welcome message.

@github-actions github-actions bot added the enhancement New feature or request label Aug 2, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds functionality to optionally return actual record data from the execute_stream_test_read MCP tool. Previously, the tool only returned summary metadata. With the new summary_only parameter, users can now choose to receive the actual record data when summary_only=False, while maintaining backward compatibility with the default behavior.

  • Added optional records field to StreamTestResult model to hold actual record data
  • Added summary_only parameter to execute_stream_test_read function with default True for backward compatibility
  • Implemented record extraction logic with fallback handling for different data structures

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
connector_builder_mcp/_connector_builder.py Added records field to StreamTestResult model and summary_only parameter with record extraction logic
tests/init.py Added comprehensive test cases for the new functionality including fallback scenarios

Copy link

github-actions bot commented Aug 2, 2025

PyTest Results (Fast)

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit 083a63c. ± Comparison against base commit 53ee5ff.

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Aug 2, 2025

PyTest Results (Full)

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit 083a63c. ± Comparison against base commit 53ee5ff.

♻️ This comment has been updated with latest results.

devin-ai-integration bot and others added 3 commits August 2, 2025 02:09
- Change summary_only parameter default from True to False as requested
- Update test to use new default behavior (no explicit parameter)
- Rename test function to reflect it's testing the default behavior
- All tests pass with new default behavior

Addresses GitHub comments from @aaronsteers on PR #12

Co-Authored-By: AJ Steers <[email protected]>
…, include_raw_responses

- Replace single summary_only parameter with three separate parameters
- include_records: bool (defaults to False)
- include_raw_requests, include_raw_responses: bool | None (defaults to None)
- On success: None behaves like False
- On failure: None behaves like True for debugging
- Extract raw HTTP request/response data from CDK slices structure
- Apply filter_config_secrets() to sanitize sensitive data in raw requests/responses
- Update all tests to use new parameter structure
- Add comprehensive tests for new functionality including nullable behavior

Co-Authored-By: AJ Steers <[email protected]>
- Fix airbyte_cdk.models.airbyte_protocol imports to use airbyte_cdk.models
- Add explicit type annotations for raw_requests_data and raw_responses_data
- All local checks now pass: MyPy, pytest, and ruff

Co-Authored-By: AJ Steers <[email protected]>
@aaronsteers aaronsteers changed the title feat: add summary_only parameter to execute_stream_test_read feat: add options for returning record data and raw responses/requests in execute_stream_test_read Aug 4, 2025
devin-ai-integration bot and others added 2 commits August 4, 2025 16:24
…onse_data

- Replace include_raw_requests and include_raw_responses with single include_raw_response_data parameter
- Update StreamTestResult model to use slices field instead of separate raw_requests/raw_responses
- Attach entire slices array directly from CDK result
- Update all tests to use new parameter structure
- Maintain secret sanitization and hierarchical data structure

Co-Authored-By: AJ Steers <[email protected]>
…ucture

- Replace complex nested loops with single filter_config_secrets() calls
- Update filter_config_secrets() to recursively handle lists and dictionaries
- Eliminate 84 lines of complex iteration code
- Maintain same level of secret sanitization security
- All 49 tests continue to pass

Co-Authored-By: AJ Steers <[email protected]>
aaronsteers and others added 2 commits August 4, 2025 09:33
- Change remaining 'if include_raw_response_data:' to 'if include_raw_response_data is not False:'
- Ensures consistent nullable boolean behavior throughout the function
- Completes implementation of GitHub comment suggestion

Co-Authored-By: AJ Steers <[email protected]>
…le boolean behavior

- Add comments explaining CDK TestReader can return data in two formats
- Update parameter description to document None behavior on success vs failure
- Address GitHub comment about conditional logic rationale

Co-Authored-By: AJ Steers <[email protected]>
devin-ai-integration bot and others added 3 commits August 4, 2025 16:48
…slices directly

- Add _without_records() helper to remove records from slices structure
- Add _without_raw_requests_and_responses() helper to remove raw HTTP data
- Modify execute_stream_test_read to use helpers and return slices directly
- Replace problematic conditional logic with proper record extraction from nested structure
- Fix whitespace formatting issues for ruff compliance
- All 49 tests passing

Co-Authored-By: AJ Steers <[email protected]>
@aaronsteers aaronsteers closed this Aug 6, 2025
- Remove helper functions _without_records and _without_raw_requests_and_responses
- Remove include_records and include_raw_response_data parameters
- Always return both records and slices data with secrets filtered
- Reduce implementation from 172 lines to ~60 lines
- Remove mocked unit tests, keep integration tests
- Verified functionality with Rick and Morty API test

Co-Authored-By: AJ Steers <[email protected]>
@aaronsteers aaronsteers reopened this Aug 6, 2025
Copy link

github-actions bot commented Aug 6, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This Branch via MCP

To test the changes in this specific branch with an MCP client like Claude Desktop, use the following configuration:

{
  "mcpServers": {
    "connector-builder-mcp-dev": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1754099079-add-summary-only-option", "connector-builder-mcp"]
    }
  }
}

Testing This Branch via CLI

You can test this version of the MCP Server using the following CLI snippet:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1754099079-add-summary-only-option#egg=airbyte-connector-builder-mcp' --help

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poe <command> - Runs any poe command in the uv virtual environment

📝 Edit this welcome message.

@aaronsteers aaronsteers merged commit beab29a into main Aug 6, 2025
13 checks passed
@aaronsteers aaronsteers deleted the devin/1754099079-add-summary-only-option branch August 6, 2025 21:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant