aaronsteers

fix: preserve specific CDK validation errors in validate_manifest

Summary

Fixes overly broad exception handling in the validate_manifest() function that was masking specific CDK validation errors with generic "Validation error" messages.

Before: Users would see unhelpful errors like:

{
  "is_valid": false,
  "errors": ["Validation error: Validation against json schema defined in declarative_component_schema.yaml schema failed"],
  "warnings": [],
  "resolved_manifest": null
}

After: Users now see specific validation details like:

{
  "is_valid": false, 
  "errors": ["JSON schema validation failed: The low-code framework was promoted to Beta in airbyte-cdk version 0.29.0 and contains many breaking changes to the language. The manifest version 0.1.0 is incompatible with the airbyte-cdk package version 6.60.5..."],
  "warnings": [],
  "resolved_manifest": null
}

Changes made:

  • Added jsonschema.ValidationError import
  • Modified exception handling to catch ValidationError specifically before the general Exception catch
  • Maintains backward compatibility for other exception types
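The handler ordering described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `validate_manifest_sketch` and the injected `create_source` callable are hypothetical stand-ins, and the import fallback exists only so the sketch runs where `jsonschema` is not installed.

```python
import logging

try:
    from jsonschema import ValidationError
except ImportError:
    # Assumption: stand-in so the sketch runs without jsonschema installed.
    class ValidationError(Exception):
        """Placeholder for jsonschema.ValidationError."""


logger = logging.getLogger(__name__)


def validate_manifest_sketch(manifest, create_source):
    """Hypothetical stand-in for validate_manifest().

    `create_source` is injected here for illustration; it is expected to
    raise when the manifest is invalid.
    """
    errors = []
    try:
        create_source(manifest)
    except ValidationError as e:
        # Specific handler first, so schema details are preserved.
        logger.error("JSON schema validation error: %s", e)
        errors.append(f"JSON schema validation failed: {e}")
    except Exception as e:
        # Generic fallback keeps the previous behavior for other errors.
        logger.error("Validation error: %s", e)
        errors.append(f"Validation error: {e}")
    return {
        "is_valid": not errors,
        "errors": errors,
        "warnings": [],
        # The resolved manifest is omitted in this sketch.
        "resolved_manifest": None,
    }
```

Because Python tries `except` clauses in order, the specific clause has to come before `except Exception`, otherwise the generic handler would still catch the schema error first.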

Review & Testing Checklist for Human

  • Test invalid manifest validation: Try validating a manifest with schema errors and verify you get specific, actionable error messages instead of generic ones
  • Verify backward compatibility: Ensure existing valid manifests still validate successfully and other error types still return appropriate messages
  • Manual validation workflow: Test the complete validation workflow to ensure the improved error messages help with debugging real manifest issues

Recommended test plan: Create a manifest with known schema violations (e.g., invalid version, missing required fields, invalid field types) and validate it through the MCP tools to confirm error messages are now detailed and helpful.


Diagram

%%{ init : { "theme" : "default" }}%%
graph TD
    User["User validates manifest"] --> ValidateManifest["validate_manifest()"]
    ValidateManifest --> CreateSource["create_source()"]
    CreateSource --> CDKValidation["CDK Schema Validation"]
    CDKValidation --> ValidationError["jsonschema.ValidationError"]
    ValidationError --> SpecificHandler["Specific ValidationError handler"]:::major-edit
    CDKValidation --> OtherErrors["Other Exceptions"] 
    OtherErrors --> GenericHandler["Generic Exception handler"]:::context
    SpecificHandler --> DetailedErrors["Detailed error messages"]:::major-edit
    GenericHandler --> GenericErrors["Generic error messages"]:::context
    
    subgraph Legend
        L1["Major Edit"]:::major-edit
        L2["Minor Edit"]:::minor-edit  
        L3["Context/No Edit"]:::context
    end

    classDef major-edit fill:#90EE90
    classDef minor-edit fill:#87CEEB
    classDef context fill:#FFFFFF

Notes

  • This change directly addresses the user feedback about unhelpful validation error messages
  • The fix is minimal and focused, preserving existing behavior while improving error reporting
  • All existing tests pass (63/63), confirming no regressions
  • Tested with a validation script that confirmed specific error messages are now preserved

Requested by: AJ Steers (@aaronsteers)
Devin session: https://app.devin.ai/sessions/09bc6e5fedd64dc9a958add0f43a868b

- Replace overly broad exception handling with specific ValidationError handling
- Preserve detailed JSON schema validation error messages from CDK
- Maintain backward compatibility for other exception types
- Add jsonschema.ValidationError import to catch schema validation failures

This fixes the issue where generic 'Validation error' messages were masking
helpful CDK validation details, improving the developer experience when
debugging manifest validation failures.

Co-Authored-By: AJ Steers <[email protected]>
Copilot AI review requested due to automatic review settings on August 6, 2025 00:07

Original prompt from AJ Steers
@Devin - in the connector builder MCP, we're not returning helpful failure messages when the manifest fails to validate. The error I'm seeing right now is:

{
  "is_valid": false,
  "errors": [
    "Validation error: Validation against json schema defined in declarative_component_schema.yaml schema failed"
  ],
  "warnings": [],
  "resolved_manifest": null
}

But no errors that are helpful. It's possible we have too wide a try block and we're suppressing other failures in the code that are not JSON schema validation issues. Can you look into this?


🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring


github-actions bot commented Aug 6, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This Branch via MCP

To test the changes in this specific branch with an MCP client like Claude Desktop, use the following configuration:

{
  "mcpServers": {
    "connector-builder-mcp-dev": {
      "command": "uvx",
      "args": ["--from", "git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1754351228-improve-validation-errors", "connector-builder-mcp"]
    }
  }
}

Testing This Branch via CLI

You can test this version of the MCP Server using the following CLI snippet:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/connector-builder-mcp.git@devin/1754351228-improve-validation-errors#egg=airbyte-connector-builder-mcp' --help

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poe <command> - Runs any poe command in the uv virtual environment



Copilot AI left a comment


Pull Request Overview

This PR fixes overly broad exception handling in the validate_manifest() function to preserve specific CDK validation errors instead of masking them with generic messages. Users will now see detailed JSON schema validation errors that include specific information about manifest incompatibilities and validation failures.

  • Added specific handling for jsonschema.ValidationError exceptions
  • Preserved detailed error messages from CDK validation instead of generic "Validation error" messages
  • Maintained backward compatibility for other exception types


except ValidationError as e:
    logger.error(f"JSON schema validation error: {e}")
    errors.append(f"JSON schema validation failed: {str(e)}")

Copilot AI Aug 6, 2025


The new ValidationError exception handler should be placed before the generic Exception handler but after any other specific exceptions. Consider whether other specific CDK exceptions should also be handled explicitly to provide better error messages.

Suggested change
-     errors.append(f"JSON schema validation failed: {str(e)}")
+     errors.append(f"JSON schema validation failed: {str(e)}")
+ except AirbyteTracedException as e:
+     logger.error(f"Airbyte CDK traced exception: {e}")
+     errors.append(f"Airbyte CDK error: {str(e)}")



github-actions bot commented Aug 6, 2025

PyTest Results (Fast)

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit 182c910. ± Comparison against base commit 36252a7.

♻️ This comment has been updated with latest results.


github-actions bot commented Aug 6, 2025

PyTest Results (Full)

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit 182c910. ± Comparison against base commit 36252a7.

♻️ This comment has been updated with latest results.

devin-ai-integration bot and others added 4 commits August 6, 2025 00:10
- Add jsonschema>=4.0.0 to main dependencies to fix Deptry analysis
- Add types-jsonschema>=4.0.0 to dev dependencies to fix MyPy type checking
- Resolves CI failures for ValidationError import from jsonschema.exceptions

Co-Authored-By: AJ Steers <[email protected]>
- Add pre-validation using declarative_component_schema.yaml from CDK
- Extract detailed error information: field paths, invalid values, schema constraints
- Preserve CDK validation as fallback if schema loading fails
- Format validation errors with specific field paths and constraint details

Tested with Microsoft Lists manifest and invalid test cases showing:
- Field-specific error paths (e.g., 'at field type')
- Invalid values received vs expected constraints
- Schema type and enum validation details
- Significant improvement over generic 'schema validation failed' messages

Co-Authored-By: AJ Steers <[email protected]>
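
The error formatting described in the commit above (field paths, received values, schema constraints) might look roughly like this. It is a sketch under assumptions: `format_schema_error` is a hypothetical helper, not the function added in this PR. It reads only attributes that `jsonschema.ValidationError` exposes (`absolute_path`, `message`, `instance`, `validator`, `validator_value`), and the example uses a stand-in object so it runs without `jsonschema` installed.

```python
from types import SimpleNamespace


def format_schema_error(error):
    """Format a jsonschema-style error with its field path and constraint.

    `error` may be any object exposing absolute_path, message, instance,
    validator and validator_value, as jsonschema.ValidationError does.
    """
    path = ".".join(str(part) for part in error.absolute_path) or "<root>"
    return (
        f"at field '{path}': {error.message} "
        f"(received: {error.instance!r}; "
        f"constraint: {error.validator}={error.validator_value!r})"
    )


# Stand-in error object (assumption) mimicking an enum violation.
example = SimpleNamespace(
    absolute_path=["streams", 0, "type"],
    message="'Stream' is not one of ['DeclarativeStream']",
    instance="Stream",
    validator="enum",
    validator_value=["DeclarativeStream"],
)
print(format_schema_error(example))
```

With `jsonschema` available, the same helper could be applied to each error yielded by a validator's `iter_errors(manifest)`, for example from `jsonschema.Draft7Validator(schema)`. Which validator class suits the CDK schema is an assumption here.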
- Move pyyaml from dev to main dependencies (imported in production code)
- Add types-pyyaml to dev dependencies for MyPy type checking
- Fixes MyPy and Deptry CI failures while preserving functionality

Co-Authored-By: AJ Steers <[email protected]>
- Add module-level caching for declarative component schema
- Avoid reloading and parsing YAML schema on every validation call
- Fixes test_performance_multiple_tool_calls CI timeout (16.53s -> 8.97s)
- Preserves detailed validation error functionality

Co-Authored-By: AJ Steers <[email protected]>
aaronsteers merged commit eb0c5dd into main on Aug 6, 2025
13 checks passed
@aaronsteers aaronsteers deleted the devin/1754351228-improve-validation-errors branch August 6, 2025 01:36