Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add explicit docker volume mapping (resolves permission denied errors); fix: remove 'cdk:low-code' logic with false-positives for manifest-only connector detection; feat: add option to auto-print full log file text, enabled by default in CI environments #637

Merged

Conversation

aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Mar 25, 2025

This is a monster of a PR because we're trying to expedite a fix for permission issues in docker-based execution. Included in this PR:

  • fix: add explicit docker volume mapping (resolves permission denied errors)
  • fix: remove 'cdk:low-code' logic with false-positives for manifest-only connector detection
  • feat: add option to auto-print full log file text, enabled by default in CI environments

Resolves: #348, #638

Summary by CodeRabbit

  • New Features

    • Introduced a new constant to control the printing of full error logs, enhancing debugging capabilities in CI/CD environments.
    • Added a boolean attribute to manage log file information display in exceptions, allowing for more flexible logging behavior.
    • Enhanced logging capabilities in the workflow by uploading logs to GitHub Artifacts for failed test runs.
    • Added a new method for mapping command-line interface arguments in the Executor class, allowing for custom argument processing.
    • Introduced a new optional parameter for mapping local volume paths to container paths in the DockerExecutor class.
  • Refactor

    • Improved error handling for connector manifest fetching by consolidating multiple exceptions into a single custom error, enhancing clarity in reporting issues.
    • Streamlined the processing of connector installations by removing legacy checks for low-code connectors, leading to a more consistent approach.
    • Refined argument handling in the DeclarativeExecutor class by introducing a mapping step before parsing.
    • Enhanced accessibility of temporary files by granting read permissions to all users.
  • Tests

    • Enhanced clarity of test results by refactoring error reporting in the test_expected_hardcoded_failures function, consolidating assertions for unexpected exceptions and those that did not raise exceptions.
    • Updated the skip reason for a test case to provide clearer context regarding the underlying issue.

@aaronsteers aaronsteers self-assigned this Mar 25, 2025
@aaronsteers aaronsteers requested a review from Copilot March 25, 2025 16:13
@aaronsteers aaronsteers marked this pull request as ready for review March 25, 2025 16:13
@aaronsteers aaronsteers changed the title feat: remove 'cdk:low-code' logic with false-positives for manifest-only connector detection fix: remove 'cdk:low-code' logic with false-positives for manifest-only connector detection Mar 25, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR removes the legacy low-code check used to misclassify manifest-only connectors, as described in issue #348.

  • Removed the declaration of the low-code CDK tag
  • Removed the conditional logic that treated connectors with the low-code tag as manifest-only
Comments suppressed due to low confidence (1)

airbyte/sources/registry.py:219

  • Ensure test coverage is updated to verify that connectors previously relying on the low-code logic are handled correctly now, confirming manifest-only connectors are properly detected.
-            InstallType.YAML if _LOWCODE_CDK_TAG in tags else None,

@aaronsteers aaronsteers enabled auto-merge (squash) March 25, 2025 16:13
Copy link
Contributor

coderabbitai bot commented Mar 25, 2025

📝 Walkthrough

Walkthrough

The changes remove the _LOWCODE_CDK_TAG constant and its associated logic from the connector metadata processing function in airbyte/sources/registry.py. This elimination removes the conditional check that previously detected the "cdk:low-code" tag and added the InstallType.YAML installation type, thereby simplifying how connector metadata is derived.

Changes

File Change Summary
airbyte/sources/registry.py Removed _LOWCODE_CDK_TAG constant and eliminated the conditional check for the "cdk:low-code" tag.
airbyte/_executors/util.py Updated error handling in _try_get_source_manifest to raise AirbyteConnectorInstallationError instead of HTTPError and YAMLError.
tests/integration_tests/test_lowcode_connectors.py Changed exception_type parameter from str to Exception and refactored error reporting logic in test_expected_hardcoded_failures.
airbyte/constants.py Added new constant AIRBYTE_PRINT_FULL_ERROR_LOGS to control error log printing based on environment variable.
airbyte/exceptions.py Introduced print_full_log attribute in PyAirbyteError to manage log file information in exceptions.
.github/workflows/python_pytest.yml Added steps to upload logs to GitHub Artifacts for failed test jobs.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Caller
    participant Func as _registry_entry_to_connector_metadata
    Caller->>Func: Invoke function with registry entry (includes tags)
    alt Old Behavior
        Note right of Func: Check for "cdk:low-code" and add InstallType.YAML if found
    else New Behavior
        Note right of Func: Process entry without low-code tag check
    end
    Func-->>Caller: Return connector metadata
Loading

Assessment against linked issues

Objective Addressed Explanation
Remove 'cdk:low-code' check once all connectors are migrated to manifest-only. (#348)

Would you like any additional details or further diagram refinements, wdyt?


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 612d280 and 6470537.

📒 Files selected for processing (1)
  • airbyte/_executors/docker.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • airbyte/_executors/docker.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Validate PR title
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b7aea0c and 1fe27d1.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • airbyte/_executors/util.py (2 hunks)
  • airbyte/sources/registry.py (3 hunks)
  • tests/integration_tests/test_lowcode_connectors.py (2 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
airbyte/_executors/util.py (1)
airbyte/exceptions.py (1)
  • AirbyteConnectorInstallationError (321-322)
🪛 GitHub Actions: Run Linters
airbyte/sources/registry.py

[error] 106-106: Need type annotation for '_LOWCODE_CONNECTORS_FAILING_VALIDATION' (hint: '_LOWCODE_CONNECTORS_FAILING_VALIDATION: list[] = ...')

⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (10)
airbyte/sources/registry.py (4)

108-111: Looks good - removing potential false-positives

The changes to the lists tracking connectors with unexpected errors look good! This appears to be part of the cleanup related to removing the 'cdk:low-code' logic.


113-114: Looks good - appropriate categorization

Thanks for updating this comment to better describe the error type. The new empty list for file not found errors fits well with the overall cleanup.


202-216: Looks good - simplified manifest detection logic

The installation type logic now correctly relies solely on the language-based determination rather than the previously removed 'cdk:low-code' tag. This should eliminate the false-positives mentioned in the PR title.


344-350: Looks good - consistent exclusion list

The function for retrieving YAML connectors still correctly excludes the problematic connectors listed in _LOWCODE_CONNECTORS_EXCLUDED. This maintains compatibility while removing the false-positive detection logic.

airbyte/_executors/util.py (2)

45-46: Improved error documentation

The docstring now accurately reflects that we're raising the more specific AirbyteConnectorInstallationError instead of the more generic HTTP/YAML errors. This is a great improvement!


65-73: Better error handling for manifest downloads

Wrapping the HTTP request in a try/except and providing a more specific error with context is an excellent improvement. This makes debugging connector installation issues much easier.

tests/integration_tests/test_lowcode_connectors.py (4)

59-60: Better typing for exception parameter

Changing the parameter type from str to Exception is more type-safe and allows for proper instance checking with isinstance(). Good improvement!


66-68: Improved error collection

Creating lists to collect exceptions rather than failing immediately allows for more comprehensive error reporting. This makes the test output much more helpful when multiple connectors have issues.


79-82: Better error handling

Collecting errors into the lists rather than raising immediately is a much better approach. It gives developers a complete picture of all failing connectors at once.


83-90: Improved assertion error message

The consolidated error message with details about both no-exception and wrong-exception cases provides much clearer feedback. This makes it easier to update the hardcoded lists when connectors change.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@aaronsteers
Copy link
Contributor Author

aaronsteers commented Mar 25, 2025

/fix-pr

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.
(This job requires that the PR author has "Allow edits from maintainers" enabled.)

PR auto-fix job started... Check job output.

🟦 Job completed successfully (no changes).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f243d66 and a97776c.

📒 Files selected for processing (2)
  • airbyte/constants.py (1 hunks)
  • airbyte/exceptions.py (4 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
airbyte/constants.py (1)
airbyte/logs.py (1)
  • _str_to_bool (28-30)
🪛 GitHub Actions: Run Linters
airbyte/exceptions.py

[error] 137-137: error: Statement is unreachable [unreachable]

⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (4)
airbyte/constants.py (1)

113-126: Nice addition for improved debugging!

The new constant helps provide detailed error logs, especially in CI environments where log persistence might be an issue. The default behavior is well thought out - automatically enabling in CI but keeping it disabled elsewhere.

The documentation explains the purpose clearly, so users will know exactly how to use this feature. Well done!

Though, one thought - maybe we should consider updating the documentation to include an example of how users can explicitly enable or disable this feature? wdyt?

airbyte/exceptions.py (3)

47-47: LGTM!

Clean import of the new constant.


73-73: LGTM!

Good addition of the new attribute with a sensible default from the constant.


99-99: LGTM!

Nice catch adding the new attribute to the special properties list.

aaronsteers and others added 3 commits March 25, 2025 12:12
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
airbyte/_executors/util.py (1)

1-298: 💡 Verification agent

🧩 Analysis chain

Fix formatting issues reported by the pipeline.

The pipeline reports that this file needs formatting. Could you run ruff format on this file to fix the style issues?


🏁 Script executed:

#!/bin/bash
# Check which file needs formatting
ruff format --check airbyte/_executors/util.py

Length of output: 127


Action Required: Reformat airbyte/_executors/util.py using ruff format
The pipeline check reports that one file needs reformatting. It appears that running ruff format on this file would update its style. Could you please run:

ruff format airbyte/_executors/util.py

to automatically fix these formatting issues? wdyt?

🧰 Tools
🪛 Ruff (0.8.2)

35-35: Expected 2 blank lines, found 1

Add missing blank line(s)

(E302)

🪛 GitHub Actions: Run Linters

[error] 1-1: Ruff formatting check failed. 1 file would be reformatted. Please run 'ruff format' to fix code style issues in this file.

🧹 Nitpick comments (1)
airbyte/_executors/docker.py (1)

64-103: Excellent implementation of the CLI args mapping for Docker.

This is a robust implementation that:

  1. Checks if each argument is a file path
  2. Maps it to the correct container path if possible
  3. Provides clear logging for debugging

Minor suggestion: The comment on line 86 says "No break reached; a volume was found for this file path" - should this be "No volume was found" instead? wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 226e8d4 and 26b5967.

📒 Files selected for processing (4)
  • airbyte/_executors/base.py (2 hunks)
  • airbyte/_executors/declarative.py (2 hunks)
  • airbyte/_executors/docker.py (3 hunks)
  • airbyte/_executors/util.py (6 hunks)
🧰 Additional context used
🧬 Code Definitions (3)
airbyte/_executors/base.py (4)
airbyte/_executors/docker.py (2)
  • map_cli_args (64-103)
  • _cli (60-62)
airbyte/_executors/declarative.py (1)
  • _cli (93-95)
airbyte/_executors/python.py (1)
  • _cli (287-289)
airbyte/_executors/local.py (1)
  • _cli (61-63)
airbyte/_executors/declarative.py (2)
airbyte/_executors/base.py (1)
  • map_cli_args (188-194)
airbyte/_executors/docker.py (1)
  • map_cli_args (64-103)
airbyte/_executors/util.py (2)
airbyte/exceptions.py (1)
  • AirbyteConnectorInstallationError (344-345)
airbyte/_connector_base.py (1)
  • name (81-83)
🪛 GitHub Actions: Run Linters
airbyte/_executors/util.py

[error] 1-1: Ruff formatting check failed. 1 file would be reformatted. Please run 'ruff format' to fix code style issues in this file.

⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (11)
airbyte/_executors/base.py (2)

188-194: Good addition of the map_cli_args method to the base class.

This method nicely implements the Open/Closed principle, providing a default implementation that can be overridden by subclasses. The documentation clearly explains the purpose, which will be useful for maintainers.


206-208: Implementation of the mapping in the execute method looks good.

The changes correctly integrate with the new map_cli_args method. The variable naming is clear and the flow is easy to follow.

airbyte/_executors/declarative.py (2)

21-21: Nice addition of the Namespace import.

The import is necessary for type hinting the parsed arguments - good catch.


106-108: Implementation of mapped CLI args in the declarative executor.

The changes properly utilize the map_cli_args method from the base class, maintaining consistency with the other executor implementations. Good job!

airbyte/_executors/docker.py (1)

20-23: Good addition of the volumes parameter.

The parameter is properly initialized with a default empty dictionary, which ensures the class will work correctly even if volumes aren't provided.

airbyte/_executors/util.py (6)

32-33: Good addition of the default container temp directory constant.

Adding this as a named constant improves readability and maintainability. The docstring is helpful too.


46-47: Nice update to the error documentation.

The docstring is now consistent with the actual implementation, which raises AirbyteConnectorInstallationError instead of the raw HTTP and YAML errors.


66-74: Improved error handling for HTTP requests.

This change enhances error handling by catching HTTP errors and re-raising them as more specific AirbyteConnectorInstallationError with better context. This will help users understand what went wrong.


230-232: Nice variable naming for temp directories.

Using clear variable names for both the host and container temp directories makes the code more readable.


236-248: Good refactoring of volume mapping.

Using a dictionary to hold volume mappings and then iterating through it is cleaner than duplicating the volume mounting logic. This also makes it easier to add more volumes in the future if needed.


257-257: Nice integration with the Docker executor.

Passing the volumes dictionary to the Docker executor connects all the pieces together nicely, completing the feature.

@aaronsteers aaronsteers changed the title fix: remove 'cdk:low-code' logic with false-positives for manifest-only connector detection fix: add explicit docker volume mapping (resolves permission denied errors); fix: remove 'cdk:low-code' logic with false-positives for manifest-only connector detection; feat: add option to auto-print full log file text, enabled by default in CI environments Mar 25, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26b5967 and 0f8ed4b.

📒 Files selected for processing (2)
  • airbyte/_executors/docker.py (3 hunks)
  • airbyte/_executors/util.py (6 hunks)
🧰 Additional context used
🪛 GitHub Actions: Run Linters
airbyte/_executors/util.py

[warning] 2-24: I001 [*] Import block is un-sorted or un-formatted

airbyte/_executors/docker.py

[warning] 79-79: SIM222 Use True instead of ... or True


[warning] 86-86: SIM222 Use True instead of ... or True


[warning] 93-93: SIM222 Use True instead of ... or True

🪛 Ruff (0.8.2)
airbyte/_executors/docker.py

79-79: Use True instead of ... or True

Replace with True

(SIM222)


86-86: Use True instead of ... or True

Replace with True

(SIM222)


93-93: Use True instead of ... or True

Replace with True

(SIM222)

⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (10)
airbyte/_executors/docker.py (4)

5-5: Path imported to support volume mapping functionality

The Path class is needed for path operations in the new volume mapping functionality. Good addition!


10-10: Logger import for new CLI argument mapping functionality

Importing the global file logger to support logging in the new map_cli_args method. Makes sense!


13-14: Good addition of Docker container temp directory constant

Defining this constant improves code maintainability by centralizing the default temp directory path used in Airbyte connector Docker images.


24-24: Volume mapping parameter added to resolve permission issues

This change adds support for explicit Docker volume mapping, which should resolve the permission denied errors mentioned in the PR objectives. The implementation correctly initializes with an empty dictionary when no volumes are provided.

Also applies to: 27-27

airbyte/_executors/util.py (6)

14-14: Import for Docker container temp directory constant

Good import of the constant from the docker module. However, the linter is flagging this line with "I001 [*] Import block is un-sorted or un-formatted" - would it make sense to fix the import ordering here?


45-46: Updated docstring for exception handling

The docstring now correctly reflects the updated exception handling. Nice work on improving the documentation!


65-73: Improved error handling with custom exception

This change improves error handling by catching HTTPError and raising a more specific AirbyteConnectorInstallationError with a helpful message and context. This aligns with the rest of the codebase's error handling patterns.


229-238: Refactored volume mapping with dictionary structure

Using a dictionary for volume mappings improves code clarity and makes it easier to extend if needed. Good use of the newly imported constant DEFAULT_AIRBYTE_CONTAINER_TEMP_DIR as well.


245-246: Cleaner Docker command construction

Nice simplification of the Docker command construction using a loop over the volume mappings. This makes the code more maintainable and easier to understand.


256-256: Passing volumes to DockerExecutor

This completes the implementation by passing the volume mappings to the DockerExecutor, enabling the new path mapping functionality. This change directly addresses the permission denied errors mentioned in the PR objectives.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte/_executors/docker.py (1)

102-107: Consider standardizing logging approach

You have a mix of print statements and commented-out logger.debug calls. Would it make sense to standardize on one approach? For example, you could use logger.debug for all messages but fall back to print if logger is None:

-        if logger and args != new_args:
-            print(
-                f"Mapping local-to-container CLI args: {args} -> {new_args} "
-                f"based upon volume definitions: {self.volumes}"
-            )
+        if args != new_args:
+            message = (
+                f"Mapping local-to-container CLI args: {args} -> {new_args} "
+                f"based upon volume definitions: {self.volumes}"
+            )
+            if logger:
+                logger.debug(message)
+            else:
+                print(message)

What do you think? Having a consistent approach might make future maintenance easier.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f8ed4b and b113026.

📒 Files selected for processing (1)
  • airbyte/_executors/docker.py (3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
airbyte/_executors/docker.py (1)
airbyte/_executors/base.py (2)
  • Executor (150-236)
  • map_cli_args (188-194)
🪛 Ruff (0.8.2)
airbyte/_executors/docker.py

78-78: Use True instead of ... or True

Replace with True

(SIM222)


85-85: Use True instead of ... or True

Replace with True

(SIM222)


92-92: Use True instead of ... or True

Replace with True

(SIM222)

🪛 GitHub Actions: Run Linters
airbyte/_executors/docker.py

[warning] 78-78: SIM222 Use True instead of ... or True


[warning] 85-85: SIM222 Use True instead of ... or True


[warning] 92-92: SIM222 Use True instead of ... or True

⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (3)
airbyte/_executors/docker.py (3)

24-27: LGTM! New volumes parameter enhances Docker mapping capabilities.

The addition of the volumes parameter in __init__ with proper type hints and default initialization looks good. This will help resolve the permission denied errors mentioned in the PR objectives by explicitly mapping volumes between host and container.


68-108: New method for mapping CLI arguments needs some fixes

The map_cli_args method looks well-structured to handle file path mapping between host and container. However, there are a few issues:

  1. There are if logger or True conditions on lines 78, 85, and 92 that will always evaluate to True, flagged by linters.
  2. The comment on line 92 is incorrect - it says "a volume was found" when we're in the else clause that executes when no volume was found.

Could we fix these issues with something like:

-                        if logger or True:
+                        if logger:
                             # logger.debug(
                             print(
                                 f"Found file input path `{arg}` "
                                 f"relative to container-mapped volume: {local_volume}"
                             )
-                        if logger or True:
+                        if logger:
                             print(f"Mapping `{arg}` -> `{mapped_path}`")
                             # logger.debug(f"Mapping `{arg}` -> `{mapped_path}`")

                 else:
-                    # No break reached; a volume was found for this file path
-                    if logger or True:
+                    # No break reached; no volume was found for this file path
+                    if logger:
                         print(
                             f"File path `{arg}` is not relative to any volume path. "
                             "The file may not be available to the container at runtime."
                         )

Or if you want to keep the debugging print statements regardless of the logger, maybe just remove the conditional check altogether? wdyt?

🧰 Tools
🪛 Ruff (0.8.2)

78-78: Use True instead of ... or True

Replace with True

(SIM222)


85-85: Use True instead of ... or True

Replace with True

(SIM222)


92-92: Use True instead of ... or True

Replace with True

(SIM222)

🪛 GitHub Actions: Run Linters

[warning] 78-78: SIM222 Use True instead of ... or True


[warning] 85-85: SIM222 Use True instead of ... or True


[warning] 92-92: SIM222 Use True instead of ... or True


13-14: The new DEFAULT_AIRBYTE_CONTAINER_TEMP_DIR constant looks good

This constant provides a clear reference to the default temp directory location in Airbyte connector Docker images, which is helpful for standardization. Nice addition!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
.github/workflows/python_pytest.yml (2)

68-74: Enhance Log Upload Step in Pytest-fast Job
The added step to upload logs on failure is a solid move for post-mortem debugging. One quick question: have you verified that the /tmp/airbyte/logs/ directory is reliably present in this Ubuntu runner environment? Also, would you consider parameterizing the log path (e.g., via an environment variable) for flexibility in case the runner configuration changes in the future? wdyt?


196-202: Log Upload Step in Pytest Job — Review for Multi-OS Compatibility
The inclusion of a log upload step here is very helpful for diagnosing failures. However, since this job runs on multiple OS configurations (including Windows), is the hard-coded /tmp/airbyte/logs/ path applicable on non-Ubuntu runners? Would you consider parameterizing this path based on the operating system? wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95e63fa and a1fde04.

📒 Files selected for processing (1)
  • .github/workflows/python_pytest.yml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (1)
.github/workflows/python_pytest.yml (1)

122-128: Good Addition: Log Upload for Pytest-no-creds Job
This conditional log upload step is a nice enhancement for capturing error details when the job fails. Could you please confirm that the /tmp/airbyte/logs/ directory exists in this context, ensuring that logs will indeed be captured? wdyt?

@aaronsteers aaronsteers requested a review from Copilot March 25, 2025 21:32
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses several issues including fixing permission errors in Docker volume mapping, removing legacy low-code connector checks, and improving logging behavior in CI environments. Key changes include:

  • Adding explicit Docker volume mappings and updating CLI argument mapping in DockerExecutor.
  • Removing low-code connector detection logic from the registry to prevent false-positives.
  • Introducing a new option for printing full error log files and uploading logs to GitHub Artifacts in CI.

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
airbyte/_executors/docker.py Adds explicit volume mapping and CLI argument translation for Docker execution.
airbyte/constants.py Introduces a new constant to control full error log printing.
.github/workflows/python_pytest.yml Adds steps to upload logs as artifacts on test failures.
tests/integration_tests/test_lowcode_connectors.py Refactors exception handling in tests for low-code connectors.
airbyte/sources/registry.py Removes legacy low-code connector checks and adjusts registry lists.
airbyte/exceptions.py Enhances error message construction by optionally appending full log text.
airbyte/_executors/util.py Updates Docker volume handling in the connector executor retrieval utility.
airbyte/_util/temp_files.py Grants world-readable permissions on temporary log files.
airbyte/_executors/base.py Applies CLI argument mapping in the base executor’s execution flow.
tests/integration_tests/cloud/test_cloud_sync.py Updates test skip reasons with improved messaging.
airbyte/_executors/declarative.py Adjusts CLI argument mapping in the declarative executor.
Comments suppressed due to low confidence (1)

airbyte/constants.py:113

  • The constant uses os.getenv but the os module is not imported in this file. Add 'import os' at the top to avoid runtime errors.
AIRBYTE_PRINT_FULL_ERROR_LOGS: bool = _str_to_bool(os.getenv(...))

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
airbyte/_executors/docker.py (1)

71-106: Good implementation of map_cli_args, but with a few minor issues

The map_cli_args method looks well-structured and handles file path mapping between host and container. I have a couple of minor suggestions:

  1. Line 90's comment is incorrect - it says "a volume was found" when it should be "no volume was found" since this is the else clause executed when no matching volume is found.

  2. Line 84: The operation container_path / Path(arg).relative_to(local_volume) assumes container_path can be used with the / operator. If container_path is a string, this might cause type issues. Consider ensuring container_path is converted to a Path object first.

  3. The .is_relative_to() method used on line 79 was introduced in Python 3.9, which may create compatibility issues if your package needs to support older Python versions.

Would you consider these changes? wdyt?

-                # No break reached; a volume was found for this file path
+                # No break reached; no volume was found for this file path

Also, for better compatibility and to ensure proper path handling:

-                        mapped_path = container_path / Path(arg).relative_to(local_volume)
+                        mapped_path = Path(container_path) / Path(arg).relative_to(local_volume)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1fde04 and 612d280.

📒 Files selected for processing (1)
  • airbyte/_executors/docker.py (2 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
airbyte/_executors/docker.py (1)
airbyte/_executors/base.py (2)
  • Executor (150-236)
  • map_cli_args (188-194)
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (2)
airbyte/_executors/docker.py (2)

16-17: New constant introduced but not used in this file

I notice you've added a constant DEFAULT_AIRBYTE_CONTAINER_TEMP_DIR but it doesn't appear to be used within this file. Is this intended to be used elsewhere in the codebase, or should it be used somewhere in this implementation?


27-30: LGTM: Nice addition of volume mapping functionality

The addition of the volumes parameter to the constructor with proper typing and default initialization looks good. This implementation allows for mapping local paths to Docker container paths, addressing the permission denied errors mentioned in the PR objectives.

@aaronsteers aaronsteers merged commit 280e172 into main Mar 25, 2025
21 checks passed
@aaronsteers aaronsteers deleted the feat/improve-manifest-only-detection-logic-resolves-348 branch March 25, 2025 22:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Future To-do: Remove 'cdk:low-code' check once all connectors are migrated to manifest-only.
1 participant