Skip to content

Conversation

aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Aug 13, 2025

fix: use explicit connector versions from registry instead of 'latest' tags

Summary

This PR addresses an issue where some Java connectors no longer receive 'latest' tag updates in DockerHub, potentially causing PyAirbyte to use outdated connector versions. The solution modifies the Docker executor creation logic to:

  1. Fetch explicit versions from registry: When no version is specified, the system now attempts to retrieve latest_available_version from the connector registry metadata
  2. Use explicit version tags: Instead of defaulting to 'latest', it constructs Docker image tags with explicit version numbers (e.g., airbyte/source-faker:1.2.3)
  3. Graceful fallback with warnings: Falls back to 'latest' with descriptive warning messages when:
    • Registry is unavailable or offline mode is enabled
    • Connector metadata exists but lacks version information

The change maintains full backward compatibility - explicit versions still work as before, and the fallback ensures existing workflows continue to function.

Review & Testing Checklist for Human

  • Test version resolution with real connectors: Verify that common connectors (source-faker, source-postgres, etc.) correctly resolve to explicit versions from the registry instead of 'latest'
  • Validate fallback behavior: Test offline mode (AIRBYTE_OFFLINE_MODE=1) to ensure it properly falls back to 'latest' with appropriate warnings
  • Check warning message frequency: Ensure warning messages aren't too noisy in normal usage - verify that most registered connectors have latest_available_version data
  • Verify explicit version override: Test that specifying an explicit version parameter still works correctly and bypasses registry lookup
  • End-to-end connector execution: Run a complete source operation to ensure the resolved Docker images actually work and can be pulled/executed

Recommended test plan:

  1. Create sources for 3-4 different connectors without specifying versions
  2. Check logs for version resolution messages
  3. Verify Docker images use explicit versions (not 'latest')
  4. Test one connector in offline mode
  5. Test one connector with explicit version specified

Diagram

%%{ init : { "theme" : "default" }}%%
graph TD
    GetConnectorExecutor["get_connector_executor()<br/>util.py:161-385"]:::major-edit
    RegistryMetadata["get_connector_metadata()<br/>registry.py:191-218"]:::context
    Constants["AIRBYTE_OFFLINE_MODE<br/>constants.py"]:::context
    DockerImage["Docker Image Tag<br/>Construction Logic"]:::major-edit
    
    GetConnectorExecutor --> RegistryMetadata
    GetConnectorExecutor --> Constants
    GetConnectorExecutor --> DockerImage
    
    DockerImage --> ExplicitVersion["Use metadata.latest_available_version"]:::major-edit
    DockerImage --> FallbackLatest["Fallback to 'latest' + warning"]:::major-edit
    
    subgraph Legend
        L1[Major Edit]:::major-edit
        L2[Minor Edit]:::minor-edit  
        L3[Context/No Edit]:::context
    end

classDef major-edit fill:#90EE90
classDef minor-edit fill:#87CEEB
classDef context fill:#FFFFFF
Loading

Notes

  • Key change location: The main logic is in airbyte/_executors/util.py lines 283-302, where Docker image tag construction now includes version resolution
  • Logging added: New warning messages help users understand when and why 'latest' fallback is used
  • Registry dependency: Success depends on connector registry having reliable latest_available_version data
  • Testing limitation: Full functionality testing was limited due to environment setup issues, making human testing especially important

Link to Devin run: https://app.devin.ai/sessions/fb2a065dd9a54e5cad797db5a3790ffc
Requested by: @aaronsteers

Summary by CodeRabbit

  • New Features

    • Adds runtime warnings to inform users when the system falls back to a default Docker tag, clarifying potential version risks.
  • Bug Fixes

    • Improves Docker image tag resolution when no tag is provided: uses an explicit version if supplied, prefers the latest available version from metadata if present, and falls back to “latest” (with a warning) if metadata is unavailable.
    • Keeps non-Docker install flows unchanged.

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

…' tags

- Modify Docker executor creation to fetch latest_available_version from registry
- Fall back to 'latest' with warning when registry unavailable or offline mode
- Add proper logging for fallback scenarios
- Maintains backward compatibility for all existing use cases

Co-Authored-By: AJ Steers <[email protected]>
@Copilot Copilot AI review requested due to automatic review settings August 13, 2025 21:36
Copy link
Contributor

Original prompt from AJ Steers
@Devin - Several java connectors are no longer getting a 'latest' tag update in dockerhub. This means we are in some cases using older versions of the connector. Find the code in PyAirbyte that finds/runs docker images and use the connector registry to detect the latest version. Run with that explicit version if no other version is requested, instead of falling back to 'latest'. If you cannot find the connector in the JSON registry, such as if the connector is not registered or if we are using OFFLINE_MODE=true, then you can still try to use 'latest' (if that's what we're doing today) but you should do so while printing a warning that we are falling back to using 'latest' due to not being able to detect a valid latest version number.

Copy link
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1755120610-fix-docker-latest-tags' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1755120610-fix-docker-latest-tags'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

📝 Edit this welcome message.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes an issue where Java connectors may use outdated versions due to missing 'latest' tags in DockerHub by implementing explicit version resolution from the connector registry.

  • Adds version resolution logic that fetches latest_available_version from registry metadata when no version is specified
  • Replaces 'latest' tag usage with explicit version numbers in Docker image tags
  • Implements graceful fallback to 'latest' with warning messages for offline mode or missing registry data

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link

github-actions bot commented Aug 13, 2025

PyTest Results (Fast Tests Only, No Creds)

301 tests  ±0   301 ✅ ±0   4m 17s ⏱️ -1s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit ec6b3aa. ± Comparison against base commit 07e843e.

♻️ This comment has been updated with latest results.

Copy link
Contributor

coderabbitai bot commented Aug 13, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds a module-level logger and refactors Docker image tag resolution in airbyte/_executors/util.py: if no explicit tag is given, prefer provided version; else use metadata.latest_available_version when available; otherwise fall back to "latest" and emit a warning via the new logger. Other install flows unchanged.

Changes

Cohort / File(s) Summary
Executor util changes
airbyte/_executors/util.py
Added logger = get_global_file_logger(); when docker_image lacks an explicit tag, resolve tag by: 1) using provided version if present, 2) using metadata.latest_available_version if version is None and metadata available, 3) falling back to "latest" and logging a warning when no metadata/latest available. Only the docker tag resolution path was modified; other install flows remain unchanged.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant util as util.resolve_docker_image_tag
    participant Metadata

    Caller->>util: request docker_image (no explicit tag)
    util->>util: check provided version
    alt version present
        util-->>Caller: return docker_image:version
    else version missing
        util->>Metadata: fetch metadata
        alt metadata.latest_available_version present
            util-->>Caller: return docker_image:metadata.latest_available_version
        else metadata missing or no latest_available_version
            util->>util: logger.warn("falling back to 'latest'")
            util-->>Caller: return docker_image:latest
        end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15 minutes

Possibly related PRs

Suggested reviewers

  • bnchrch
  • maxi297

Would you like me to also scan for other modules that reference this util function to ensure consistent behavior across callers, wdyt?

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch devin/1755120610-fix-docker-latest-tags

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
airbyte/_executors/util.py (1)

268-280: Bug: colon-based tag check breaks for images with a registry port (e.g., host:5000/repo/image).

The condition if version is not None and ":" in docker_image: will incorrectly treat images from registries with a port (e.g., my-registry:5000/airbyte/source-faker) as if they already include a tag. This will:

  • Wrongly raise PyAirbyteInputError when the user supplies version along with a ported image without a tag.
  • Bypass your new version resolution logic, implicitly relying on Docker’s default “latest” tag.

Could we detect tags/digests only in the last path segment (and consider @digest) to disambiguate ports, wdyt?

Proposed fix:

-        if version is not None and ":" in docker_image:
+        # Detect explicit tag/digest ignoring registry ports (e.g., ghcr.io:443/..., localhost:5000/...)
+        name_segment = docker_image.rsplit("/", 1)[-1]
+        has_tag = ":" in name_segment
+        has_digest = "@" in docker_image  # e.g., image@sha256:...
+        if version is not None and (has_tag or has_digest):
             raise exc.PyAirbyteInputError(
                 message="The 'version' parameter is not supported when a tag is already set in the "
                 "'docker_image' parameter.",
                 context={
                     "docker_image": docker_image,
                     "version": version,
                 },
             )
🧹 Nitpick comments (3)
airbyte/_executors/util.py (3)

31-31: LGTM on module-level logger.

Using a per-module logger is the right approach for library code. Do you also want to consider adding a NullHandler at the package root so library users don’t see “No handler found” warnings if they haven’t configured logging, wdyt?


283-304: Optional: tailor the warning message when AIRBYTE_OFFLINE_MODE is enabled.

Right now the same “could not be determined from the registry” message is used both for offline mode and unknown metadata. Would it reduce confusion to explicitly mention “offline mode is enabled; skipping registry lookup” when AIRBYTE_OFFLINE_MODE is true, wdyt?


268-304: Add unit tests for edge cases (ports/digests/offline).

Given environment constraints, could we cover the decision logic with lightweight unit tests that mock registry and environment, wdyt? Suggested cases:

  • docker_image="airbyte/source-faker" with no version: picks metadata.latest_available_version.
  • AIRBYTE_OFFLINE_MODE=1: falls back to VERSION_LATEST and logs the offline warning.
  • docker_image="my-registry:5000/airbyte/source-faker" with no version: still appends resolved version (port should not be treated as a tag).
  • docker_image="airbyte/source-faker@sha256:…" with or without version: does not append version.
  • docker_image="airbyte/source-faker:1.2.3" with version provided: raises PyAirbyteInputError.
  • docker_image="my-registry:5000/airbyte/source-faker" with version="1.2.3": appends ":1.2.3".

I can draft pytest-style tests that monkeypatch get_connector_metadata, set AIRBYTE_OFFLINE_MODE, and assert the constructed docker_cmd args without invoking Docker. Want me to put together a test scaffold, wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 07e843e and 79f33a4.

📒 Files selected for processing (1)
  • airbyte/_executors/util.py (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
airbyte/_executors/util.py (1)
airbyte/_connector_base.py (1)
  • name (84-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (2)
airbyte/_executors/util.py (2)

5-5: LGTM on introducing logging.

Importing logging at the module level is appropriate for the new warnings you added. No concerns here.


285-287: Verify ConnectorMetadata.latest_available_version — exists and is sourced from registry

Quick summary of findings from the repo search:

  • airbyte/sources/registry.py
    • ConnectorMetadata defines latest_available_version: str | None (around line 63).
    • _registry_entry_to_connector_metadata(...) sets latest_available_version = entry.get("dockerImageTag") when building metadata (around lines 135–139).
  • airbyte/_executors/util.py (around lines 282–289) uses metadata.latest_available_version as the fallback when resolved_version is None.
  • airbyte/_executors/base.py (around lines 181–186) uses metadata.latest_available_version when target_version == _LATEST_VERSION.
  • tests/integration_tests/test_source_test_fixture.py assert and parametrize behavior that expects latest_available_version to be populated (e.g., "0.0.1" in the fixture).

Conclusion and question: the attribute exists and is populated from the registry entry's dockerImageTag, so connectors that have a dockerImageTag (typical Docker-based connectors) will get a value; connectors without that tag will have None. Could you confirm that Java connectors in your registry include dockerImageTag (or that you’re OK with None for those), or should we add/consider a different source for Java connector versions? wdyt?

from airbyte._executors.base import Executor


logger = logging.getLogger(__name__)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get a logger from the logs module.

devin-ai-integration bot and others added 2 commits August 13, 2025 21:48
…ning conditions

- Replace standard logging.getLogger() with get_global_file_logger() from airbyte.logs
- Separate offline mode and missing metadata warning conditions for clarity
- Make warning messages more specific for different failure scenarios
- Fix line length violations to meet project standards

Co-Authored-By: AJ Steers <[email protected]>
- Remove separate handling of OFFLINE_MODE in warning conditions
- Use single generic warning when falling back to 'latest' because no version was requested and we couldn't locate the latest version
- Maintain core functionality of using explicit registry versions when available

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
airbyte/_executors/util.py (1)

272-296: Fix tag/digest detection (ports misclassified), use VERSION_LATEST, and resolve lint E501.

  • The test if ":" not in docker_image misclassifies images from registries with ports (e.g. my-reg:5000/airbyte/source) and also doesn’t account for digests. This can skip your explicit tag resolution and undermine the PR’s goal.
  • Reuse a single detection for has_tag and has_digest, and use it both where you validate version conflicts and where you decide to append a tag.
  • Prefer VERSION_LATEST constant over the string literal "latest".
  • Address the Ruff E501 failure at Line 292 by wrapping the warning string.
  • Optional but clearer: tailor the warning for offline mode vs registry/metadata issues (matches prior review suggestions), wdyt?

Proposed patch:

@@
-        if version is not None and ":" in docker_image:
+        # Detect whether the image already includes an explicit tag or digest.
+        last_slash = docker_image.rfind("/")
+        has_digest = "@" in docker_image
+        has_tag = (":" in docker_image[last_slash + 1:]) and not has_digest
+
+        if version is not None and (has_tag or has_digest):
             raise exc.PyAirbyteInputError(
                 message="The 'version' parameter is not supported when a tag is already set in the "
                 "'docker_image' parameter.",
                 context={
                     "docker_image": docker_image,
                     "version": version,
                 },
             )
@@
-        if ":" not in docker_image:
+        if not (has_tag or has_digest):
             resolved_version = version
             if resolved_version is None:
                 if metadata and metadata.latest_available_version:
                     resolved_version = metadata.latest_available_version
                 else:
-                    resolved_version = "latest"
-                    if logger:
-                        logger.warning(
-                            f"Using 'latest' tag for connector '{name}' because no explicit "
-                            f"version was specified and we could not locate the latest version number."
-                        )
+                    resolved_version = VERSION_LATEST
+                    # Lazily get a logger to avoid import-time side effects.
+                    local_logger = logger or get_global_file_logger()
+                    if local_logger:
+                        # Provide a more specific warning depending on the condition and keep lines <= 100 chars.
+                        if AIRBYTE_OFFLINE_MODE:
+                            msg = (
+                                f"Using '{VERSION_LATEST}' tag for connector '{name}' because PyAirbyte "
+                                "is running in offline mode and no explicit version was specified."
+                            )
+                        elif metadata is None:
+                            msg = (
+                                f"Using '{VERSION_LATEST}' tag for connector '{name}' because the registry "
+                                "could not be reached and no explicit version was specified."
+                            )
+                        else:
+                            msg = (
+                                f"Using '{VERSION_LATEST}' tag for connector '{name}' because no explicit "
+                                "version was specified and no latest version was found in the registry."
+                            )
+                        local_logger.warning(msg)
 
             docker_image = f"{docker_image}:{resolved_version}"

This also resolves the pipeline error Ruff E501 line too long (Line 292) by wrapping the message. Wdyt?

🧹 Nitpick comments (1)
airbyte/_executors/util.py (1)

31-31: Consider lazy-initializing the logger to avoid import-time side effects, wdyt?

get_global_file_logger writes to disk and prints a message to stderr when configured. Doing this at import time can introduce unintended side effects. Would you consider lazy-initializing at the first log site instead? You’re already guarding with an existence check, so a localized lazy lookup is straightforward.

Example inside the warning site (no global reassignment needed):

-                    if logger:
-                        logger.warning(
+                    local_logger = logger or get_global_file_logger()
+                    if local_logger:
+                        local_logger.warning(
                             f"Using 'latest' tag for connector '{name}' because no explicit "
                             f"version was specified and we could not locate the latest version number."
                         )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79f33a4 and da2445d.

📒 Files selected for processing (1)
  • airbyte/_executors/util.py (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
airbyte/_executors/util.py (2)
airbyte/logs.py (1)
  • get_global_file_logger (126-200)
airbyte/_connector_base.py (1)
  • name (84-86)
🪛 GitHub Actions: Run Linters
airbyte/_executors/util.py

[error] 292-292: Ruff: E501 line too long (103 > 100). Command: 'poetry run ruff check .'

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (1)
airbyte/_executors/util.py (1)

22-22: Good move: using the project logger source.

Importing get_global_file_logger centralizes logging behavior and is consistent with the rest of the codebase. LGTM.

Copy link

github-actions bot commented Aug 13, 2025

PyTest Results (Full)

364 tests  ±0   350 ✅ ±0   21m 35s ⏱️ + 3m 3s
  1 suites ±0    14 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit ec6b3aa. ± Comparison against base commit 07e843e.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
airbyte/_executors/util.py (2)

272-297: Bug: tag detection misclassifies images with registry ports and digest refs; also prefer VERSION_LATEST and clearer warning reasons.

  • ":" in docker_image treats registry ports (e.g., ghcr.io:443/airbyte/source) as if a tag is set, skipping explicit version resolution. That undermines the PR’s goal and can accidentally use Docker’s implicit latest without warning.
  • Digest refs (image@sha256:...) should be treated as “explicit reference present” and skip adding a tag; current logic doesn’t account for that.
  • Use VERSION_LATEST instead of the literal "latest".
  • Split warnings for offline mode vs metadata unavailability for clarity (as suggested earlier).

Proposed fix (includes robust reference parsing and clearer warnings):

@@
-        if version is not None and ":" in docker_image:
+        # Determine whether the docker reference already has an explicit tag or digest.
+        has_tag, has_digest = _docker_ref_has_explicit_reference(docker_image)
+        if version is not None and (has_tag or has_digest):
             raise exc.PyAirbyteInputError(
                 message="The 'version' parameter is not supported when a tag is already set in the "
                 "'docker_image' parameter.",
                 context={
                     "docker_image": docker_image,
                     "version": version,
                 },
             )
@@
-        if ":" not in docker_image:
+        if not (has_tag or has_digest):
             resolved_version = version
             if resolved_version is None:
-                if metadata and metadata.latest_available_version:
+                if metadata and getattr(metadata, "latest_available_version", None):
                     resolved_version = metadata.latest_available_version
                 else:
-                    resolved_version = "latest"
-                    if logger:
-                        logger.warning(
-                            f"Using 'latest' tag for connector '{name}' because no explicit "
-                            f"version was specified and we could not locate the latest "
-                            f"version number."
-                        )
+                    resolved_version = VERSION_LATEST
+                    if logger:
+                        if AIRBYTE_OFFLINE_MODE:
+                            logger.warning(
+                                f"Using '{VERSION_LATEST}' tag for connector '{name}' because no explicit "
+                                f"version was specified and registry lookups are disabled in offline mode. "
+                                f"Consider specifying an explicit version or disabling offline mode to fetch "
+                                f"the latest version from the registry."
+                            )
+                        elif metadata is None:
+                            logger.warning(
+                                f"Using '{VERSION_LATEST}' tag for connector '{name}' because no explicit "
+                                f"version was specified and connector metadata is unavailable. "
+                                f"Consider specifying an explicit version or check your internet connection."
+                            )
+                        else:
+                            logger.warning(
+                                f"Using '{VERSION_LATEST}' tag for connector '{name}' because no explicit "
+                                f"version was specified and the registry did not provide a latest_available_version."
+                            )
 
             docker_image = f"{docker_image}:{resolved_version}"

Helper to add (outside this range), to robustly detect tags/digests (colon after last slash means a tag; “@” means digest):

def _docker_ref_has_explicit_reference(image: str) -> tuple[bool, bool]:
    """Return (has_tag, has_digest) for a docker image reference.

    - Treat colon after the last slash as a tag (e.g., repo/name:1.2.3).
    - Treat '@' as digest (e.g., repo/name@sha256:...).
    - Avoid misclassifying registry ports (e.g., ghcr.io:443/repo/name).
    """
    if "@" in image:
        return False, True
    last_slash = image.rfind("/")
    last_colon = image.rfind(":")
    has_tag = last_colon > last_slash  # colon after last slash => tag, not registry port
    return has_tag, False

This change ensures we only append a tag when neither a tag nor a digest is present, while still allowing registries with ports. It also clarifies the fallback reason for users and uses the VERSION_LATEST constant. Wdyt?


288-294: Use the VERSION_LATEST constant instead of the string literal and tailor the warning.

Minor nit: please switch "latest" to VERSION_LATEST and consider distinguishing offline mode vs missing metadata to improve operator clarity, wdyt?

🧹 Nitpick comments (1)
airbyte/_executors/util.py (1)

31-31: Consider lazy/fallback logging to avoid silent warnings when logger is None.

get_global_file_logger() can return None; the current code no-ops in that case. Do we want a minimal stderr fallback to ensure users see the warning when a file logger isn’t configured, wdyt?

For example, you could centralize warn emission:

def _warn(msg: str) -> None:
    if logger:
        logger.warning(msg)
    else:
        print(f"[warn] {msg}", file=sys.stderr)

…and then call _warn(...) below.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between da2445d and ec6b3aa.

📒 Files selected for processing (1)
  • airbyte/_executors/util.py (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
airbyte/_executors/util.py (2)
airbyte/logs.py (1)
  • get_global_file_logger (126-200)
airbyte/_connector_base.py (1)
  • name (84-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (1)
airbyte/_executors/util.py (1)

22-22: Good call: using the central logger source.

Importing get_global_file_logger from airbyte.logs aligns with the project’s logging entrypoint. LGTM.

@aaronsteers aaronsteers enabled auto-merge (squash) August 13, 2025 23:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant