Skip to content

Conversation

aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Sep 19, 2025

Resolves an issue where PyAirbyte would crash if an explicit project or install directory was specified that did not yet exist.

Important

Auto-merge enabled.

This PR is set to merge automatically when all requirements are met.

Summary by CodeRabbit

  • New Features

    • Automatically creates missing project and install directories and resolves them to absolute paths for smoother setup.
    • Emits clear warnings when directory creation fails.
  • Improvements

    • More robust default directory handling, reducing setup friction.
    • Adds logging for directory-related warnings to aid troubleshooting.
  • Documentation

    • Updated guidance to reflect automatic directory creation and behavior for project/install paths.

Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@aj/fix/resolve-missing-project-dir-error' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@aj/fix/resolve-missing-project-dir-error'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

📝 Edit this welcome message.

@aaronsteers aaronsteers enabled auto-merge (squash) September 19, 2025 20:15
Copy link

github-actions bot commented Sep 19, 2025

PyTest Results (Fast Tests Only, No Creds)

301 tests  ±0   301 ✅ ±0   4m 24s ⏱️ -1s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit b9acd21. ± Comparison against base commit 7560c51.

♻️ This comment has been updated with latest results.

Copy link
Contributor

coderabbitai bot commented Sep 19, 2025

📝 Walkthrough

Walkthrough

Added a module-level logger and a helper function that resolves and optionally creates directories; DEFAULT_PROJECT_DIR and DEFAULT_INSTALL_DIR are now initialized via that helper. Docstrings updated and logging used for warnings. No internal column constants were removed or changed.

Changes

Cohort / File(s) Summary
Constants module updates
airbyte/constants.py
- Add import logging and logger = logging.getLogger("airbyte").
- Add helper _try_create_dir_if_missing(path: Path, desc: str = "specified") -> Path to expand, resolve, and create directories if missing; logs warnings on failure or when a path is not a directory.
- Initialize DEFAULT_PROJECT_DIR via the helper (desc "project") instead of directly from env/CWD.
- Initialize DEFAULT_INSTALL_DIR via the helper (desc "install") using AIRBYTE_INSTALL_DIR or the project dir.
- Update docstrings to document automatic creation behavior and logging; no changes to internal column constants.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Importer as Python Importer
  participant Constants as airbyte/constants.py
  participant Helper as _try_create_dir_if_missing
  participant FS as Filesystem
  participant Logger as logging

  Importer->>Constants: import constants
  activate Constants
  Note right of Constants: module-level logger created
  Constants->>Logger: logger.getLogger("airbyte")
  Logger-->>Constants: logger

  Constants->>Helper: init DEFAULT_PROJECT_DIR (desc="project")
  activate Helper
  Helper->>FS: expanduser, resolve path
  FS-->>Helper: path info
  Helper->>FS: create dir if missing
  FS-->>Helper: success/failure
  Helper->>Logger: warn on failure or if not a dir
  Helper-->>Constants: resolved Path
  deactivate Helper

  Constants->>Helper: init DEFAULT_INSTALL_DIR (desc="install")
  activate Helper
  Helper->>FS: expanduser, resolve path
  FS-->>Helper: path info
  Helper->>FS: create dir if missing
  FS-->>Helper: success/failure
  Helper->>Logger: warn on failure or if not a dir
  Helper-->>Constants: resolved Path
  deactivate Helper

  Constants-->>Importer: exported constants ready
  deactivate Constants
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Would you like me to suggest a small unit test to cover the helper’s behavior (creation, existing dir, non-directory error), wdyt?

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title accurately summarizes the primary change: adding behavior to auto-create specified project and install directories when they are missing, which matches the new _try_create_dir_if_missing helper and the updated DEFAULT_PROJECT_DIR/DEFAULT_INSTALL_DIR initializations in airbyte/constants.py; it is specific, clear, and relevant to the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch aj/fix/resolve-missing-project-dir-error

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
airbyte/constants.py (4)

7-7: Prefer logging over printing to stderr.

Switch to logging (and drop sys) for consistent, configurable observability.

-import sys
+import logging

wdyt?


68-71: Sanitize env var and avoid redundant absolute/expand at call site.

Let the helper handle expanduser()/resolve(); also trim whitespace so " " doesn't become an invalid path.

-DEFAULT_PROJECT_DIR: Path = _try_create_dir_if_missing(
-    Path(os.getenv("AIRBYTE_PROJECT_DIR", "") or Path.cwd()).expanduser().absolute(),
-    desc="project",
-)
+DEFAULT_PROJECT_DIR: Path = _try_create_dir_if_missing(
+    Path(os.getenv("AIRBYTE_PROJECT_DIR", "").strip() or Path.cwd()),
+    desc="project",
+)

wdyt?


85-95: Same cleanup for install dir.

Mirror the project dir changes for consistency.

-DEFAULT_INSTALL_DIR: Path = _try_create_dir_if_missing(
-    Path(os.getenv("AIRBYTE_INSTALL_DIR", "") or DEFAULT_PROJECT_DIR).expanduser().absolute(),
-    desc="install",
-)
+DEFAULT_INSTALL_DIR: Path = _try_create_dir_if_missing(
+    Path(os.getenv("AIRBYTE_INSTALL_DIR", "").strip() or DEFAULT_PROJECT_DIR),
+    desc="install",
+)

wdyt?


68-71: Consider gating import-time I/O.

Creating directories during module import can surprise consumers (e.g., read-only FS, import-time side effects). Would you consider creating only when an env var is explicitly set or performing creation lazily at first use, behind a small helper (e.g., get_project_dir()/get_install_dir()), while keeping the current behavior under a feature flag like AIRBYTE_AUTO_CREATE_DIRS=1? wdyt?

Also applies to: 85-95

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 503b124 and 16d4da2.

📒 Files selected for processing (1)
  • airbyte/constants.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@aaronsteers
Copy link
Contributor Author

aaronsteers commented Sep 19, 2025

/fix-pr

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.
(This job requires that the PR author has "Allow edits from maintainers" enabled.)

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
airbyte/constants.py (5)

14-14: Unused DEBUG_MODE constant.

This isn’t referenced; remove it or wire it into debug logs to avoid dead code, wdyt?


43-58: File-collision should fail fast; add debug on successful create.

If the path exists as a file, returning it will defer a harder-to-diagnose error downstream. Shall we raise NotADirectoryError and also emit a debug log on successful auto-create? wdyt?

@@
-    try:
-        if resolved_path.exists():
-            if not resolved_path.is_dir():
-                logger.warning(
-                    "The %s path exists but is not a directory: '%s'", desc, resolved_path
-                )
-            return resolved_path
-        resolved_path.mkdir(parents=True, exist_ok=True)
+    try:
+        if resolved_path.exists():
+            if not resolved_path.is_dir():
+                raise NotADirectoryError(
+                    f"The {desc} path exists but is not a directory: '{resolved_path}'"
+                )
+            return resolved_path
+        resolved_path.mkdir(parents=True, exist_ok=True)
+        logger.debug("Auto-created missing %s directory at '%s'", desc, resolved_path)

61-64: Drop redundant expand/absolute at call site; strip env var whitespace.

The helper already does expanduser().resolve(). Can we simplify and also guard against whitespace-only env values? wdyt?

-DEFAULT_PROJECT_DIR: Path = _try_create_dir_if_missing(
-    Path(os.getenv("AIRBYTE_PROJECT_DIR", "") or Path.cwd()).expanduser().absolute(),
-    desc="project",
-)
+DEFAULT_PROJECT_DIR: Path = _try_create_dir_if_missing(
+    Path(os.getenv("AIRBYTE_PROJECT_DIR", "").strip() or Path.cwd()),
+    desc="project",
+)

78-81: Same cleanup for install dir.

Match the project dir simplification and whitespace handling, wdyt?

-DEFAULT_INSTALL_DIR: Path = _try_create_dir_if_missing(
-    Path(os.getenv("AIRBYTE_INSTALL_DIR", "") or DEFAULT_PROJECT_DIR).expanduser().absolute(),
-    desc="install",
-)
+DEFAULT_INSTALL_DIR: Path = _try_create_dir_if_missing(
+    Path(os.getenv("AIRBYTE_INSTALL_DIR", "").strip() or DEFAULT_PROJECT_DIR),
+    desc="install",
+)

61-64: Import-time side effects: confirm this is acceptable.

Initializing these constants triggers directory creation during module import when env vars point to missing paths. Is this intentional for all import contexts (e.g., CLIs vs. libraries, read-only filesystems)? If not, should we defer creation until first use or gate via an env like AIRBYTE_AUTO_CREATE_DIRS=true? Happy to sketch a lazy init helper if you’d like, wdyt?

Also applies to: 78-81

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 16d4da2 and b9acd21.

📒 Files selected for processing (1)
  • airbyte/constants.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (1)
airbyte/constants.py (1)

6-11: Logger name: ensure consistency across the package?

Would you prefer logging.getLogger(__name__) here for hierarchical control, or is "airbyte" the established convention elsewhere in the codebase? If mixed, can we standardize? wdyt?

Copy link

PyTest Results (Full)

364 tests  ±0   348 ✅ ±0   20m 31s ⏱️ +20s
  1 suites ±0    16 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit b9acd21. ± Comparison against base commit 7560c51.

@aaronsteers aaronsteers merged commit 6009b69 into main Sep 19, 2025
22 checks passed
@aaronsteers aaronsteers deleted the aj/fix/resolve-missing-project-dir-error branch September 19, 2025 21:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant