devin-ai-integration[bot]
Contributor

This PR fixes Windows compatibility issues in the HTTP caching implementation.

Changes

  • Improved path handling in serialization.py using Path.suffix instead of string operations
  • Added better error handling with detailed logging
  • Ensured parent directories are created before file operations
  • Added more robust exception handling
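
As a rough illustration of the changes listed above, here is a minimal sketch of suffix-based path handling with parent-directory creation and logged errors. The function name `write_cache_entry` and the JSON format are assumptions made for this example; they are not taken from `serialization.py`.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def write_cache_entry(path: Path, data: dict) -> Path:
    """Hypothetical helper: write `data` as JSON and return the final path."""
    # Inspect the extension with Path.suffix rather than string slicing,
    # so Windows-style paths are handled the same way as POSIX ones.
    if path.suffix != ".json":
        path = path.with_suffix(".json")
    try:
        # Create missing parent directories before touching the file.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(data), encoding="utf-8")
    except OSError:
        # Log the failing path with a traceback, then re-raise for the caller.
        logger.exception("Failed to write cache entry to %s", path)
        raise
    return path
```

Calling `write_cache_entry(Path("cache") / "run1" / "entry", {"status": 200})` would create `cache/run1/` if needed and write `entry.json` inside it.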

Fixes Windows test failures in PR #646

Link to Devin run: https://app.devin.ai/sessions/9bbcc89c5dc047cabfe064370d8ca798
Requested by: Aaron ("AJ") Steers

Contributor Author

🤖 Devin AI Engineer

Original prompt from Aaron:

# Task: Implement HTTP Caching in PyAirbyte

## Background
We need to port the MITM (Man-In-The-Middle) proxy functionality from the Airbyte CI system into the PyAirbyte repository. This functionality allows for caching and replaying HTTP requests, which is useful for working around rate limits and situations where credentials might not be available.

## Requirements

### 1. Create a new module called `http_caching`
- Create a new module in the PyAirbyte repository called `http_caching` (separate from the existing `caches` module)
- The main class should be called `AirbyteConnectorCache`
- Implement the functionality using mitmproxy's Python API (not the command-line interface)

### 2. Configuration
- By default, cache files should be stored in a local directory called `.airbyte-http-cache`
- Create a constant that points to this directory
- Add support for an environment variable to override the cache location
- The primary means of sending traffic to this proxy should be via the HTTP_PROXY environment variable
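
The configuration points above could be sketched roughly as follows. The constant names, the environment variable name `AIRBYTE_HTTP_CACHE_DIR`, and both helper functions are hypothetical; only the default directory `.airbyte-http-cache` and the use of `HTTP_PROXY` come from the requirements.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default cache location named in the requirements.
DEFAULT_HTTP_CACHE_DIR = Path(".airbyte-http-cache")

# Hypothetical name for the override variable; the requirements do not name one.
HTTP_CACHE_DIR_ENV_VAR = "AIRBYTE_HTTP_CACHE_DIR"


def resolve_cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the override directory if the variable is set, else the default."""
    env = os.environ if env is None else env
    override = env.get(HTTP_CACHE_DIR_ENV_VAR)
    return Path(override) if override else DEFAULT_HTTP_CACHE_DIR


def proxy_env(host: str, port: int) -> dict:
    """Build the environment entry that points a connector at the proxy."""
    return {"HTTP_PROXY": f"http://{host}:{port}"}
```

Passing the environment in as a mapping keeps the lookup easy to test without mutating `os.environ`.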

### 3. Modify `get_source` implementation
- Add a new optional parameter called `http_cache` to the `get_source` function in `airbyte/sources/util.py`
- When this parameter is specified, configure the source to use the HTTP caching functionality

### 4. Core Functionality
The implementation should support:
- Recording HTTP traffic between connectors and sources
- Replaying requests to the same URL from a previous run
- Avoiding rate limiting issues
- Handling certificate management for HTTPS interception
- Proper handling of sensitive data in cached responses

## Reference Files
- `airbyte/sources/util.py` - Contains the `get_source` function that needs modification
- `airbyte/sources/registry.py` - Related to source registration
- `airbyte/__init__.py` - Main module initialization

## Notes
- This is separate from the existing data caching system in `airbyte.caches`
- Do not implement Docker wrapper functionality at this time
- Focus on making a clean Python-based implementation

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add "(aside)" to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

devin-ai-integration bot and others added 2 commits March 30, 2025 22:56
Contributor Author

Devin is currently unreachable - the session may have died.

Contributor Author

Closing due to inactivity for more than 7 days.

Contributor Author

Devin is archived and cannot be woken up. Please unarchive Devin if you want to continue using it.
