Skip to content

Fix SSL certificate verification issue in provider data fetching #2821

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

devin-ai-integration[bot]
Copy link
Contributor

Fix SSL Certificate Verification Issue in Provider Data Fetching

Issue

This PR fixes issue #2820 where the crewai create crew command fails with an SSL certificate verification error when fetching provider data from the LiteLLM repository.

The error occurs when trying to access:

https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json

With the error message:

Error fetching provider data: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /BerriAI/litellm/main/model_prices_and_context_window.json (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1000)')))

Solution

Added a new --skip_ssl_verify flag to the crewai create crew command that allows users to bypass SSL certificate verification when fetching provider data. This is particularly useful in environments with self-signed certificates or SSL inspection proxies.

Changes

  1. Modified fetch_provider_data() in provider.py to accept a skip_ssl_verify parameter
  2. Updated get_provider_data() and load_provider_data() to pass the parameter through
  3. Added a --skip_ssl_verify flag to the create command in cli.py
  4. Updated create_crew() in create_crew.py to accept and use the parameter
  5. Added a warning message when SSL verification is disabled
  6. Added comprehensive tests for the new functionality

Testing

Added new test files:

  • tests/cli/provider_test.py: Tests for the provider module with SSL verification bypass
  • tests/cli/create_test.py: Tests for the CLI command with the new flag

All tests pass successfully.

Security Considerations

When SSL verification is disabled, a warning message is displayed to inform users about the security implications.

Link to Devin run

https://app.devin.ai/sessions/922e208b391d4a789bce34b1fd290d72

Requested by

Joe Moura ([email protected])

Copy link
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@joaomdmoura
Copy link
Collaborator

Disclaimer: This review was made by a crew of AI Agents.

Code Review Comment for PR #2821

Overview

This pull request effectively addresses SSL certificate verification issues during provider data fetching and introduces substantial test coverage. Below is a detailed analysis of the changes along with recommendations for improvements.

Key Changes

  1. SSL Verification Flag Addition: The introduction of the --skip_ssl_verify parameter in CLI commands is a significant enhancement. This allows users to choose whether to bypass SSL certificate verification. However, clear documentation and caution are necessary to avoid misuse.

  2. Error Handling Improvements: I appreciate the enhanced error handling in the fetch_provider_data function, but specific error handling for different exceptions such as requests.Timeout, requests.SSLError, and requests.RequestException would make it more robust.

    Current Implementation:

    def fetch_provider_data(cache_file, skip_ssl_verify=False):
        try:
            response = requests.get(JSON_URL, stream=True, timeout=60, verify=not skip_ssl_verify)
            response.raise_for_status()

    Suggestion:

    def fetch_provider_data(cache_file, skip_ssl_verify=False):
        try:
            response = requests.get(JSON_URL, stream=True, timeout=60, verify=not skip_ssl_verify)
            response.raise_for_status()
        except requests.Timeout:
            click.secho("Timeout while fetching provider data", fg="red")
            return None
        except requests.SSLError:
            click.secho("SSL verification failed", fg="red")
            return None
        except requests.RequestException as e:
            click.secho(f"Error fetching provider data: {str(e)}", fg="red")
            return None
  3. Cache File Management: Consider improving the error handling in cache file management. The provided implementation could fail without notifying the user if the cache file is invalid or corrupt.

    Current Implementation:

    def read_cache_file(cache_file):
        try:
            if not cache_file.exists():
                return None
            with open(cache_file, "r") as f:
                data = json.load(f)
                if not isinstance(data, dict):
                    click.secho("Invalid cache file format", fg="yellow")
                    return None
                return data

    Suggestion:

    def read_cache_file(cache_file):
        try:
            if not cache_file.exists():
                return None
            with open(cache_file, "r") as f:
                data = json.load(f)
                if not isinstance(data, dict):
                    click.secho("Invalid cache file format", fg="yellow")
                    return None
                return data
        except (json.JSONDecodeError, OSError) as e:
            click.secho(f"Error reading cache file: {str(e)}", fg="yellow")
            return None

Test Coverage Insights

While the new tests have improved coverage significantly, it is crucial to address edge cases for cache file corruption, network timeouts, and SSL certificate validation failures. Recommended tests include:

  • Validating behavior during network failures.
  • Testing the application’s response to corrupted cache files.

Documentation Updates

The documentation should include clear instructions regarding the --skip_ssl_verify flag usage:

## SSL Verification
The `--skip_ssl_verify` flag can be used to bypass SSL certificate verification:
```bash
crewai create crew mycrew --skip_ssl_verify

Warning: Using this flag is not recommended in production environments and can expose the system to security risks.


## Security Considerations
- Ensure that all security-related warnings are clearly documented.
- Suggest including logging when the `--skip_ssl_verify` flag is used to track potential security breaches.

## Recommendations
1. **Enhance Documentation**: Clearly document the implications of the new `--skip_ssl_verify` parameter.
2. **Code Refactoring**: Consider centralizing SSL-related logic in a designated module for better organization.
3. **Expand Test Suites**: Encourage adding tests for edge scenarios to cover SSL certificate handling and cache management.
4. **Improve Error Notifications**: Provide detailed feedback to users in case of errors, especially concerning SSL verification.

## Overall Assessment
The enhancements made in this PR present significant improvements to the SSL certificate verification process while maintaining a good coding standard. Addressing the suggested improvements will enhance both security and user experience. The code is ready for merging upon these adjustments being made.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant