[SKV-3992] Add OPENAI_COMPATIBLE for githubcopilot.com #3993

base: main
Conversation
Walkthrough

The changes enhance GitHub Copilot integration within the LLM API handler, extract token-counting logic into reusable methods, introduce a shared image-message detection utility, implement task history truncation for payload optimization, and add screenshot persistence to task execution.
Sequence Diagrams

```mermaid
sequenceDiagram
    participant Handler as LLM API Handler
    participant Config as GitHub Copilot Config
    participant Caller as LLMCaller
    participant API as OpenAI API

    Handler->>Handler: Check if OPENAI_COMPATIBLE
    Handler->>Config: Read OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN
    Handler->>Caller: Initialize with Copilot domain
    Note over Caller: Message contains images?
    alt Has Images
        Caller->>Caller: Add Copilot-Vision-Request header
    end
    Caller->>API: Send request with headers
    API-->>Caller: Response with token usage
    Caller->>Handler: Extract token counts
    Handler->>Handler: Parse input, output, reasoning, cached tokens
```
```mermaid
sequenceDiagram
    participant Service as Task V2 Service
    participant Truncate as Truncation Helper
    participant LLM as LLM Handler

    Service->>Service: Scrape page
    Service->>Service: Save first screenshot to temp dir
    Note over Service: Build completion check prompt
    Service->>Truncate: _truncate_task_history_for_completion_check
    Truncate->>Truncate: Select last N entries
    Truncate->>Truncate: _truncate_extracted_data per entry
    Truncate-->>Service: Truncated history
    Service->>LLM: Send with truncated task_history
    LLM-->>Service: Completion check result
```
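The second diagram corresponds to two small helpers in task_v2_service.py. A minimal sketch reconstructed from the diff hunks quoted later in this review (the real bodies may differ slightly):

```python
def _truncate_extracted_data(entry: dict, max_extracted_data_size: int = 1000) -> dict:
    """Truncate the extracted_data field of a single history entry to bound payload size."""
    entry_copy = entry.copy()
    if "extracted_data" in entry_copy:
        data_str = str(entry_copy["extracted_data"])
        if len(data_str) > max_extracted_data_size:
            entry_copy["extracted_data"] = data_str[:max_extracted_data_size] + "... [truncated]"
    return entry_copy


def _truncate_task_history_for_completion_check(task_history: list[dict], max_entries: int = 3) -> list[dict]:
    """Keep only the last max_entries entries, truncating each entry's extracted data."""
    return [_truncate_extracted_data(entry) for entry in task_history[-max_entries:]]
```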
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20-25 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning), ✅ Passed checks (2 passed)
Force-pushed from c4baa45 to 3e96487, then from 3e96487 to 77242ce.

…rFactory._get_check_model calls back to _apply_thinking_budget_optimization
```python
    check_model = llm_config.model_list[0].model_name or model_label  # type: ignore[attr-defined]
except Exception:
    check_model = model_label

model_label = LLMAPIHandlerFactory._get_model_label(llm_config)
```
Used in 4 other places
```python
self.openai_client = AsyncOpenAI(api_key=settings.OPENROUTER_API_KEY, base_url=settings.OPENROUTER_API_BASE)
elif (
    self.llm_key == "OPENAI_COMPATIBLE"
    and settings.OPENAI_COMPATIBLE_API_BASE
```
Saw the # OPENAI COMPATIBLE comment and other content related to it there, so added the base URL there
```python
        return _normalize(left) == _normalize(right)

    @staticmethod
    def _extract_token_counts(response: ModelResponse | CustomStreamWrapper) -> tuple[int, int, int, int]:
```
Saw dupe logic and this as a quick win
Important
Looks good to me! 👍
Reviewed everything up to e3f9bbd in 1 minute and 34 seconds.
- Reviewed 608 lines of code in 6 files
- Skipped 0 files when reviewing
- Skipped posting 5 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. skyvern/utils/env_paths.py:9
   - Draft comment: resolve_backend_env_path is straightforward. Consider caching its returned Path if it's used frequently, as the working directory is unlikely to change.
   - Reason this comment was not posted: Confidence changes required: 50% <= threshold 50%. None.
2. skyvern/utils/env_paths.py:16
   - Draft comment: resolve_frontend_env_path uses multiple fallbacks to locate the frontend directory. Consider adding logging when no directory is found and caching the result to avoid repeated file system traversal.
   - Reason this comment was not posted: Confidence changes required: 50% <= threshold 50%. None.
3. skyvern/forge/sdk/api/llm/api_handler_factory.py:26
   - Draft comment: Typo: There is an extraneous closing parenthesis on this line that should be removed.
   - Reason this comment was not posted: Comment was on unchanged code.
4. skyvern/forge/sdk/api/llm/api_handler_factory.py:118
   - Draft comment: Typographical inconsistency: the comment uses "litellm" while the docstring refers to "LiteLLM". Please use consistent capitalization.
   - Reason this comment was not posted: Comment was on unchanged code.
5. skyvern/services/task_v2_service.py:17
   - Draft comment: It seems that line 17 contains just a stray `)` which doesn't look intentional. Please verify whether this is a typographical error or if additional code was intended on this line.
   - Reason this comment was not posted: Comment was on unchanged code.
Workflow ID: wflow_zTAj8PHOc3w0zLCK
Pull request overview
This PR adds support for GitHub Copilot as an OpenAI-compatible LLM provider and implements task history truncation to prevent payload size errors when interacting with LLM APIs that have strict payload limits.
Key Changes
- GitHub Copilot Integration: Added detection and handling for GitHub Copilot endpoints via OPENAI_COMPATIBLE configuration, including special header support (Copilot-Vision-Request: true) for vision-enabled requests
- Payload Optimization: Implemented task history truncation (limiting entries and extracted data size) and reduced screenshot count to 3 to prevent "Request Entity Too Large" errors
- Code Refactoring: Extracted duplicate token counting logic into reusable methods and consolidated image message detection into a shared utility function
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Summary per file:
| File | Description |
|---|---|
| skyvern/config.py | Added OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN configuration setting for identifying GitHub Copilot endpoints |
| skyvern/utils/env_paths.py | Simplified resolve_backend_env_path by removing unnecessary intermediate variable |
| skyvern/forge/sdk/api/llm/utils.py | Introduced is_image_message() utility function to detect image messages in LLM requests |
| skyvern/forge/sdk/api/llm/ui_tars_llm_caller.py | Refactored to use shared is_image_message() utility instead of local implementation |
| skyvern/forge/sdk/api/llm/api_handler_factory.py | Major changes: extracted _extract_token_counts() helper, added _get_model_label() and _get_check_model() helpers, implemented GitHub Copilot detection and custom header logic, updated cost calculation to skip OPENAI_COMPATIBLE providers |
| skyvern/services/task_v2_service.py | Implemented _truncate_task_history_for_completion_check() and _truncate_extracted_data() functions, limited screenshots to 3, added streaming screenshot save functionality, applied truncation at key points to reduce LLM payload size |
```python
    max_extracted_data_size: int = 1000,
) -> dict:
    """Truncate extracted_data field in a single entry to reduce payload size."""
    entry_copy = entry.copy()
```
Copilot AI commented on Nov 23, 2025:
The _truncate_extracted_data function performs a shallow copy of the entry dictionary, which could be problematic if extracted_data contains nested mutable objects (like lists or dicts). If the entry is later modified elsewhere, the truncated version could inadvertently share mutable references with the original.
Consider using copy.deepcopy() or explicitly creating new nested structures to ensure complete isolation:
```python
import copy

def _truncate_extracted_data(
    entry: dict,
    max_extracted_data_size: int = 1000,
) -> dict:
    """Truncate extracted_data field in a single entry to reduce payload size."""
    entry_copy = copy.deepcopy(entry)
    if "extracted_data" in entry_copy:
        data_str = str(entry_copy["extracted_data"])
        if len(data_str) > max_extracted_data_size:
            entry_copy["extracted_data"] = data_str[:max_extracted_data_size] + "... [truncated]"
    return entry_copy
```

This ensures that modifications to the truncated history won't affect the original task_history list.
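A quick standalone illustration of the aliasing risk (plain Python, independent of the Skyvern code):

```python
import copy

entry = {"step": 1, "extracted_data": {"items": [1, 2, 3]}}

shallow = entry.copy()
shallow["extracted_data"]["items"].append(4)  # mutates the nested dict shared with `entry`
print(entry["extracted_data"]["items"])       # [1, 2, 3, 4]: the original changed too

deep = copy.deepcopy(entry)
deep["extracted_data"]["items"].append(5)
print(entry["extracted_data"]["items"])       # still [1, 2, 3, 4]: the original is isolated
```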
```python
current_url = current_url if current_url else str(await SkyvernFrame.get_url(frame=page) if page else url)

# Truncate task_history to reduce payload size (keep last 3 entries for context)
truncated_task_history = _truncate_task_history_for_completion_check(task_history, max_entries=3)
```
Copilot AI commented on Nov 23, 2025:
[nitpick] The hardcoded value max_entries=3 for task history truncation during completion checks is lower than the default value of 5 used elsewhere (e.g., line 1787). This inconsistency could lead to confusion and maintenance issues.
Consider either:
- Using the same default value consistently across all invocations, or
- Documenting why different max_entries values are needed in different contexts (e.g., completion checks need fewer entries than summarization)
If the different values are intentional, consider extracting them as named constants to make the rationale clearer:
```python
MAX_ENTRIES_FOR_COMPLETION_CHECK = 3
MAX_ENTRIES_FOR_SUMMARIZATION = 5
```

```python
if (
    llm_key == "OPENAI_COMPATIBLE"
    and settings.OPENAI_COMPATIBLE_API_BASE
    and settings.OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN in settings.OPENAI_COMPATIBLE_API_BASE
):
```
Copilot AI commented on Nov 23, 2025:
The GitHub Copilot detection logic is duplicated in multiple places (lines 607-611 and lines 959-963, 1243-1246). This violates the DRY principle and makes maintenance harder - if the detection logic needs to change, it must be updated in all three places.
Consider extracting this into a helper method:
```python
@staticmethod
def _is_github_copilot_endpoint() -> bool:
    """Check if the OPENAI_COMPATIBLE endpoint is GitHub Copilot."""
    return (
        settings.OPENAI_COMPATIBLE_API_BASE is not None
        and settings.OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN in settings.OPENAI_COMPATIBLE_API_BASE
    )
```

Then use it consistently:

```python
if llm_key == "OPENAI_COMPATIBLE" and LLMAPIHandlerFactory._is_github_copilot_endpoint():
    # GitHub Copilot-specific logic
```
Open to this if people feel it's needed.
Actionable comments posted: 0
♻️ Duplicate comments (1)
skyvern/forge/sdk/api/llm/api_handler_factory.py (1)
959-968: Same domain detection concern applies here. The Copilot endpoint initialization correctly mirrors the OpenRouter pattern, but the domain detection logic has the same substring matching issue noted in the get_llm_api_handler review. If the URL parsing improvement is implemented there, it should be applied consistently here as well.
🧹 Nitpick comments (2)
skyvern/services/task_v2_service.py (1)
1664-1698: Consider deep copy for nested structures. The truncation functions work well for their purpose, but entry.copy() at line 1669 creates a shallow copy. If extracted_data contains nested dictionaries or lists, modifications could affect the original entries. Consider using copy.deepcopy() if nested structures are common:

```diff
+import copy
+
 def _truncate_extracted_data(
     entry: dict,
     max_extracted_data_size: int = 1000,
 ) -> dict:
     """Truncate extracted_data field in a single entry to reduce payload size."""
-    entry_copy = entry.copy()
+    entry_copy = copy.deepcopy(entry)
     if "extracted_data" in entry_copy:
         data_str = str(entry_copy["extracted_data"])
         if len(data_str) > max_extracted_data_size:
             entry_copy["extracted_data"] = data_str[:max_extracted_data_size] + "... [truncated]"
     return entry_copy
```

skyvern/forge/sdk/api/llm/api_handler_factory.py (1)
606-613: Consider more robust URL parsing for domain detection. The current implementation uses a simple substring check (settings.OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN in settings.OPENAI_COMPATIBLE_API_BASE). This could match unintended URLs like https://not-githubcopilot.com.attacker.com/api. Consider using proper URL parsing for more precise domain matching:

```diff
+from urllib.parse import urlparse
+
 # For GitHub Copilot via OPENAI_COMPATIBLE, use LLMCaller for custom header support
+def _is_copilot_endpoint() -> bool:
+    if not settings.OPENAI_COMPATIBLE_API_BASE:
+        return False
+    try:
+        parsed = urlparse(settings.OPENAI_COMPATIBLE_API_BASE)
+        hostname = parsed.hostname or ""
+        return settings.OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN in hostname
+    except Exception:
+        return False
+
 if (
     llm_key == "OPENAI_COMPATIBLE"
-    and settings.OPENAI_COMPATIBLE_API_BASE
-    and settings.OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN in settings.OPENAI_COMPATIBLE_API_BASE
+    and _is_copilot_endpoint()
 ):
     llm_caller = LLMCaller(llm_key=llm_key, base_parameters=base_parameters)
     return llm_caller.call
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)

- skyvern/config.py (1 hunks)
- skyvern/forge/sdk/api/llm/api_handler_factory.py (11 hunks)
- skyvern/forge/sdk/api/llm/ui_tars_llm_caller.py (3 hunks)
- skyvern/forge/sdk/api/llm/utils.py (1 hunks)
- skyvern/services/task_v2_service.py (6 hunks)
- skyvern/utils/env_paths.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
skyvern/services/task_v2_service.py (2)
- skyvern/forge/sdk/api/files.py (1): get_skyvern_temp_dir (320-323)
- skyvern/utils/prompt_engine.py (1): load_prompt_with_elements (40-85)

skyvern/forge/sdk/api/llm/ui_tars_llm_caller.py (1)
- skyvern/forge/sdk/api/llm/utils.py (1): is_image_message (18-24)
🪛 Ruff (0.14.5)
skyvern/services/task_v2_service.py
712-712: Do not catch blind exception: Exception
(BLE001)
skyvern/forge/sdk/api/llm/api_handler_factory.py
251-251: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: CodeQL analysis (python)
- GitHub Check: Agent
🔇 Additional comments (14)
skyvern/utils/env_paths.py (1)
13-13: LGTM! Clean simplification. The removal of the intermediate env_path variable makes the code more concise without affecting functionality.

skyvern/services/task_v2_service.py (4)
4-4: LGTM! Necessary imports for screenshot streaming. The Path and get_skyvern_temp_dir imports are properly used in the new screenshot persistence logic at lines 700-701. Also applies to: 20-20.
692-692: LGTM! Payload optimization for LLM API limits. The max_screenshot_number=3 parameter appropriately limits screenshots to reduce payload size, which aligns with the PR's goal to support APIs with strict limits like GitHub Copilot.
697-718: LGTM! Appropriate error handling for non-critical operation. The screenshot streaming logic properly:

- Uses the temp directory utility for path construction
- Creates directories safely with parents=True, exist_ok=True
- Handles exceptions gracefully with warning logs
The broad exception catch at line 712 is appropriate here since screenshot streaming is a non-critical enhancement that should not fail the main task execution flow.
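A sketch of the approved pattern (the function and variable names here are assumptions, not the PR's literal code; get_skyvern_temp_dir is the real helper referenced in the code graph above):

```python
from pathlib import Path

import structlog

from skyvern.forge.sdk.api.files import get_skyvern_temp_dir

LOG = structlog.get_logger()


def save_streaming_screenshot(task_id: str, screenshot: bytes) -> None:
    """Best-effort persistence of the first screenshot for streaming consumers."""
    try:
        screenshot_dir = Path(get_skyvern_temp_dir()) / task_id
        screenshot_dir.mkdir(parents=True, exist_ok=True)  # safe if it already exists
        (screenshot_dir / "screenshot.png").write_bytes(screenshot)
    except Exception:
        # Non-critical: log a warning and continue rather than failing the task
        LOG.warning("Failed to save streaming screenshot", task_id=task_id, exc_info=True)
```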
726-734: LGTM! Appropriate truncation strategy for different contexts. The truncation is applied thoughtfully:

- Line 727: More aggressive truncation (max_entries=3) for iteration prompts to keep payload lean during active execution
- Line 1787: Less aggressive truncation (max_entries=5) for final summarization where more context is beneficial
- Line 995: Correctly reuses the truncated history from the completion check context
This differentiated approach balances payload size with context quality based on the use case.
Also applies to: 1786-1791, 995-995
skyvern/config.py (1)
204-204: LGTM! Clear configuration for Copilot domain detection. The new OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN setting follows the existing naming convention and provides a clean way to identify GitHub Copilot endpoints. The default value is appropriate for domain matching.
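For reference, the setting is presumably a single pydantic-settings field (a sketch; the exact default value is an assumption based on the PR title's githubcopilot.com):

```python
# skyvern/config.py (sketch, not the verbatim line)
OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN: str = "githubcopilot.com"
```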
skyvern/forge/sdk/api/llm/utils.py (1)

18-24: LGTM! Clean utility for image message detection. The is_image_message function provides a centralized, reusable way to detect image-containing messages. The implementation uses appropriate defensive checks and follows the OpenAI message format structure.
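The utility spans only lines 18-24 of utils.py. Assuming the standard OpenAI content-parts message format, it plausibly reads like this (a sketch, not the verbatim implementation):

```python
def is_image_message(message: dict) -> bool:
    """Return True if an OpenAI-format chat message carries an image_url content part."""
    content = message.get("content")
    if not isinstance(content, list):
        return False  # plain string content cannot contain an image part
    return any(isinstance(part, dict) and part.get("type") == "image_url" for part in content)
```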
skyvern/forge/sdk/api/llm/ui_tars_llm_caller.py (1)

24-24: LGTM! Good refactoring to eliminate duplication. Replacing the local _is_image_message implementation with the shared is_image_message utility from utils.py reduces code duplication and centralizes image message detection logic. Also applies to: 108-108, 125-125.
skyvern/forge/sdk/api/llm/api_handler_factory.py (6)
29-34: LGTM! Proper import for image detection utility. The is_image_message import is appropriately added alongside existing utilities and is used correctly in the Copilot header logic at line 1248.
81-107: LGTM! Excellent refactoring to centralize token extraction. The _extract_token_counts helper consolidates token parsing logic that was previously duplicated across multiple locations. The implementation:

- Uses defensive programming with getattr and default values
- Handles multiple token detail formats (completion_tokens_details, prompt_tokens_details)
- Provides fallback for cached tokens (cache_read_input_tokens)
- Returns a clear tuple of 4 values
This significantly improves maintainability.
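Based on the bullets above, the helper plausibly looks like this (a sketch; the exact attribute traversal and tuple order are assumptions drawn from this review, not the PR's verbatim code):

```python
def _extract_token_counts(response) -> tuple[int, int, int, int]:
    """Sketch: extract (input, output, reasoning, cached) token counts from a LiteLLM response."""
    usage = getattr(response, "usage", None)
    input_tokens = getattr(usage, "prompt_tokens", 0) or 0
    output_tokens = getattr(usage, "completion_tokens", 0) or 0

    # Reasoning tokens live under completion_tokens_details on reasoning models
    completion_details = getattr(usage, "completion_tokens_details", None)
    reasoning_tokens = getattr(completion_details, "reasoning_tokens", 0) or 0

    # Cached tokens: OpenAI-style prompt_tokens_details, with an
    # Anthropic-style cache_read_input_tokens fallback
    prompt_details = getattr(usage, "prompt_tokens_details", None)
    cached_tokens = getattr(prompt_details, "cached_tokens", 0) or 0
    if not cached_tokens:
        cached_tokens = getattr(usage, "cache_read_input_tokens", 0) or 0

    return input_tokens, output_tokens, reasoning_tokens, cached_tokens
```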
235-253: LGTM! Defensive fallback handling is appropriate. The _get_model_label and _get_check_model helpers properly centralize model name extraction for capability checks. The broad exception catch at line 251 is intentional and appropriate: it provides a safe fallback when extracting the model name from router configurations, ensuring the code doesn't fail on edge cases during capability detection.
496-498: LGTM! Consistent use of centralized token extraction. The _extract_token_counts helper is used consistently in both handler paths, properly unpacking the tuple into the four token count variables. This eliminates code duplication and improves maintainability. Also applies to: 815-817.
1392-1404: LGTM! Proper handling of OPENAI_COMPATIBLE token tracking. The changes correctly:

- Extract token counts for OPENAI_COMPATIBLE providers even though cost calculation is deferred
- Use the centralized _extract_token_counts helper consistently
- Document the pending cost calculation with a TODO and link to pricing information

The token tracking will provide valuable metrics while cost calculation can be implemented later. Also applies to: 1436-1438.
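The cost-deferral pattern described above could be as simple as the following sketch (litellm.completion_cost is the real LiteLLM pricing helper; the surrounding variable names are assumptions):

```python
import litellm

input_tokens, output_tokens, reasoning_tokens, cached_tokens = _extract_token_counts(response)

llm_cost = 0.0
if llm_key != "OPENAI_COMPATIBLE":
    # LiteLLM has no pricing table for arbitrary OpenAI-compatible endpoints,
    # so cost calculation is skipped for them (tracked as a TODO in the PR)
    llm_cost = litellm.completion_cost(completion_response=response)
```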
1242-1252: Header name verified as correct. The header name Copilot-Vision-Request is confirmed correct per GitHub Copilot API documentation. The HTTP header is case-insensitive, and its value should be set to "true" to enable vision/image support. The code correctly implements this by setting the header when images are detected. The domain detection substring matching concern noted in earlier review feedback remains a valid consideration but is separate from this header name verification.
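In code, the approved pattern is roughly the following (a sketch; call_copilot and the surrounding variable names are hypothetical, but litellm.acompletion does accept an extra_headers argument):

```python
import litellm

from skyvern.forge.sdk.api.llm.utils import is_image_message


async def call_copilot(model_name: str, messages: list[dict]):
    extra_headers: dict[str, str] = {}
    # GitHub Copilot only accepts image content when this header is present
    if any(is_image_message(message) for message in messages):
        extra_headers["Copilot-Vision-Request"] = "true"
    return await litellm.acompletion(
        model=model_name,
        messages=messages,
        extra_headers=extra_headers,
    )
```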
Ticket
SKV-3992
Description
This PR adds support for the GitHub Copilot API as an OpenAI-compatible LLM provider. The implementation includes refactoring the LLM API handler factory to accurately detect and handle GitHub Copilot endpoints, extract token counts, and apply suitable reasoning optimizations. Additionally, task history truncation was implemented to prevent payload size errors when sending large contexts to LLM APIs with strict limits.
How Has This Been Tested?
Pending...
Artifacts (if appropriate):
2025-11-13.12-05-30.mp4
🔧 This PR adds support for GitHub Copilot as an OpenAI-compatible LLM provider and implements task history truncation to prevent payload size errors. The changes include configuration updates for GitHub Copilot domain detection and optimization measures to reduce request sizes when interacting with LLM APIs that have strict payload limits.
🔍 Detailed Analysis
Key Changes
- Added the GITHUB_COPILOT_DOMAIN setting to identify GitHub Copilot endpoints as OpenAI-compatible
- Added the _truncate_task_history_for_completion_check() function to limit task history entries and truncate large extracted data fields
- Refactored env_paths.py to simplify path resolution logic

Technical Implementation

```mermaid
flowchart TD
    A[Task V2 Execution] --> B[Scrape Page with Limited Screenshots]
    B --> C[Save Streaming Screenshot]
    C --> D[Truncate Task History]
    D --> E{History > Max Entries?}
    E -->|Yes| F[Keep Last N Entries]
    E -->|No| G[Truncate Large Data Fields]
    F --> H[Truncate Extracted Data]
    G --> H
    H --> I[Send to LLM API]
    I --> J[GitHub Copilot Compatible Processing]
```

Impact
Created with Palmier
Important
Adds support for GitHub Copilot as an OpenAI-compatible LLM provider and implements task history truncation to prevent payload size errors.
- GitHub Copilot support via OPENAI_COMPATIBLE in api_handler_factory.py.
- Task history truncation in task_v2_service.py to prevent payload size errors.
- New OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN setting in config.py to identify GitHub Copilot endpoints.
- New _truncate_task_history_for_completion_check() in task_v2_service.py to limit task history entries.
- New _extract_token_counts() in api_handler_factory.py to extract token counts from responses.
- Refactored env_paths.py to simplify path resolution logic.

This description was created by
for e3f9bbd. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit
New Features
Improvements