
Conversation


@AronPerez AronPerez commented Nov 13, 2025

Ticket

SKV-3992

Description

This PR adds support for the GitHub Copilot API as an OpenAI-compatible LLM provider. The implementation includes refactoring the LLM API handler factory to accurately detect and handle GitHub Copilot endpoints, extract token counts, and apply suitable reasoning optimizations. Additionally, task history truncation was implemented to prevent payload size errors when sending large contexts to LLM APIs with strict limits.

How Has This Been Tested?

Pending...

Artifacts (if appropriate):

2025-11-13.12-05-30.mp4

🔧 This PR adds support for GitHub Copilot as an OpenAI-compatible LLM provider and implements task history truncation to prevent payload size errors. The changes include configuration updates for GitHub Copilot domain detection and optimization measures to reduce request sizes when interacting with LLM APIs that have strict payload limits.

🔍 Detailed Analysis

Key Changes

  • Configuration: Added GITHUB_COPILOT_DOMAIN setting to identify GitHub Copilot endpoints as OpenAI-compatible
  • Task History Optimization: Implemented _truncate_task_history_for_completion_check() function to limit task history entries and truncate large extracted data fields (see the sketch after this list)
  • Screenshot Management: Added streaming screenshot saving functionality and limited screenshots to 3 per scrape to reduce payload size
  • Code Cleanup: Minor refactoring in env_paths.py to simplify path resolution logic
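
For reference, here is a minimal sketch of the two truncation helpers, using the defaults discussed in review (max_entries=5, max_extracted_data_size=1000; some call sites pass max_entries=3). The actual implementation in skyvern/services/task_v2_service.py may differ in detail.

def _truncate_extracted_data(entry: dict, max_extracted_data_size: int = 1000) -> dict:
    """Truncate the extracted_data field in a single entry to reduce payload size."""
    entry_copy = entry.copy()
    if "extracted_data" in entry_copy:
        data_str = str(entry_copy["extracted_data"])
        if len(data_str) > max_extracted_data_size:
            entry_copy["extracted_data"] = data_str[:max_extracted_data_size] + "... [truncated]"
    return entry_copy

def _truncate_task_history_for_completion_check(
    task_history: list[dict],
    max_entries: int = 5,
    max_extracted_data_size: int = 1000,
) -> list[dict]:
    """Keep only the last max_entries entries and cap extracted_data per entry."""
    recent = task_history[-max_entries:] if len(task_history) > max_entries else task_history
    return [_truncate_extracted_data(entry, max_extracted_data_size) for entry in recent]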

Technical Implementation

flowchart TD
    A[Task V2 Execution] --> B[Scrape Page with Limited Screenshots]
    B --> C[Save Streaming Screenshot]
    C --> D[Truncate Task History]
    D --> E{History > Max Entries?}
    E -->|Yes| F[Keep Last N Entries]
    E -->|No| G[Truncate Large Data Fields]
    F --> H[Truncate Extracted Data]
    G --> H
    H --> I[Send to LLM API]
    I --> J[GitHub Copilot Compatible Processing]

Impact

  • API Compatibility: Enables GitHub Copilot integration as an OpenAI-compatible provider, expanding LLM options
  • Payload Optimization: Prevents "Request Entity Too Large" errors by intelligently truncating task history and limiting screenshot count
  • Performance Improvement: Reduces memory usage and network overhead by limiting data sent to LLM APIs while maintaining context relevance
  • Streaming Support: Enhanced real-time monitoring capabilities through saved streaming screenshots

Created with Palmier


Important

Adds support for GitHub Copilot as an OpenAI-compatible LLM provider and implements task history truncation to prevent payload size errors.

  • Behavior:
    • Adds support for GitHub Copilot as an OpenAI-compatible LLM provider in api_handler_factory.py.
    • Implements task history truncation in task_v2_service.py to prevent payload size errors.
  • Configuration:
    • Adds OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN setting in config.py to identify GitHub Copilot endpoints.
  • Functions:
    • Adds _truncate_task_history_for_completion_check() in task_v2_service.py to limit task history entries.
    • Adds _extract_token_counts() in api_handler_factory.py to extract token counts from responses.
  • Misc:
    • Minor refactoring in env_paths.py to simplify path resolution logic.

This description was created by Ellipsis for e3f9bbd.

Summary by CodeRabbit

  • New Features

    • Added support for GitHub Copilot as an LLM provider with image handling capabilities.
    • Enabled screenshot persistence for improved task debugging and tracking.
  • Improvements

    • Optimized payload size by truncating task history during processing.
    • Enhanced token counting and statistics reporting across LLM calls.



coderabbitai bot commented Nov 13, 2025

Walkthrough

The changes enhance GitHub Copilot integration within the LLM API handler, extract token-counting logic into reusable methods, introduce a shared image-message detection utility, implement task history truncation for payload optimization, and add screenshot persistence to task execution.

Changes

Cohort / File(s) Summary
GitHub Copilot Configuration
skyvern/config.py
Added new public configuration field OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN with default value "githubcopilot.com" to Settings class.
LLM API Handler Refactoring
skyvern/forge/sdk/api/llm/api_handler_factory.py
Added three new static helper methods (_extract_token_counts, _get_model_label, _get_check_model) to centralize token extraction and model selection logic. Enhanced GitHub Copilot (OPENAI_COMPATIBLE) support with Copilot-Vision-Request header insertion for image-containing messages. Refactored multiple token-parsing code paths to use centralized extraction.
Image Message Detection Utility
skyvern/forge/sdk/api/llm/utils.py
Added new helper function is_image_message to detect user messages containing image_url content type (sketched just after this table).
Image Message Utility Adoption
skyvern/forge/sdk/api/llm/ui_tars_llm_caller.py
Replaced private _is_image_message function with centralized is_image_message utility from utils module.
Task History Optimization
skyvern/services/task_v2_service.py
Added streaming screenshot persistence to temp directory and implemented task history truncation via new helper functions (_truncate_extracted_data, _truncate_task_history_for_completion_check). Applied truncation to completion checks, prompt generation, and summarization calls. Added screenshot limit (max_screenshot_number=3) to reduce payload size.
Minor Refactor
skyvern/utils/env_paths.py
Simplified resolve_backend_env_path by removing intermediate variable.
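
As promised above, a minimal sketch of the is_image_message utility, assuming OpenAI-style message dicts; the real function in skyvern/forge/sdk/api/llm/utils.py may differ in detail.

def is_image_message(message: dict) -> bool:
    """Return True if a user message contains image_url content."""
    if message.get("role") != "user":
        return False
    content = message.get("content")
    if not isinstance(content, list):
        return False
    # OpenAI multi-part messages carry images as {"type": "image_url", ...} parts.
    return any(isinstance(part, dict) and part.get("type") == "image_url" for part in content)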

Sequence Diagrams

sequenceDiagram
    participant Handler as LLM API Handler
    participant Config as GitHub Copilot Config
    participant Caller as LLMCaller
    participant API as OpenAI API

    Handler->>Handler: Check if OPENAI_COMPATIBLE
    Handler->>Config: Read OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN
    Handler->>Caller: Initialize with Copilot domain
    
    Note over Caller: Message contains images?
    alt Has Images
        Caller->>Caller: Add Copilot-Vision-Request header
    end
    
    Caller->>API: Send request with headers
    API-->>Caller: Response with token usage
    Caller->>Handler: Extract token counts
    Handler->>Handler: Parse input, output, reasoning, cached tokens
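The vision-header step in the diagram above, as a hedged sketch: build_copilot_headers is a hypothetical name, and the exact wiring inside LLMCaller may differ; only the header name and value are confirmed by this PR.

from skyvern.forge.sdk.api.llm.utils import is_image_message

def build_copilot_headers(messages: list[dict]) -> dict[str, str]:
    """Hypothetical helper: extra request headers for GitHub Copilot endpoints."""
    headers: dict[str, str] = {}
    # GitHub Copilot requires this header on requests that include images.
    if any(is_image_message(message) for message in messages):
        headers["Copilot-Vision-Request"] = "true"
    return headers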
sequenceDiagram
    participant Service as Task V2 Service
    participant Truncate as Truncation Helper
    participant LLM as LLM Handler

    Service->>Service: Scrape page
    Service->>Service: Save first screenshot to temp dir
    
    Note over Service: Build completion check prompt
    Service->>Truncate: _truncate_task_history_for_completion_check
    Truncate->>Truncate: Select last N entries
    Truncate->>Truncate: _truncate_extracted_data per entry
    Truncate-->>Service: Truncated history
    
    Service->>LLM: Send with truncated task_history
    LLM-->>Service: Completion check result

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20-25 minutes

  • Extra attention areas:
    • Verify Copilot-Vision-Request header insertion logic in api_handler_factory.py handles all message types and edge cases correctly
    • Validate token extraction abstraction (_extract_token_counts) against different response formats and models
    • Confirm task history truncation thresholds (max_entries=5, max_extracted_data_size=1000) are appropriate for typical payloads
    • Ensure screenshot persistence in task_v2_service.py properly manages temp directory lifecycle

Possibly related PRs

  • chore: env path refactor #3691: Modifies the same Settings class in skyvern/config.py (this PR adds OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN; that PR changes model_config for resolved env files), creating a direct code-level dependency.

Suggested reviewers

  • wintonzheng

🐰 Hops with glee

A Copilot takes flight with vision clear,
Token counts distilled, now drawing near,
Images detected, truncation refined—
Screenshots saved, no payload left behind! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 58.62%, which is below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title directly addresses the main objective—adding GitHub Copilot API support via the OPENAI_COMPATIBLE configuration group. It accurately reflects the primary change across the changeset.


@AronPerez force-pushed the feat/SKV-3992/add-support-for-github-copilot-api branch 3 times, most recently from c4baa45 to 3e96487 on November 14, 2025 04:31
@AronPerez force-pushed the feat/SKV-3992/add-support-for-github-copilot-api branch from 3e96487 to 77242ce on November 23, 2025 16:46
…rFactory._get_check_model calls back to _apply_thinking_budget_optimization
    model_label = LLMAPIHandlerFactory._get_model_label(llm_config)
    try:
        check_model = llm_config.model_list[0].model_name or model_label  # type: ignore[attr-defined]
    except Exception:
        check_model = model_label
@AronPerez (Author) commented:
Used in 4 other places
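
A hedged reconstruction of the helper this thread discusses, built from the diff context above; the actual static method on LLMAPIHandlerFactory may differ.

from typing import Any

def _get_check_model(llm_config: Any) -> str:
    """Sketch of LLMAPIHandlerFactory._get_check_model, shown standalone."""
    model_label = LLMAPIHandlerFactory._get_model_label(llm_config)
    try:
        # Router-style configs expose model_list; fall back to the plain label.
        check_model = llm_config.model_list[0].model_name or model_label  # type: ignore[attr-defined]
    except Exception:
        check_model = model_label
    return check_model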

            self.openai_client = AsyncOpenAI(api_key=settings.OPENROUTER_API_KEY, base_url=settings.OPENROUTER_API_BASE)
        elif (
            self.llm_key == "OPENAI_COMPATIBLE"
            and settings.OPENAI_COMPATIBLE_API_BASE
@AronPerez (Author) commented:
Saw the # OPENAI COMPATIBLE comment and other content related to it there, so added the base URL there

        return _normalize(left) == _normalize(right)

    @staticmethod
    def _extract_token_counts(response: ModelResponse | CustomStreamWrapper) -> tuple[int, int, int, int]:
@AronPerez (Author) commented:
Saw dupe logic and this as a quick win

@AronPerez AronPerez marked this pull request as ready for review November 23, 2025 17:54
Copilot AI review requested due to automatic review settings November 23, 2025 17:54
@ellipsis-dev ellipsis-dev bot (Contributor) left a comment

Important

Looks good to me! 👍

Reviewed everything up to e3f9bbd in 1 minute and 34 seconds.
  • Reviewed 608 lines of code in 6 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
1. skyvern/utils/env_paths.py:9
  • Draft comment:
    resolve_backend_env_path is straightforward. Consider caching its returned Path if it’s used frequently, as the working directory is unlikely to change.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
2. skyvern/utils/env_paths.py:16
  • Draft comment:
    resolve_frontend_env_path uses multiple fallbacks to locate the frontend directory. Consider adding logging when no directory is found and caching the result to avoid repeated file system traversal.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
3. skyvern/forge/sdk/api/llm/api_handler_factory.py:26
  • Draft comment:
    Typo: There is an extraneous closing parenthesis on this line that should be removed.
  • Reason this comment was not posted:
    Comment was on unchanged code.
4. skyvern/forge/sdk/api/llm/api_handler_factory.py:118
  • Draft comment:
    Typographical inconsistency: the comment uses "litellm" while the docstring refers to "LiteLLM". Please use consistent capitalization.
  • Reason this comment was not posted:
    Comment was on unchanged code.
5. skyvern/services/task_v2_service.py:17
  • Draft comment:
    It seems that line 17 contains just a stray ) which doesn't look intentional. Please verify if this is a typographical error or if additional code was intended on this line.
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_zTAj8PHOc3w0zLCK


Copilot finished reviewing on behalf of AronPerez November 23, 2025 17:57
Copilot AI (Contributor) left a comment

Pull request overview

This PR adds support for GitHub Copilot as an OpenAI-compatible LLM provider and implements task history truncation to prevent payload size errors when interacting with LLM APIs that have strict payload limits.

Key Changes

  • GitHub Copilot Integration: Added detection and handling for GitHub Copilot endpoints via OPENAI_COMPATIBLE configuration, including special header support (Copilot-Vision-Request: true) for vision-enabled requests
  • Payload Optimization: Implemented task history truncation (limiting entries and extracted data size) and reduced screenshot count to 3 to prevent "Request Entity Too Large" errors
  • Code Refactoring: Extracted duplicate token counting logic into reusable methods and consolidated image message detection into a shared utility function

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
skyvern/config.py Added OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN configuration setting for identifying GitHub Copilot endpoints
skyvern/utils/env_paths.py Simplified resolve_backend_env_path by removing unnecessary intermediate variable
skyvern/forge/sdk/api/llm/utils.py Introduced is_image_message() utility function to detect image messages in LLM requests
skyvern/forge/sdk/api/llm/ui_tars_llm_caller.py Refactored to use shared is_image_message() utility instead of local implementation
skyvern/forge/sdk/api/llm/api_handler_factory.py Major changes: extracted _extract_token_counts() helper, added _get_model_label() and _get_check_model() helpers, implemented GitHub Copilot detection and custom header logic, updated cost calculation to skip OPENAI_COMPATIBLE providers
skyvern/services/task_v2_service.py Implemented _truncate_task_history_for_completion_check() and _truncate_extracted_data() functions, limited screenshots to 3, added streaming screenshot save functionality, applied truncation at key points to reduce LLM payload size


    max_extracted_data_size: int = 1000,
) -> dict:
    """Truncate extracted_data field in a single entry to reduce payload size."""
    entry_copy = entry.copy()
Copilot AI commented Nov 23, 2025

The _truncate_extracted_data function performs a shallow copy of the entry dictionary, which could be problematic if extracted_data contains nested mutable objects (like lists or dicts). If the entry is later modified elsewhere, the truncated version could inadvertently share mutable references with the original.

Consider using copy.deepcopy() or explicitly creating new nested structures to ensure complete isolation:

import copy

def _truncate_extracted_data(
    entry: dict,
    max_extracted_data_size: int = 1000,
) -> dict:
    """Truncate extracted_data field in a single entry to reduce payload size."""
    entry_copy = copy.deepcopy(entry)
    if "extracted_data" in entry_copy:
        data_str = str(entry_copy["extracted_data"])
        if len(data_str) > max_extracted_data_size:
            entry_copy["extracted_data"] = data_str[:max_extracted_data_size] + "... [truncated]"
    return entry_copy

This ensures that modifications to the truncated history won't affect the original task_history list.

    current_url = current_url if current_url else str(await SkyvernFrame.get_url(frame=page) if page else url)

    # Truncate task_history to reduce payload size (keep last 3 entries for context)
    truncated_task_history = _truncate_task_history_for_completion_check(task_history, max_entries=3)
Copilot AI commented Nov 23, 2025

[nitpick] The hardcoded value max_entries=3 for task history truncation during completion checks is lower than the default value of 5 used elsewhere (e.g., line 1787). This inconsistency could lead to confusion and maintenance issues.

Consider either:

  1. Using the same default value consistently across all invocations, or
  2. Documenting why different max_entries values are needed in different contexts (e.g., completion checks need fewer entries than summarization)

If the different values are intentional, consider extracting them as named constants to make the rationale clearer:

MAX_ENTRIES_FOR_COMPLETION_CHECK = 3
MAX_ENTRIES_FOR_SUMMARIZATION = 5

Comment on lines +607 to +611

    if (
        llm_key == "OPENAI_COMPATIBLE"
        and settings.OPENAI_COMPATIBLE_API_BASE
        and settings.OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN in settings.OPENAI_COMPATIBLE_API_BASE
    ):
Copilot AI commented Nov 23, 2025

The GitHub Copilot detection logic is duplicated in multiple places (lines 607-611 and lines 959-963, 1243-1246). This violates the DRY principle and makes maintenance harder - if the detection logic needs to change, it must be updated in all three places.

Consider extracting this into a helper method:

@staticmethod
def _is_github_copilot_endpoint() -> bool:
    """Check if the OPENAI_COMPATIBLE endpoint is GitHub Copilot."""
    return (
        settings.OPENAI_COMPATIBLE_API_BASE is not None
        and settings.OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN in settings.OPENAI_COMPATIBLE_API_BASE
    )

Then use it consistently:

if llm_key == "OPENAI_COMPATIBLE" and LLMAPIHandlerFactory._is_github_copilot_endpoint():
    # GitHub Copilot-specific logic

@AronPerez (Author) replied:
Open to this if people feel it's needed

@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
skyvern/forge/sdk/api/llm/api_handler_factory.py (1)

959-968: Same domain detection concern applies here.

The Copilot endpoint initialization correctly mirrors the OpenRouter pattern, but the domain detection logic has the same substring matching issue noted in the get_llm_api_handler review. If the URL parsing improvement is implemented there, it should be applied consistently here as well.

🧹 Nitpick comments (2)
skyvern/services/task_v2_service.py (1)

1664-1698: Consider deep copy for nested structures.

The truncation functions work well for their purpose, but entry.copy() at line 1669 creates a shallow copy. If extracted_data contains nested dictionaries or lists, modifications could affect the original entries.

Consider using copy.deepcopy() if nested structures are common:

+import copy
+
 def _truncate_extracted_data(
     entry: dict,
     max_extracted_data_size: int = 1000,
 ) -> dict:
     """Truncate extracted_data field in a single entry to reduce payload size."""
-    entry_copy = entry.copy()
+    entry_copy = copy.deepcopy(entry)
     if "extracted_data" in entry_copy:
         data_str = str(entry_copy["extracted_data"])
         if len(data_str) > max_extracted_data_size:
             entry_copy["extracted_data"] = data_str[:max_extracted_data_size] + "... [truncated]"
     return entry_copy
skyvern/forge/sdk/api/llm/api_handler_factory.py (1)

606-613: Consider more robust URL parsing for domain detection.

The current implementation uses a simple substring check (settings.OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN in settings.OPENAI_COMPATIBLE_API_BASE). This could match unintended URLs like https://not-githubcopilot.com.attacker.com/api.

Consider using proper URL parsing for more precise domain matching:

+from urllib.parse import urlparse
+
 # For GitHub Copilot via OPENAI_COMPATIBLE, use LLMCaller for custom header support
+def _is_copilot_endpoint() -> bool:
+    if not settings.OPENAI_COMPATIBLE_API_BASE:
+        return False
+    try:
+        parsed = urlparse(settings.OPENAI_COMPATIBLE_API_BASE)
+        hostname = parsed.hostname or ""
+        return settings.OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN in hostname
+    except Exception:
+        return False
+
 if (
     llm_key == "OPENAI_COMPATIBLE"
-    and settings.OPENAI_COMPATIBLE_API_BASE
-    and settings.OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN in settings.OPENAI_COMPATIBLE_API_BASE
+    and _is_copilot_endpoint()
 ):
     llm_caller = LLMCaller(llm_key=llm_key, base_parameters=base_parameters)
     return llm_caller.call
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2608c02 and e3f9bbd.

📒 Files selected for processing (6)
  • skyvern/config.py (1 hunks)
  • skyvern/forge/sdk/api/llm/api_handler_factory.py (11 hunks)
  • skyvern/forge/sdk/api/llm/ui_tars_llm_caller.py (3 hunks)
  • skyvern/forge/sdk/api/llm/utils.py (1 hunks)
  • skyvern/services/task_v2_service.py (6 hunks)
  • skyvern/utils/env_paths.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
skyvern/services/task_v2_service.py (2)
skyvern/forge/sdk/api/files.py (1)
  • get_skyvern_temp_dir (320-323)
skyvern/utils/prompt_engine.py (1)
  • load_prompt_with_elements (40-85)
skyvern/forge/sdk/api/llm/ui_tars_llm_caller.py (1)
skyvern/forge/sdk/api/llm/utils.py (1)
  • is_image_message (18-24)
🪛 Ruff (0.14.5)
skyvern/services/task_v2_service.py

712-712: Do not catch blind exception: Exception

(BLE001)

skyvern/forge/sdk/api/llm/api_handler_factory.py

251-251: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: CodeQL analysis (python)
  • GitHub Check: Agent
🔇 Additional comments (14)
skyvern/utils/env_paths.py (1)

13-13: LGTM! Clean simplification.

The removal of the intermediate env_path variable makes the code more concise without affecting functionality.

skyvern/services/task_v2_service.py (4)

4-4: LGTM! Necessary imports for screenshot streaming.

The Path and get_skyvern_temp_dir imports are properly used in the new screenshot persistence logic at lines 700-701.

Also applies to: 20-20


692-692: LGTM! Payload optimization for LLM API limits.

The max_screenshot_number=3 parameter appropriately limits screenshots to reduce payload size, which aligns with the PR's goal to support APIs with strict limits like GitHub Copilot.


697-718: LGTM! Appropriate error handling for non-critical operation.

The screenshot streaming logic properly:

  • Uses the temp directory utility for path construction
  • Creates directories safely with parents=True, exist_ok=True
  • Handles exceptions gracefully with warning logs

The broad exception catch at line 712 is appropriate here since screenshot streaming is a non-critical enhancement that should not fail the main task execution flow.


726-734: LGTM! Appropriate truncation strategy for different contexts.

The truncation is applied thoughtfully:

  • Line 727: More aggressive truncation (max_entries=3) for iteration prompts to keep payload lean during active execution
  • Line 1787: Less aggressive truncation (max_entries=5) for final summarization where more context is beneficial
  • Line 995: Correctly reuses the truncated history from the completion check context

This differentiated approach balances payload size with context quality based on the use case.

Also applies to: 1786-1791, 995-995

skyvern/config.py (1)

204-204: LGTM! Clear configuration for Copilot domain detection.

The new OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN setting follows the existing naming convention and provides a clean way to identify GitHub Copilot endpoints. The default value is appropriate for domain matching.
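
A sketch of how the new field would sit on the Settings class, assuming skyvern/config.py uses a pydantic-settings style Settings class as the walkthrough describes; surrounding fields are omitted.

from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    # ... existing OPENAI_COMPATIBLE settings ...
    # Domain used to identify GitHub Copilot endpoints behind OPENAI_COMPATIBLE.
    OPENAI_COMPATIBLE_GITHUB_COPILOT_DOMAIN: str = "githubcopilot.com"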

skyvern/forge/sdk/api/llm/utils.py (1)

18-24: LGTM! Clean utility for image message detection.

The is_image_message function provides a centralized, reusable way to detect image-containing messages. The implementation uses appropriate defensive checks and follows the OpenAI message format structure.

skyvern/forge/sdk/api/llm/ui_tars_llm_caller.py (1)

24-24: LGTM! Good refactoring to eliminate duplication.

Replacing the local _is_image_message implementation with the shared is_image_message utility from utils.py reduces code duplication and centralizes image message detection logic.

Also applies to: 108-108, 125-125

skyvern/forge/sdk/api/llm/api_handler_factory.py (6)

29-34: LGTM! Proper import for image detection utility.

The is_image_message import is appropriately added alongside existing utilities and is used correctly in the Copilot header logic at line 1248.


81-107: LGTM! Excellent refactoring to centralize token extraction.

The _extract_token_counts helper consolidates token parsing logic that was previously duplicated across multiple locations. The implementation:

  • Uses defensive programming with getattr and default values
  • Handles multiple token detail formats (completion_tokens_details, prompt_tokens_details)
  • Provides fallback for cached tokens (cache_read_input_tokens)
  • Returns a clear tuple of 4 values

This significantly improves maintainability.
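
A hedged reconstruction from the signature quoted earlier and the behaviors listed above; the actual static method on LLMAPIHandlerFactory may differ.

from typing import Any

def _extract_token_counts(response: Any) -> tuple[int, int, int, int]:
    """Sketch of LLMAPIHandlerFactory._extract_token_counts, shown standalone."""
    usage = getattr(response, "usage", None)
    input_tokens = int(getattr(usage, "prompt_tokens", 0) or 0)
    output_tokens = int(getattr(usage, "completion_tokens", 0) or 0)
    # Reasoning tokens, when the provider reports completion token details.
    details = getattr(usage, "completion_tokens_details", None)
    reasoning_tokens = int(getattr(details, "reasoning_tokens", 0) or 0) if details else 0
    # Cached input tokens, with a cache_read_input_tokens fallback.
    prompt_details = getattr(usage, "prompt_tokens_details", None)
    cached_tokens = int(getattr(prompt_details, "cached_tokens", 0) or 0) if prompt_details else 0
    if not cached_tokens:
        cached_tokens = int(getattr(usage, "cache_read_input_tokens", 0) or 0)
    return input_tokens, output_tokens, reasoning_tokens, cached_tokens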


235-253: LGTM! Defensive fallback handling is appropriate.

The _get_model_label and _get_check_model helpers properly centralize model name extraction for capability checks. The broad exception catch at line 251 is intentional and appropriate - it provides a safe fallback when extracting the model name from router configurations, ensuring the code doesn't fail on edge cases during capability detection.


496-498: LGTM! Consistent use of centralized token extraction.

The _extract_token_counts helper is used consistently in both handler paths, properly unpacking the tuple into the four token count variables. This eliminates code duplication and improves maintainability.

Also applies to: 815-817


1392-1404: LGTM! Proper handling of OPENAI_COMPATIBLE token tracking.

The changes correctly:

  • Extract token counts for OPENAI_COMPATIBLE providers even though cost calculation is deferred
  • Use the centralized _extract_token_counts helper consistently
  • Document the pending cost calculation with a TODO and link to pricing information

The token tracking will provide valuable metrics while cost calculation can be implemented later.

Also applies to: 1436-1438


1242-1252: Header name verified as correct.

The header name Copilot-Vision-Request is confirmed correct per GitHub Copilot API documentation. The HTTP header is case-insensitive, and its value should be set to "true" to enable vision/image support.

The code correctly implements this by setting the header when images are detected. The domain detection substring matching concern noted in the comment remains a valid consideration from earlier review feedback but is separate from this header name verification.
