Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix task v2 data schema #1967

Merged
merged 1 commit into from
Mar 18, 2025
Merged

Conversation

wintonzheng
Copy link
Contributor

@wintonzheng wintonzheng commented Mar 18, 2025

Important

Enhance TaskV2 by adding extracted_information_schema and error_code_mapping support, and fix formatting in task_v2_summary.j2.

  • Behavior:
    • Add extracted_information_schema and error_code_mapping parameters to create_task_v2() in client.py.
    • Update run_task_v2() in agent_protocol.py to pass extracted_information_schema and error_code_mapping.
    • Modify initialize_task_v2() in task_v2_service.py to handle extracted_information_schema and error_code_mapping.
  • Template:
    • Fix formatting of extracted_information_schema in task_v2_summary.j2.

This description was created by Ellipsis for 686a3df. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to 49c4dac in 2 minutes and 40 seconds

More details
  • Looked at 73 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 10 drafted comments based on config settings.
1. skyvern/forge/sdk/services/task_v2_service.py:1437
  • Draft comment:
    Use a context manager for httpx.AsyncClient() to ensure proper connection closure.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. skyvern/forge/sdk/services/task_v2_service.py:1197
  • Draft comment:
    Avoid reseeding the random generator on every call. Consider seeding once at module load.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. skyvern/forge/sdk/services/task_v2_service.py:107
  • Draft comment:
    Consider serializing 'extracted_information_schema' (e.g. with json.dumps) before passing it to create_task_v2. This will ensure the schema is consistently formatted when later used in prompt templates.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
    The comment makes a reasonable suggestion about consistent serialization, but there are several issues: 1. The type hint already allows str, so serialization may not be required 2. We don't see the implementation of create_task_v2() to know if it needs serialized input 3. We don't see where this schema is used in prompt templates to verify the claim 4. The comment is speculative rather than pointing to a concrete issue 5. Without more context, this could introduce bugs if create_task_v2() expects a dict/list
    I may be overly cautious - consistent serialization is generally a good practice and could prevent subtle bugs. The comment provides a concrete suggestion with code.
    While consistent serialization is good, making this change requires more evidence that it's necessary and won't cause issues. The comment is too speculative without showing actual problems.
    Delete the comment. While the suggestion is reasonable, we don't have enough evidence that serialization is needed or safe to make this change. The comment is too speculative.
4. skyvern/forge/sdk/services/task_v2_service.py:1398
  • Draft comment:
    When loading the task_v2_summary prompt, 'extracted_information_schema' is passed directly. Ensure it is converted to a string (using json.dumps if needed) so that the template renders it correctly.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. skyvern/forge/sdk/services/task_v2_service.py:1102
  • Draft comment:
    In _generate_extraction_task, the 'data_schema' is extracted from the LLM response without validation. Consider adding type checks or normalization to ensure the schema matches the expected structure.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. skyvern/forge/sdk/services/task_v2_service.py:100
  • Draft comment:
    The union type for 'extracted_information_schema' (dict | list | str | None) can lead to ambiguity in handling. Consider standardizing its type for clearer usage throughout the task v2 workflow.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
    While the comment raises a valid point about type ambiguity, there are several issues with it: 1) It's not clear what the "correct" type should be 2) The flexibility may be intentional to handle different schema formats 3) The comment doesn't provide concrete evidence of problems caused by this ambiguity 4) It's more of a suggestion than a clear issue requiring change 5) Without understanding the full context of how schemas are used, standardizing could break existing functionality.
    The comment identifies a potential code quality issue around type safety. Having multiple possible types could make error handling and validation more complex.
    While type safety is important, the comment doesn't demonstrate actual problems caused by the current implementation. The flexibility may be a feature rather than a bug, and changing it would require deeper understanding of the schema requirements.
    Delete the comment as it's not actionable enough and doesn't demonstrate clear problems with the current implementation. Type standardization should be driven by concrete requirements rather than general suggestions.
7. skyvern/forge/prompts/skyvern/task_v2_summary.j2:1
  • Draft comment:
    Consider changing 'in the web' to 'on the web' for better clarity and grammatical correctness.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. skyvern/forge/sdk/services/task_v2_service.py:77
  • Draft comment:
    Typo detected: In the description, 'extrat price information' should be 'extract price information'.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. skyvern/forge/sdk/services/task_v2_service.py:187
  • Draft comment:
    Typo in comment: 'update oserver cruise' should likely be 'update observer cruise'.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
10. skyvern/forge/sdk/services/task_v2_service.py:907
  • Draft comment:
    Typo in comment: 'wofklow run' should be corrected to 'workflow run'.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_OS40IdsYEugvbbFM


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@wintonzheng wintonzheng force-pushed the shu/task_v2_data_schema_quick_fix branch from 49c4dac to 686a3df Compare March 18, 2025 22:34
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 686a3df in 1 minute and 20 seconds

More details
  • Looked at 73 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 12 drafted comments based on config settings.
1. skyvern/forge/sdk/services/task_v2_service.py:1196
  • Draft comment:
    Avoid reseeding the global random state with os.urandom in _generate_random_string; use secrets.token_hex or SystemRandom instead to generate secure random strings.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. skyvern/forge/sdk/services/task_v2_service.py:1457
  • Draft comment:
    Use an async context manager for httpx.AsyncClient in send_task_v2_webhook to avoid resource leaks.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. skyvern/forge/sdk/services/task_v2_service.py:1450
  • Draft comment:
    Ensure consistent error handling in webhook sending; consider logging errors at an appropriate level and reusing a client instance if possible.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. skyvern/forge/sdk/services/task_v2_service.py:1430
  • Draft comment:
    Consider using an async context manager for the httpx client in send_task_v2_webhook (L1457-1460) to ensure proper resource cleanup.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. skyvern/forge/sdk/services/task_v2_service.py:1196
  • Draft comment:
    Using random.seed(os.urandom(16)) on every call in _generate_random_string may be overkill and could affect randomness. Consider using the secrets module for secure random strings.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. skyvern/forge/sdk/services/task_v2_service.py:1266
  • Draft comment:
    When calculating task duration, ensure that task_v2.created_at already has correct timezone info before replacing tzinfo with UTC to avoid potential errors.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. skyvern/forge/sdk/services/task_v2_service.py:1330
  • Draft comment:
    The nested type checks in _get_extracted_data_from_block_result (L1330-1375) are complex. Consider refactoring for clarity and ensuring consistency across all output structures.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. skyvern/forge/sdk/services/task_v2_service.py:1399
  • Draft comment:
    In _summarize_task_v2, the extracted_information_schema from task_v2 is passed to the summary prompt. Validate its format before use to avoid unexpected prompt issues.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. skyvern/forge/sdk/services/task_v2_service.py:187
  • Draft comment:
    Typo: In the comment, "update oserver cruise" should likely be "update observer cruise".
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
10. skyvern/forge/sdk/services/task_v2_service.py:77
  • Draft comment:
    Typo: In the description for 'loop_values_key', 'extrat price information' should be 'extract price information'.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
11. skyvern/forge/sdk/services/task_v2_service.py:1339
  • Draft comment:
    Typo: The comment mentions 'output_paremeter_value'; it should be 'output_parameter_value'.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
12. skyvern/forge/sdk/services/task_v2_service.py:908
  • Draft comment:
    Typo: The comment reads 'wofklow run' which should be 'workflow run'.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_dKBvDlLZQU7WfyHP


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@wintonzheng wintonzheng merged commit be8f3eb into main Mar 18, 2025
6 of 7 checks passed
@wintonzheng wintonzheng deleted the shu/task_v2_data_schema_quick_fix branch March 18, 2025 22:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant