
[NTE] BondGift && Some feature #345

Draft
HuiExcalibur wants to merge 7 commits into 1bananachicken:dev from HuiExcalibur:feat/wrap-and-i18n-ocr


HuiExcalibur commented Jun 18, 2026


Pull Request

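Please read the PR guidelines before submitting.

Related Issue

Related #290

Change Summary

Add an automatic gift-giving feature.
Multi-language OCR, instead of writing every language's text into the expected field of the JSON.
Enhance action registration: automatically parse the JSON content and map it onto a dataclass.

Verification

Still under development.

Screenshots / Logs / Notes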

Summary by Sourcery

Introduce infrastructure for configurable custom actions, multi-language OCR, and shared task-level state to support new BondGift and related automation features.

New Features:

  • Add a task_action decorator to register custom actions with automatic JSON-to-dataclass config parsing, retry handling, and unified result normalization.
  • Introduce a TaskSession container to hold per-task, type-isolated state shared across related actions.
  • Add a DynamicOCR custom recognition that reads language-specific expected texts from centralized OCR i18n data and runs OCR accordingly.
  • Centralize multi-language OCR expected texts in a shared I18N_OCR mapping with helper lookup based on current client language.
  • Add a BondGift scroll list utility for scrolling, template matching, and end-of-list detection in scrollable UIs.
  • Introduce a resource overlay utility to let user-provided resources transparently override built-in assets.
  • Add a generic Singleton base and wire up a package-level wrap namespace for exposing new helpers.


sourcery-ai Bot commented Jun 18, 2026

Contributor

Reviewer's Guide

Introduces a reusable wrapping layer for custom actions and task sessions, adds infrastructure for multi-language OCR and resource overlays, and implements shared scrolling and recognition utilities to support new BondGift and related automation features.

Sequence diagram for task_action-wrapped CustomAction execution with retries

sequenceDiagram
    autonumber
    actor Pipeline
    participant AgentServer
    participant MyAction as CustomActionSubclass
    participant CfgParser as _parse_config
    participant ResultNorm as _normalize_result

    Pipeline->>AgentServer: invoke custom_action name
    AgentServer->>MyAction: run(context, argv)
    loop [attempt 0..retries]
        MyAction->>CfgParser: _parse_config(argv.custom_action_param, config_cls)
        CfgParser-->>MyAction: config
        MyAction->>MyAction: original_run(context, config)
        alt [original_run returns]
            break [exit loop]
                MyAction->>ResultNorm: _normalize_result(result)
                ResultNorm-->>MyAction: RunResult
                MyAction-->>AgentServer: RunResult
                AgentServer-->>Pipeline: RunResult
            end
        else [exception raised]
            MyAction->>MyAction: log warning and optional delay
        end
    end
    opt [all attempts failed]
        MyAction-->>AgentServer: RunResult(success=False)
        AgentServer-->>Pipeline: RunResult(success=False)
    end
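The retry loop in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `run_with_retries`, `normalize_result`, and the `RunResult` dataclass are hypothetical stand-ins, config parsing is left out, and treating a `None` return as success is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class RunResult:
    """Hypothetical stand-in for CustomAction.RunResult."""
    success: bool


def normalize_result(result: Any) -> RunResult:
    """Map the loose return values an action might produce onto RunResult."""
    if isinstance(result, RunResult):
        return result
    if result is None:
        # Assumption: an action that returns nothing is treated as successful.
        return RunResult(success=True)
    return RunResult(success=bool(result))


def run_with_retries(
    original_run: Callable[[], Any],
    retries: int = 2,
    delay: float = 0.0,
) -> RunResult:
    """Call original_run up to retries + 1 times, stopping at the first return."""
    for attempt in range(retries + 1):
        try:
            return normalize_result(original_run())
        except Exception as exc:
            # Log the failure and optionally wait before the next attempt.
            print(f"attempt {attempt} failed: {exc!r}")
            if delay and attempt < retries:
                time.sleep(delay)
    # Every attempt raised, so report failure instead of propagating.
    return RunResult(success=False)
```

A callable that raises twice and then returns `True` would yield `RunResult(success=True)` with `retries=2`, while one that always raises yields `RunResult(success=False)`.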

Sequence diagram for DynamicOCR multi-language OCR analyze flow

sequenceDiagram
    autonumber
    actor Pipeline
    participant AgentServer
    participant DynamicOCR
    participant TLFunc as TL
    participant Context

    Pipeline->>AgentServer: invoke custom_recognition "dynamic_ocr"
    AgentServer->>DynamicOCR: analyze(context, argv)
    DynamicOCR->>DynamicOCR: _parse_param(argv.custom_recognition_param)
    DynamicOCR->>TLFunc: TL(node, lang)
    TLFunc-->>DynamicOCR: expected_text or None
    alt [expected_text is None]
        DynamicOCR-->>AgentServer: AnalyzeResult(box=None, detail={})
        AgentServer-->>Pipeline: AnalyzeResult
    else [expected_text found]
        DynamicOCR->>Context: run_recognition_direct(JRecognitionType.OCR, JOCR(roi, expected, threshold), argv.image)
        Context-->>DynamicOCR: result
        alt [result.hit and result.box]
            DynamicOCR-->>AgentServer: AnalyzeResult(box, detail={node, expected})
            AgentServer-->>Pipeline: AnalyzeResult
        else [no hit]
            DynamicOCR-->>AgentServer: AnalyzeResult(box=None, detail={})
            AgentServer-->>Pipeline: AnalyzeResult
        end
    end
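The branching in the diagram above might look like the sketch below. The framework types are replaced with hypothetical stand-ins: `AnalyzeResult` is a local dataclass, `lookup` plays the role of `TL`, `run_ocr` stands in for the `run_recognition_direct` call, and the `node` and `lang` parameter keys are assumptions inferred from the diagram.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Optional, Tuple

Box = Tuple[int, int, int, int]


@dataclass
class AnalyzeResult:
    """Hypothetical stand-in for the framework's AnalyzeResult."""
    box: Optional[Box]
    detail: dict = field(default_factory=dict)


def analyze(
    raw_param: str,
    lookup: Callable[[str, Optional[str]], Optional[str]],
    run_ocr: Callable[[str], Optional[Box]],
) -> AnalyzeResult:
    """Resolve the expected text for a node, then run OCR for that text."""
    param = json.loads(raw_param) if raw_param else {}
    node = param.get("node", "")
    expected = lookup(node, param.get("lang"))
    if expected is None:
        # No i18n entry for this node: report a miss without running OCR.
        return AnalyzeResult(box=None)
    box = run_ocr(expected)
    if box is None:
        return AnalyzeResult(box=None)
    return AnalyzeResult(box=box, detail={"node": node, "expected": expected})
```

Passing the lookup and OCR steps in as callables keeps the control flow testable without a running agent.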

File-Level Changes

Change Details Files
Added a task_action decorator to enhance CustomAction implementations with typed config parsing, retries, and unified result handling, and exposed it via a new wrap package.
  • Implemented _parse_config to convert custom_action_param into a typed dataclass instance with default handling and JSON parsing.
  • Wrapped CustomAction.run with retry logic on exceptions, including backoff delays and structured logging, and normalized various return types into CustomAction.RunResult.
  • Registered decorated actions with AgentServer.custom_action while preserving the original inheritance and pipeline JSON format.
  • Re-exported task_action (with a safe ImportError fallback) and TaskSession from a new agent.wrap package entrypoint.
agent/wrap/task_action.py
agent/wrap/__init__.py
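For the JSON-to-dataclass step described in this row, a minimal sketch could look like the following. `parse_config` and `GiftConfig` are hypothetical names, and silently ignoring unknown keys is an assumption; the PR's `_parse_config` may behave differently.

```python
import json
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Type, TypeVar

T = TypeVar("T")


def parse_config(raw: Any, config_cls: Type[T]) -> T:
    """Build config_cls from a JSON string or dict, keeping dataclass defaults."""
    if not is_dataclass(config_cls):
        raise TypeError(f"{config_cls!r} is not a dataclass")
    if raw is None or raw in ("", b""):
        data = {}  # nothing supplied: every field falls back to its default
    elif isinstance(raw, (str, bytes)):
        data = json.loads(raw)
    else:
        data = raw
    if not isinstance(data, dict):
        raise ValueError("custom_action_param must decode to a JSON object")
    known = {f.name for f in fields(config_cls)}
    # Assumption: unknown keys are dropped rather than rejected.
    return config_cls(**{k: v for k, v in data.items() if k in known})


@dataclass
class GiftConfig:
    """Hypothetical config used only for illustration."""
    target: str = "any"
    max_gifts: int = 5
```

With this sketch, `parse_config('{"max_gifts": 3}', GiftConfig)` keeps the default `target` and overrides only `max_gifts`.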
Introduced TaskSession as a per-task, type-isolated state container to replace ad-hoc module-level globals.
  • Stored state instances keyed by (tasker id, state type) with a global dictionary protected by a threading.Lock for thread safety.
  • Provided TaskSession.of to lazily create and retrieve typed state instances tied to the current Context.tasker.
  • Added cleanup utilities to clear state per tasker, either for a specific state type or for all types, plus a debug-only _size helper.
agent/wrap/task_session.py
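The keyed-dictionary design in this row can be illustrated with a short sketch. It takes the tasker object directly instead of a `Context`, uses `id(tasker)` as the tasker id, and names the cleanup method `clear`; all three are assumptions made for illustration.

```python
import threading
from typing import Any, Dict, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


class TaskSession:
    """Per-tasker, per-type state registry (sketch)."""

    _lock = threading.Lock()
    _states: Dict[Tuple[int, type], Any] = {}

    @classmethod
    def of(cls, tasker: object, state_cls: Type[T]) -> T:
        """Return the state_cls instance for this tasker, creating it on first use."""
        key = (id(tasker), state_cls)
        with cls._lock:
            if key not in cls._states:
                cls._states[key] = state_cls()
            return cls._states[key]

    @classmethod
    def clear(cls, tasker: object, state_cls: Optional[type] = None) -> None:
        """Drop one state type for the tasker, or all of them if state_cls is None."""
        tasker_id = id(tasker)
        with cls._lock:
            for key in [k for k in cls._states if k[0] == tasker_id]:
                if state_cls is None or key[1] is state_cls:
                    del cls._states[key]
```

Keying on both the tasker and the state type means two actions can share one state class within a task while remaining isolated from other tasks and from other state types.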
Centralized multi-language OCR expected text into a single data module and added a dynamic custom recognition that consumes it.
  • Defined I18N_OCR as a mapping from logical node keys to per-language OCR strings/regexes, plus a TL helper that resolves an entry by the current or an explicit language with a zh_cn fallback.
  • Implemented _current_lang to derive the client language from utils.pienv, defaulting safely when unavailable.
  • Created DynamicOCR CustomRecognition that parses custom_recognition_param, resolves expected text via TL, and runs OCR with JOCR/JRecognitionType, returning hit boxes and details.
  • Exposed DynamicOCR from the custom recognition package for automatic discovery/registration.
agent/utils/i18n_ocr_data.py
agent/custom/recognition/dynamic_ocr.py
agent/custom/recognition/__init__.py
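A small sketch of the lookup table and its fallback follows. The entries, the `en_us` language code, and the `current_lang` placeholder are invented for illustration; only the `I18N_OCR` and `TL` names and the `zh_cn` fallback come from the description above.

```python
from typing import Dict, Optional

# Hypothetical entries; the real table lives in agent/utils/i18n_ocr_data.py.
I18N_OCR: Dict[str, Dict[str, str]] = {
    "gift_button": {"zh_cn": "赠送", "en_us": "Gift"},
    "confirm": {"zh_cn": "确认"},
}

FALLBACK_LANG = "zh_cn"


def current_lang() -> str:
    """Placeholder for the client-language lookup; assumed to default to zh_cn."""
    return FALLBACK_LANG


def TL(node: str, lang: Optional[str] = None) -> Optional[str]:
    """Return the expected OCR text for node in lang, falling back to zh_cn."""
    entry = I18N_OCR.get(node)
    if entry is None:
        return None
    return entry.get(lang or current_lang(), entry.get(FALLBACK_LANG))
```

Keeping the per-language strings in one table means a pipeline node only needs to name its logical key, not list every translation in its expected field.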
Added a generic scrollable-list utility to support BondGift and similar features with swipe-based scrolling and template matching.
  • Implemented swipe helpers that construct pipeline overrides for vertical Swipe actions and invoke them via context.run_task.
  • Added screenshot utilities (snap_roi) that use the controller screencap pipeline and slice configured ROIs into numpy arrays.
  • Implemented a visual is_stuck detector using mean absolute pixel differences to detect when the list no longer scrolls.
  • Implemented find_in_list to loop over scrolls, perform cv2.matchTemplate in a given ROI, and stop on match, end-of-list, or iteration cap, returning global coordinates when successful.
agent/custom/action/BondGift/scroll_list.py
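The stop conditions of the scroll loop can be sketched without the imaging libraries. In this illustration frames are plain integer sequences rather than numpy arrays, and the screenshot, template-match, and swipe steps are injected as callables; the parameter names and the default threshold are assumptions, not the PR's values.

```python
from typing import Callable, Optional, Sequence, Tuple

Frame = Sequence[int]   # flattened grayscale pixels, standing in for a numpy ROI
Point = Tuple[int, int]


def is_stuck(before: Frame, after: Frame, threshold: float = 1.0) -> bool:
    """True when the mean absolute pixel difference falls below threshold."""
    if len(before) != len(after) or not before:
        return False
    diff = sum(abs(a - b) for a, b in zip(before, after)) / len(before)
    return diff < threshold


def find_in_list(
    snap: Callable[[], Frame],
    match: Callable[[Frame], Optional[Point]],
    scroll: Callable[[], None],
    roi_origin: Point = (0, 0),
    max_scrolls: int = 10,
) -> Optional[Point]:
    """Scroll until match() hits, the list stops moving, or the cap is reached."""
    frame = snap()
    for scrolls_done in range(max_scrolls + 1):
        hit = match(frame)
        if hit is not None:
            # Convert ROI-local coordinates into screen coordinates.
            return (roi_origin[0] + hit[0], roi_origin[1] + hit[1])
        if scrolls_done == max_scrolls:
            break  # iteration cap reached
        scroll()
        next_frame = snap()
        if is_stuck(frame, next_frame):
            return None  # the view did not change, so this is the end of the list
        frame = next_frame
    return None
```

In a real implementation `match` would wrap a template-matching call on the ROI and `snap` would crop the latest screenshot; the loop structure stays the same.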
Introduced a resource overlay mechanism and a minimal singleton utility to support user-configurable resource overrides.
  • Defined a Singleton base/metaclass pair to provide simple, process-wide singletons.
  • Added ResourceOverlay, which resolves asset paths by preferring config/resource over assets/resource/base and provides a user_path helper that ensures parent directories exist.
  • Instantiated a module-level res object for convenient access to overlay resolution throughout the agent.
agent/utils/singleton.py
agent/utils/resource_overlay.py
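The overlay lookup described in this row could be sketched as below. The `SingletonMeta` name, the `resolve` method, and the `root` constructor argument are hypothetical; the `config/resource` and `assets/resource/base` directories and the `user_path` helper come from the description above.

```python
from pathlib import Path


class SingletonMeta(type):
    """Metaclass that hands back one shared instance per class."""

    _instances: dict = {}

    def __call__(cls, *args, **kwargs):
        if cls not in SingletonMeta._instances:
            SingletonMeta._instances[cls] = super().__call__(*args, **kwargs)
        return SingletonMeta._instances[cls]


class ResourceOverlay(metaclass=SingletonMeta):
    """Resolve resource paths, preferring user files over built-in ones."""

    def __init__(self, root: Path = Path(".")) -> None:
        self.user_dir = root / "config" / "resource"
        self.base_dir = root / "assets" / "resource" / "base"

    def resolve(self, relative: str) -> Path:
        """Return the user override if it exists, else the built-in path."""
        candidate = self.user_dir / relative
        return candidate if candidate.exists() else self.base_dir / relative

    def user_path(self, relative: str) -> Path:
        """Return the user-side path, creating its parent directories."""
        path = self.user_dir / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        return path
```

Because the class is a singleton in this sketch, constructor arguments only take effect on the first instantiation; later calls return the same object.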


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.


@Hollow-YK

Collaborator

You're welcome to join the developer group 1092630280 to discuss this.
