feat: add lip-sync feature / add danmaku saving and replay #23

Merged: ChangingSelf merged 7 commits into Mai-with-u:dev from tcmofashi:dev on Jun 18, 2025
tcmofashi (Collaborator) commented Jun 16, 2025

Summary by Sourcery

Add file-based saving and replay for Bilibili danmaku plugin, plus skip-initial-danmaku and pure file replay modes; implement real-time lip-sync in VTube Studio plugin with audio analysis and integrate lip-sync start/stop into TTS and GPT-SoVITS plugins.

New Features:

  • Save incoming Bilibili danmaku to JSONL files and replay them later, including a pure offline replay mode
  • Option to skip initial batch of danmaku so only new messages are processed
  • Add real-time lip synchronization in VTube Studio plugin with vowel detection, smoothing, and playback-time alignment
  • Integrate lip-sync start/stop and streaming audio analysis into TTS and GPT-SoVITS plugins to drive VTube Studio mouth animations

Enhancements:

  • Centralize and validate plugin configuration with stricter type checks, defaults, and explicit config loading
  • Create a dedicated data directory for danmaku files and adapt WebDriver setup to file-only mode
  • Enhance logging across plugins to surface file modes, initial-load status, and audio analysis steps

Documentation:

  • Update README files for bili_danmaku_selenium and vtube_studio to cover new saving/loading settings and lip-sync parameters
  • Add requirements.txt for VTube Studio plugin to declare audio-analysis dependencies
  • Revise config-template.toml files to include lip-sync and danmaku file handling options

sourcery-ai Bot commented Jun 16, 2025

Reviewer's Guide

This PR refactors and extends plugins to support lip-sync across VTube Studio and TTS pipelines, and enriches the BiliDanmakuSeleniumPlugin with file-based danmaku saving and replay, improved configuration handling, and robust WebDriver management.

Sequence Diagram: Lip Sync Audio Processing for VTube Studio

sequenceDiagram
    participant User as End User
    participant TTSPlugin as TTS Plugin (e.g., EdgeTTS, GPT-SoVITS)
    participant AmaidesuCore as Amaidesu Core
    participant VTubeStudioPlugin as VTubeStudio Plugin
    participant VTSApp as VTube Studio Application

    User->>AmaidesuCore: Initiate Action (e.g., send message)
    AmaidesuCore->>TTSPlugin: speak(text_to_synthesize)
    TTSPlugin->>AmaidesuCore: Get Service ("vts_lip_sync")
    AmaidesuCore-->>TTSPlugin: vts_lip_sync_service (VTubeStudioPlugin)
    TTSPlugin->>VTubeStudioPlugin: start_lip_sync_session(text_to_synthesize)
    loop Audio Stream Processing
        TTSPlugin->>TTSPlugin: Generate audio_chunk
        TTSPlugin->>VTubeStudioPlugin: process_tts_audio(audio_chunk, sample_rate)
        VTubeStudioPlugin->>VTubeStudioPlugin: analyze_audio_chunk(audio_chunk)
        VTubeStudioPlugin->>VTSApp: Set Lip Sync Parameters (VoiceVolume, MouthOpen, VoiceA, etc.)
    end
    TTSPlugin->>VTubeStudioPlugin: stop_lip_sync_session()
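
The per-chunk analysis step in the diagram above can be illustrated with a much simpler stand-in than the PR's librosa/scipy-based vowel detection. The sketch below only computes an RMS volume from 16-bit PCM and applies exponential smoothing. The names `smoothing_factor`, `volume_threshold`, and `current_volume` are borrowed from the class diagram, but how the plugin actually combines them is an assumption here, as are the helper names `rms_volume` and `VolumeSmoother`.

```python
import math
import struct


def rms_volume(pcm: bytes) -> float:
    """Return the RMS level of 16-bit little-endian mono PCM, scaled to 0..1."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    mean_square = sum(s * s for s in samples) / count
    return math.sqrt(mean_square) / 32768.0


class VolumeSmoother:
    """Hypothetical helper: gate quiet chunks, then smooth the volume over time."""

    def __init__(self, smoothing_factor: float = 0.3, volume_threshold: float = 0.01):
        # Assumption: smoothing_factor is the weight given to the newest chunk.
        self.smoothing_factor = smoothing_factor
        self.volume_threshold = volume_threshold
        self.current_volume = 0.0

    def update(self, pcm: bytes) -> float:
        raw = rms_volume(pcm)
        if raw < self.volume_threshold:
            raw = 0.0  # treat anything below the threshold as silence
        # Exponential moving average towards the new value.
        self.current_volume += self.smoothing_factor * (raw - self.current_volume)
        return self.current_volume
```

A value like `current_volume` would then be sent to VTube Studio as a parameter such as VoiceVolume or MouthOpen; that request is omitted here because it depends on the plugin's VTS client.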

Sequence Diagram: Danmaku Replay from File in BiliDanmakuSeleniumPlugin

sequenceDiagram
    participant BiliDanmakuSeleniumPlugin as Plugin
    participant Filesystem
    participant AmaidesuCore

    Note over Plugin: In file_only_mode
    Plugin->>Plugin: setup() called
    Plugin->>Plugin: _load_danmaku_from_file()
    Plugin->>Filesystem: Read danmaku_load_file (e.g., danmaku_ROOMID.jsonl)
    Filesystem-->>Plugin: JSONL data lines
    loop For each line in JSONL data
        Plugin->>Plugin: Parse JSON to MessageBase object
        Plugin->>Plugin: Add MessageBase to loaded_danmaku_queue
    end
    Plugin->>Plugin: _run_file_replay_loop() starts
    loop For each message_base in loaded_danmaku_queue
        Plugin->>Plugin: Calculate wait_time (based on message_base.message_info.time)
        Plugin->>Plugin: await asyncio.wait_for(stop_event, timeout=wait_time)
        Plugin->>Plugin: message_cache_service.cache_message(message_base)
        Plugin->>AmaidesuCore: send_to_maicore(message_base)
    end
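
The load-then-replay flow above can be sketched with the standard library alone. This is a minimal illustration, not the plugin's code: messages are plain dicts rather than `MessageBase` objects, the `message_info.time` field layout is assumed from the diagram, and `load_danmaku`, `replay`, and the `send` callback are hypothetical names. Note that `asyncio.wait_for` needs an awaitable, so the sketch waits on `stop_event.wait()` rather than on the event object itself.

```python
import asyncio
import json


def load_danmaku(lines):
    """Parse JSONL lines into dicts, skipping blank and malformed lines."""
    queue = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            queue.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # a real plugin would probably log this
    return queue


async def replay(queue, send, stop_event, speed=1.0):
    """Send messages in order, spacing them by their original timestamps."""
    previous_time = None
    for message in queue:
        timestamp = message["message_info"]["time"]
        if previous_time is None:
            wait_time = 0.0
        else:
            wait_time = max(0.0, (timestamp - previous_time) / speed)
        previous_time = timestamp
        try:
            # Sleep for wait_time, but wake up early if a stop is requested.
            await asyncio.wait_for(stop_event.wait(), timeout=wait_time)
            return
        except asyncio.TimeoutError:
            pass  # the timeout elapsed normally, so send the message
        await send(message)
```

Waiting on the stop event instead of calling `asyncio.sleep` means a long gap between recorded messages does not delay shutdown.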

Sequence Diagram: Live Danmaku Saving in BiliDanmakuSeleniumPlugin

sequenceDiagram
    participant BiliDanmakuSeleniumPlugin as Plugin
    participant WebDriver
    participant Filesystem
    participant AmaidesuCore

    Note over Plugin: Live danmaku monitoring
    Plugin->>WebDriver: Fetch raw danmaku elements from Bilibili page
    WebDriver-->>Plugin: Raw danmaku elements
    Plugin->>Plugin: Parse elements into DanmakuMessage objects
    loop For each DanmakuMessage
        Plugin->>Plugin: _create_message_base(danmaku_message)
        Plugin->>Plugin: message_base created
        Plugin->>Plugin: message_cache_service.cache_message(message_base)
        opt enable_danmaku_save is true
            Plugin->>Plugin: _save_danmaku_to_file(message_base)
            Plugin->>Filesystem: Write MessageBase as JSON to danmaku_save_file
        end
        Plugin->>AmaidesuCore: send_to_maicore(message_base)
    end
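
The optional save step can be pictured as appending one JSON object per line. The sketch below assumes the message has already been converted to a plain dict (the PR serialises `MessageBase` objects, and that conversion is not shown), and `save_danmaku` is a hypothetical helper name.

```python
import json
from pathlib import Path


def save_danmaku(message: dict, save_file_path: Path) -> None:
    """Append one message to a JSONL file, creating the data directory if needed."""
    save_file_path.parent.mkdir(parents=True, exist_ok=True)
    with save_file_path.open("a", encoding="utf-8") as handle:
        # ensure_ascii=False keeps Chinese danmaku text readable in the file.
        handle.write(json.dumps(message, ensure_ascii=False) + "\n")
```

Appending line by line keeps earlier messages intact if the process is interrupted, and the resulting file can be read back one line at a time for replay.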

Class Diagram: VTubeStudioPlugin Lip Sync Enhancements

classDiagram
    class VTubeStudioPlugin {
      +lip_sync_enabled: bool
      +volume_threshold: float
      +smoothing_factor: float
      +vowel_detection_sensitivity: float
      +sample_rate: int
      +min_accumulation_duration: float
      +playback_sync_enabled: bool
      -audio_buffer: deque
      -current_vowel_values: Dict[str, float]
      -current_volume: float
      -is_speaking: bool
      -audio_analysis_lock: threading.Lock
      -accumulated_audio: bytearray
      -accumulation_start_time: float
      -audio_playback_start_time: float
      -vowel_formants: Dict[str, List[int]]
      +setup()
      +analyze_audio_chunk(audio_data: bytes, sample_rate: int) Dict[str, float]
      -_analyze_vowel_features(magnitude: np.ndarray, freqs: np.ndarray) Dict[str, float]
      +process_tts_audio(audio_data: bytes, sample_rate: int)
      -_update_lip_sync_parameters(analysis_result: Dict[str, float])
      +start_lip_sync_session(text: str)
      +stop_lip_sync_session()
      +reset_playback_timing()
    }
    VTubeStudioPlugin --|> BasePlugin
    VTubeStudioPlugin ..> AmaidesuCore : uses

Class Diagram: BiliDanmakuSeleniumPlugin File and Configuration Enhancements

classDiagram
    class BiliDanmakuSeleniumPlugin {
      +config: Dict[str, Any]
      +enabled: bool
      +enable_danmaku_save: bool
      +danmaku_save_file: str
      +save_file_path: Path
      +enable_danmaku_load: bool
      +danmaku_load_file: str
      +load_file_path: Path
      +skip_initial_danmaku: bool
      +data_dir: Path
      -is_initial_load: bool
      -initial_load_complete: bool
      -loaded_danmaku_queue: List[MessageBase]
      -loaded_danmaku_index: int
      -file_only_mode: bool
      -shutdown_timeout: int
      -cleanup_lock: threading.Lock
      -is_shutting_down: bool
      +__init__(core: AmaidesuCore, config: Dict[str, Any])
      -_setup_signal_handlers()
      -_graceful_shutdown()
      +setup()
      +cleanup()
      -_create_webdriver()
      -_run_monitoring_loop()
      -_run_file_replay_loop()
      -_run_live_monitoring_loop()
      -_fetch_and_process_messages()
      -_load_danmaku_from_file()
      -_save_danmaku_to_file(message_base: MessageBase)
      -_send_loaded_danmaku()
    }
    BiliDanmakuSeleniumPlugin --|> BasePlugin
    BiliDanmakuSeleniumPlugin ..> AmaidesuCore : uses
    BiliDanmakuSeleniumPlugin o-- MessageCacheService : uses

Class Diagram: TTS Plugins Integration with VTube Studio Lip Sync Service

classDiagram
    class IVTubeStudioLipSyncService {
        <<Interface>>
        +start_lip_sync_session(text: str)
        +process_tts_audio(audio_data: bytes, sample_rate: int)
        +stop_lip_sync_session()
    }
    class TTSPlugin {
        -_speak(text: str)
        -_play_with_lip_sync(audio_data: np.ndarray, samplerate: int, vts_lip_sync_service: IVTubeStudioLipSyncService)
    }
    class GPTSoVitsTTSPlugin {
        -decode_and_buffer(wav_chunk)
        -_speak(text: str)
    }
    TTSPlugin ..> IVTubeStudioLipSyncService : uses
    GPTSoVitsTTSPlugin ..> IVTubeStudioLipSyncService : uses
    TTSPlugin --|> BasePlugin
    GPTSoVitsTTSPlugin --|> BasePlugin
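
One way to read the interface above is as a `typing.Protocol` that a TTS plugin calls around playback. The sketch assumes all three methods are coroutines (the review below only shows `process_tts_audio` being scheduled as one), and `speak_with_lip_sync` is a hypothetical helper, not a function from the PR.

```python
from typing import Iterable, Optional, Protocol


class LipSyncService(Protocol):
    """Assumed shape of the service registered as "vts_lip_sync"."""

    async def start_lip_sync_session(self, text: str) -> None: ...
    async def process_tts_audio(self, audio_data: bytes, sample_rate: int) -> None: ...
    async def stop_lip_sync_session(self) -> None: ...


async def speak_with_lip_sync(
    text: str,
    chunks: Iterable[bytes],
    sample_rate: int,
    service: Optional[LipSyncService] = None,
) -> None:
    """Feed audio chunks to the lip-sync service, if one is registered."""
    if service is not None:
        await service.start_lip_sync_session(text)
    try:
        for chunk in chunks:
            # Actual audio playback would happen here.
            if service is not None:
                await service.process_tts_audio(chunk, sample_rate)
    finally:
        # Always close the session, even if playback raised.
        if service is not None:
            await service.stop_lip_sync_session()
```

Treating the service as optional lets the TTS plugins keep working when the VTube Studio plugin is not loaded.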

File-Level Changes

Change: Enhance BiliDanmakuSeleniumPlugin with file save/load, skip-initial, pure-file replay and improved driver/config handling
Details:
  • Switch to explicit config loading and validation (rename plugin_config to config)
  • Clamp poll_interval and max_messages_per_check
  • Implement skip_initial_danmaku flag to ignore historical messages
  • Add enable_danmaku_save/load, data directory creation and JSONL save/load APIs
  • Introduce pure file mode with time-axis replay loop
  • Support custom chromedriver_path with webdriver-manager fallback
  • Refactor monitoring loop into live and file replay branches
Files:
  • src/plugins/bili_danmaku_selenium/plugin.py
  • src/plugins/bili_danmaku_selenium/README.md
  • src/plugins/bili_danmaku_selenium/config-template.toml

Change: Add lip-sync functionality to VTubeStudioPlugin
Details:
  • Import and detect audio analysis libraries (librosa, scipy)
  • Load lip_sync config section and initialize analysis state variables
  • Register vts_lip_sync service to Core
  • Implement analyze_audio_chunk and _analyze_vowel_features for vowel detection
  • Add process_tts_audio, _update_lip_sync_parameters, start/stop/reset lip sync session methods
  • Extend README and config-template with lip sync settings and update requirements
Files:
  • src/plugins/vtube_studio/plugin.py
  • src/plugins/vtube_studio/README.md
  • src/plugins/vtube_studio/config-template.toml
  • src/plugins/vtube_studio/requirements.txt

Change: Integrate lip-sync sessions into TTSPlugin playback
Details:
  • Retrieve vts_lip_sync service to start session before TTS speak
  • Implement _play_with_lip_sync for chunked playback and real-time audio callbacks
  • Stop lip sync session after playback
Files:
  • src/plugins/tts/plugin.py

Change: Hook GPTSoVITS TTS pipeline into lip-sync service
Details:
  • Invoke process_tts_audio in decode_and_buffer
  • Start and stop lip sync sessions around _speak
Files:
  • src/plugins/gptsovits_tts/plugin.py
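
The "clamp poll_interval and max_messages_per_check" item can be sketched as a small validation helper. The bounds and defaults below are invented for illustration (the PR's actual limits are not shown in this conversation), and `load_settings` is a hypothetical function name.

```python
from typing import Any, Dict

# Hypothetical bounds, chosen only for illustration.
POLL_INTERVAL_RANGE = (0.1, 60.0)
MAX_MESSAGES_RANGE = (1, 100)


def _number(value: Any, default: float) -> float:
    """Return value as a float if it is a real number (not bool), else the default."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return default
    return float(value)


def _clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def load_settings(config: Dict[str, Any]) -> Dict[str, Any]:
    """Type-check, default, and clamp the two polling options."""
    poll_interval = _clamp(
        _number(config.get("poll_interval"), 1.0), *POLL_INTERVAL_RANGE
    )
    max_messages = int(
        _clamp(_number(config.get("max_messages_per_check"), 10), *MAX_MESSAGES_RANGE)
    )
    return {"poll_interval": poll_interval, "max_messages_per_check": max_messages}
```

Rejecting `bool` explicitly matters because `True` is an `int` in Python and would otherwise pass the type check.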


@gemini-code-assist gemini-code-assist Bot left a comment

Summary of Changes

Hello @tcmofashi, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the Bilibili Selenium plugin with danmaku saving and replaying capabilities, including an offline replay mode. Additionally, it significantly upgrades the VTube Studio plugin by adding real-time lip synchronization driven by the audio output from TTS plugins, providing a more dynamic and responsive avatar experience.

Highlights

  • VTube Studio Lip Sync: Adds real-time lip synchronization functionality to the VTube Studio plugin. It integrates with TTS plugins (like Edge TTS and GPT-SoVITS) to analyze the audio output during speech and control VTS parameters such as VoiceVolume, VoiceSilence, and individual vowel parameters (VoiceA, I, U, E, O) for more realistic avatar mouth movements. Includes configuration options for sensitivity, smoothing, and playback time synchronization.
  • Bilibili Danmaku Save/Replay: Introduces the ability to save incoming danmaku messages from the Selenium plugin to a JSONL file. It also adds a feature to load and replay danmaku from a saved file, including a 'pure file mode' that operates offline and attempts to synchronize the replay timing based on the original message timestamps.
  • Bilibili Danmaku Initial Skip: Adds a configuration option (skip_initial_danmaku) to the Bilibili Selenium plugin to ignore messages that are already present in the chat when the plugin starts, focusing only on newly arriving messages.
  • Configuration Updates: Adds new configuration options in config-template.toml for both the Bilibili Selenium plugin (danmaku save/load paths, skip initial, chromedriver path) and the VTube Studio plugin (detailed lip sync parameters).
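
The skip-initial behaviour described in the highlights can be sketched as a filter that swallows the first batch while still remembering it. This is an assumption-laden illustration: messages are dicts with a hypothetical `id` key, and `DanmakuFilter` is not a class from the PR (only the `skip_initial_danmaku` option and the `is_initial_load` flag appear in it).

```python
class DanmakuFilter:
    """Hypothetical helper: drop messages already on screen at startup."""

    def __init__(self, skip_initial_danmaku: bool = True):
        self.skip_initial_danmaku = skip_initial_danmaku
        self.is_initial_load = True
        self.seen_ids = set()

    def filter_new(self, batch):
        """Return only messages not seen before, honouring the skip option."""
        fresh = [m for m in batch if m["id"] not in self.seen_ids]
        self.seen_ids.update(m["id"] for m in fresh)
        if self.is_initial_load:
            self.is_initial_load = False
            if self.skip_initial_danmaku:
                return []  # remember the backlog, but do not forward it
        return fresh
```

Recording the initial batch as seen is what keeps those messages from being forwarded on the next poll.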

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces significant enhancements, notably file-based danmaku saving/replaying for the Bilibili plugin and real-time lip-sync for VTube Studio, integrated with TTS plugins. The documentation and configuration templates have been thoroughly updated.

Key points for review include:

  • A potential issue in bili_danmaku_selenium/plugin.py where live danmaku might not be processed by MaiCore if enable_danmaku_load is false.
  • In tts/plugin.py, the method for scheduling asyncio tasks from the sounddevice callback thread should be revised to use asyncio.run_coroutine_threadsafe.
  • A minor typo in a comment in vtube_studio/plugin.py.

await self.core.send_to_maicore(message_base)

# Send the message
# await self.core.send_to_maicore(message_base)
Severity: high

The line await self.core.send_to_maicore(message_base) is currently commented out. This means that live messages fetched by _fetch_and_process_messages are cached and potentially saved (if enable_danmaku_save is true), but they are not sent to MaiCore for processing unless enable_danmaku_load is also true and they are subsequently loaded from the file.

If the intention is for live danmaku to be processed in real-time regardless of the load/save settings (which is typical for a live danmaku plugin), this line should likely be uncommented. Otherwise, the documentation should clearly state that live messages are only processed if they are saved and then loaded, or if enable_danmaku_load is active for some other reason.

Could you clarify the intended behavior here? If live messages should always be processed, please uncomment this line.

Comment thread src/plugins/tts/plugin.py (outdated), comment on lines +366 to +368:
asyncio.create_task(
    vts_lip_sync_service.process_tts_audio(chunk_bytes, sample_rate=samplerate)
)
Severity: high

The audio_callback is executed by sounddevice in a separate, non-asyncio thread. Calling asyncio.create_task directly from such a thread might not schedule the coroutine on the main event loop where AmaidesuCore and other asyncio components are running. This can lead to unexpected behavior or errors if the coroutine tries to interact with objects tied to the main loop.

To safely schedule vts_lip_sync_service.process_tts_audio on the main event loop from this callback thread, you should use asyncio.run_coroutine_threadsafe.

Suggested change
- asyncio.create_task(
-     vts_lip_sync_service.process_tts_audio(chunk_bytes, sample_rate=samplerate)
- )
+ # Capture the main event loop once, on the event-loop thread, before starting the stream:
+ # main_loop = asyncio.get_running_loop()
+ coro = vts_lip_sync_service.process_tts_audio(chunk_bytes, sample_rate=samplerate)
+ asyncio.run_coroutine_threadsafe(coro, main_loop)  # pass the captured loop; get_running_loop() raises in the callback thread
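
The threading point in this comment can be demonstrated without sounddevice. In the sketch below a plain `threading.Thread` stands in for the audio callback thread, and `process_tts_audio` is a stub that records what it receives; both are assumptions made for the demo. The loop is captured on the event-loop thread and handed to the callback, because `asyncio.get_running_loop()` raises `RuntimeError` when called from a thread with no running loop.

```python
import asyncio
import threading


async def process_tts_audio(chunk: bytes, sample_rate: int, received: list) -> None:
    """Stub for the lip-sync coroutine; records chunk size and sample rate."""
    received.append((len(chunk), sample_rate))


async def main() -> list:
    received = []
    main_loop = asyncio.get_running_loop()  # captured on the event-loop thread

    def audio_callback(chunk: bytes) -> None:
        # Runs on a worker thread where no event loop is running.
        future = asyncio.run_coroutine_threadsafe(
            process_tts_audio(chunk, 32000, received), main_loop
        )
        # Blocking here only makes the demo deterministic; a real audio
        # callback should return quickly and not wait on the result.
        future.result(timeout=1.0)

    worker = threading.Thread(target=audio_callback, args=(b"\x00" * 1024,))
    worker.start()
    # Join in an executor so the event loop stays free to run the coroutine.
    await main_loop.run_in_executor(None, worker.join)
    return received


if __name__ == "__main__":
    print(asyncio.run(main()))
```

`run_coroutine_threadsafe` returns a `concurrent.futures.Future`, so the calling thread can ignore it, add a done-callback for error logging, or wait on it as shown.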

danmaku_data = json.loads(line.strip())

# Convert the dict into a MessageBase object
from maim_message import MessageBase
Severity: medium

The import from maim_message import MessageBase is done locally within the _load_danmaku_from_file method. MessageBase is already imported at the module level (line 37). Rely on module-level imports for consistency and to avoid potential shadowing or repeated import overhead. Remove this local import and use the MessageBase already available in the module's scope.

ChangingSelf merged commit 7719805 into Mai-with-u:dev on Jun 18, 2025 (1 check passed)