Minor Typo in Figure Reference #129

@YUCHENYUXI

The paper "VITA: Towards Open-Source Interactive Omni Multimodal LLM" contains a minor typo in a figure reference: Section 3.4.2 cites Fig. 1 where Fig. 2 is intended. This issue proposes correcting the reference.

Paper: VITA: Towards Open-Source Interactive Omni Multimodal LLM
ArXiv Version: arXiv:2408.05211v3 (30 May 2025)


Issue Details

Section: 3.4.2 Audio Interrupt Interaction

Original Text:

To achieve this, we propose the duplex deployment framework... As illustrated in Fig.1, two VITA models are deployed concurrently.

Proposed Correction:

To achieve this, we propose the duplex deployment framework... As illustrated in Fig. 2, two VITA models are deployed concurrently.


Reasoning

  • The duplex deployment scheme, which involves two concurrently deployed VITA models, is explicitly shown in Figure 2.
  • The paper's "Introduction" section correctly references this architecture, stating: "As shown in Fig. 2, two VITA models are deployed simultaneously: one is responsible for generating responses to user queries, and the other continuously tracks environmental inputs...".
  • Figure 1 is titled "Interaction of VITA" and demonstrates the user interaction flow, not the underlying two-model architecture.
