[bugfix] fix bug in srt_api.py #826

Merged
Luodian merged 2 commits into EvolvingLMMs-Lab:main from zzhbrr:fix_bug_sglang_backend
Sep 18, 2025
@zzhbrr zzhbrr commented Sep 17, 2025

Fix a bug in srt_api.py. With this fix, the problem reported in #823 is solved.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed a typo and corrected behavior so the time annotation is only inserted for video inputs when the "add time instruction" option is enabled and image frames are present; non-video modalities remain unchanged.
  • Chores

    • Minor formatting/indentation cleanup across generation paths. No changes to error handling, outputs, or public interfaces.

coderabbitai Bot commented Sep 17, 2025

Walkthrough

Time-instruction handling in generation was fixed and gated: the variable name was corrected to time_instruction, and the time instruction is now inserted only when add_time_instruction is true, modality == "video", and imgs is not None; applied to both async generate and generate_sync.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| SRT time-instruction logic: lmms_eval/models/simple/srt_api.py | Fixed typo (time_instruciton → time_instruction) and changed unconditional insertion to a conditional insertion of the time instruction only when add_time_instruction is true, modality == "video", and imgs is not None. Change applied in both async generate and generate_sync. |

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant SRT_API
    participant VideoDecoder

    Caller->>SRT_API: generate(..., modality, imgs, add_time_instruction)
    alt modality != "video" or imgs is None or add_time_instruction == false
        SRT_API-->>Caller: proceed without time instruction
    else modality == "video" and imgs exists and add_time_instruction == true
        SRT_API->>VideoDecoder: decode video (get frame_time, video_time)
        VideoDecoder-->>SRT_API: frame_time, video_time
        SRT_API->>SRT_API: build time_instruction using frame_time/video_time
        SRT_API-->>Caller: include time instruction in prompt
    end
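
The gate shown in the diagram can be sketched as a small pure function. This is an illustration only, not code from srt_api.py: the helper name `prepend_time_instruction` and its parameter list are assumptions, and the extra `time_instruction is not None` guard is an addition that the merged condition does not include.

```python
from typing import Optional, Sequence


def prepend_time_instruction(
    contexts: str,
    time_instruction: Optional[str],
    add_time_instruction: bool,
    modality: str,
    imgs: Optional[Sequence[str]],
) -> str:
    """Return contexts, prefixed with the time instruction for video only.

    Hypothetical helper mirroring the gate from the walkthrough: the flag is
    on, the modality is "video", and frames were collected.
    """
    should_add = (
        add_time_instruction
        and modality == "video"
        and imgs is not None
        # Assumed extra guard: skip when no instruction was built.
        and time_instruction is not None
    )
    if not should_add:
        return contexts
    return f"{time_instruction}\n{contexts}"
```

Written this way, non-video requests return the prompt unchanged, which matches the "non-video modalities remain unchanged" note in the summary.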

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I fixed a name and checked the time,
Only add it when the frames align.
A tidy hop, a careful bite,
The captions hum through day and night. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description Check | ⚠️ Warning | The PR description is minimal: it says the bug was fixed and links issue #823 but does not follow the repository template's checklist or provide a detailed description of what changed, which files/lines were modified, or how the fix was tested, so it is incomplete relative to the template. | Please expand the PR description to follow the template by confirming the descriptive title, adding a detailed summary of the code changes (what was changed in srt_api.py and why), linking relevant commits and issue #823, and including testing steps/results and any lint/pre-commit instructions. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The PR title "[bugfix] fix bug in srt_api.py" correctly identifies the primary file changed (lmms_eval/models/simple/srt_api.py) and indicates this is a bugfix, making it concise and relevant for reviewers though somewhat generic. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
lmms_eval/models/simple/srt_api.py (3)

185-189: Prevent NameError; fix typo and spacing; wrap to 88 cols

time_instruciton is misspelled, may be undefined (non‑video or decode error), and the sentence lacks a space before “Please”. Also exceeds 88 chars.

Apply:

-                time_instruciton = f"The video lasts for {video_time:.2f} seconds, and {len(frames)} frames are uniformly sampled from it. These frames are located at {frame_time}.Please answer the following questions related to this video."
+                time_instruction = (
+                    f"The video lasts for {video_time:.2f} seconds, and "
+                    f"{len(frames)} frames are uniformly sampled from it. "
+                    f"These frames are located at {frame_time}. Please answer "
+                    f"the following questions related to this video."
+                )
-        if self.add_time_instruction:
-            contexts = f"{time_instruciton}\n{contexts}"
+        if (
+            self.add_time_instruction
+            and self.modality == "video"
+            and "time_instruction" in locals()
+        ):
+            contexts = f"{time_instruction}\n{contexts}"

Suggested cleaner follow‑up (outside this hunk): initialize time_instruction: str | None = None before the loop; set it in the try block; then check if self.add_time_instruction and time_instruction: to avoid locals() usage.
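
The follow-up suggested above (bind the name before the loop, assign it only on success, then test it) can be sketched as follows. This is a minimal illustration under assumptions: `collect_time_instruction` and `fake_encode_video` are hypothetical stand-ins, the encoder signature is simplified, and the instruction text is abbreviated.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical encoder signature: returns (frames, frame_time, video_time).
Encoder = Callable[[str], Tuple[List[str], str, float]]


def collect_time_instruction(
    visuals: List[str], encode_video: Encoder
) -> Optional[str]:
    """Return a time instruction, or None if no video decoded successfully."""
    # Initialise before the loop so the name is always bound.
    time_instruction: Optional[str] = None
    for visual in visuals:
        try:
            frames, frame_time, video_time = encode_video(visual)
        except Exception:
            # Decode failed: leave time_instruction as it was.
            continue
        # Abbreviated text; the real instruction is longer.
        time_instruction = (
            f"The video lasts for {video_time:.2f} seconds, and "
            f"{len(frames)} frames are located at {frame_time}."
        )
    return time_instruction


def fake_encode_video(path: str) -> Tuple[List[str], str, float]:
    """Stand-in decoder that fails for paths ending in '.bad'."""
    if path.endswith(".bad"):
        raise ValueError("cannot decode")
    return ["frame0", "frame1"], "0.00s,5.00s", 10.0
```

A caller can then write `if add_time_instruction and time_instruction:` without consulting `locals()`, and a failed decode can no longer surface as a NameError.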


246-250: Mirror the async fix in sync path

Same issues as above: typo, potential undefined variable, missing space, long line. Keep both paths consistent.

Apply:

-                time_instruciton = f"The video lasts for {video_time:.2f} seconds, and {len(frames)} frames are uniformly sampled from it. These frames are located at {frame_time}.Please answer the following questions related to this video."
+                time_instruction = (
+                    f"The video lasts for {video_time:.2f} seconds, and "
+                    f"{len(frames)} frames are uniformly sampled from it. "
+                    f"These frames are located at {frame_time}. Please answer "
+                    f"the following questions related to this video."
+                )
-        if self.add_time_instruction:
-            contexts = f"{time_instruciton}\n{contexts}"
+        if (
+            self.add_time_instruction
+            and self.modality == "video"
+            and "time_instruction" in locals()
+        ):
+            contexts = f"{time_instruction}\n{contexts}"

Same optional cleanup as suggested for the async path.


294-294: Attribute error: use _rank (defined) instead of rank (undefined)

self.rank isn’t set; _rank is. Current code will raise AttributeError.

Apply:

-        pbar = tqdm(total=len(requests), disable=(self.rank != 0), desc="Model Responding")
+        pbar = tqdm(
+            total=len(requests),
+            disable=(self._rank != 0),
+            desc="Model Responding",
+        )
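
Whether `self.rank` resolves depends on what the class and its base classes define, which is not visible in this thread. The sketch below only illustrates the failure mode and one possible remedy; `WithoutRank`, `WithRank`, and `is_main_process` are hypothetical names.

```python
class WithoutRank:
    """Hypothetical model that only sets the private attribute."""

    def __init__(self) -> None:
        self._rank = 0


class WithRank(WithoutRank):
    """Same model, exposing the private attribute through a property."""

    @property
    def rank(self) -> int:
        return self._rank


def is_main_process(model: object) -> bool:
    """Return True when the model reports rank 0 via the public name."""
    # Raises AttributeError if neither the class nor a base defines `rank`.
    return getattr(model, "rank") == 0
```

If a base class already provides a `rank` property along these lines, the original `self.rank` call works as written and the change is purely stylistic.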
🧹 Nitpick comments (2)
lmms_eval/models/simple/srt_api.py (2)

287-289: Loguru formatting misuse; arguments ignored

loguru expects {} placeholders. Current calls won’t print values.

Apply:

-        eval_logger.info("Question:", contexts)
-        eval_logger.info("Answer:", response_text)
+        eval_logger.info("Question: {}", contexts)
+        eval_logger.info("Answer: {}", response_text)
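
loguru renders positional arguments with `str.format`-style braces, so the effect can be reproduced without importing loguru. In this sketch `render` is a hypothetical helper that imitates only that formatting step, not loguru's full pipeline.

```python
def render(message: str, *args: object) -> str:
    """Imitate brace-style formatting of a log message with positional args."""
    return message.format(*args)


# No placeholder: the extra argument is silently dropped.
print(render("Question:", "What happens in the video?"))
# With a placeholder: the argument is rendered in place.
print(render("Question: {}", "What happens in the video?"))
```

Using a placeholder rather than an f-string also means the prompt text is passed as an argument, not parsed as a template, so braces inside it are not interpreted as fields.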

121-128: Type hints and public docstrings missing (repo guidelines)

File lacks type hints and docstrings for public APIs (e.g., encode_image, encode_video, generate, generate_sync, generate_until). Please add them to comply with repo rules.

I can draft precise type annotations and concise docstrings if helpful.

Also applies to: 129-159, 167-176, 196-205, 257-266, 320-326

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8b44ff and ea8a87b.

📒 Files selected for processing (1)
  • lmms_eval/models/simple/srt_api.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Type hints are required for all Python code
Public APIs must have docstrings
Maximum line length is 88 characters
Use PEP 8 naming: snake_case for functions/variables
Class names must use PascalCase
Constants should be in UPPER_SNAKE_CASE
Use f-strings for string formatting
Use early returns to avoid nested conditions
Use descriptive names; prefix handler functions with 'handle'
Prefer constants over functions where possible
Prefer functional, immutable approaches when not verbose
Define composing (higher-level) functions before their components
Mark issues in existing code with TODO: prefix in comments
Use functional and stateless approaches where they improve clarity
Use Ruff to enforce: import sorting (I001) and no unused imports
For long strings, wrap using parentheses rather than backslashes
Format long function calls over multiple lines with proper indentation
Split long import lists across multiple lines
Use Pyright type checking: add explicit None checks for Optional values
Use Pyright type narrowing for strings where applicable
Use Ruff (via pre-commit) to format and lint Python files
Document public APIs and test thoroughly

Files:

  • lmms_eval/models/simple/srt_api.py


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lmms_eval/models/simple/srt_api.py (1)

218-224: Avoid blocking sleep in async path (use asyncio.sleep).

time.sleep blocks the event loop in generate(), degrading concurrency.

-                if attempt < 4:
-                    time.sleep(NUM_SECONDS_TO_SLEEP)
+                if attempt < 4:
+                    await asyncio.sleep(NUM_SECONDS_TO_SLEEP)
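
A non-blocking retry loop of this shape might look like the sketch below. It rests on assumptions: `retry_async` and `make_flaky` are hypothetical, the value of `NUM_SECONDS_TO_SLEEP` is chosen for the demo, and re-raising on the final attempt is one possible policy, not necessarily what generate() does.

```python
import asyncio
from typing import Awaitable, Callable

NUM_SECONDS_TO_SLEEP = 0.01  # assumed value, kept small for the demo


async def retry_async(
    call: Callable[[], Awaitable[str]], max_attempts: int = 5
) -> str:
    """Await call(), retrying on failure without blocking the event loop."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # asyncio.sleep yields control so other coroutines keep running.
            await asyncio.sleep(NUM_SECONDS_TO_SLEEP)
    raise RuntimeError("max_attempts must be at least 1")


def make_flaky(failures: int) -> Callable[[], Awaitable[str]]:
    """Build a coroutine function that fails `failures` times, then succeeds."""
    remaining = [failures]

    async def flaky() -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ConnectionError("transient error")
        return "ok"

    return flaky
```

For example, `asyncio.run(retry_async(make_flaky(2)))` succeeds on the third attempt, and other tasks scheduled on the same loop continue to run during each wait.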
🧹 Nitpick comments (2)
lmms_eval/models/simple/srt_api.py (2)

185-189: Good gating; move time_instruction inside try, wrap to 88 cols, and fix spacing.

Keeps variables in-scope only on success, conforms to line-length rule, and fixes ".Please" spacing. The else branch is redundant.

Confirm there’s at most one video per request; otherwise this instruction reflects only the last processed video.

-                time_instruction = f"The video lasts for {video_time:.2f} seconds, and {len(frames)} frames are uniformly sampled from it. These frames are located at {frame_time}.Please answer the following questions related to this video."
-        if self.add_time_instruction and self.modality == "video" and imgs is not None:
-            contexts = f"{time_instruction}\n{contexts}"
-        else:
-            contexts = f"{contexts}"
+        if self.add_time_instruction and self.modality == "video" and imgs is not None:
+            contexts = f"{time_instruction}\n{contexts}"

And inside the try block above:

                 try:
-                    frames, frame_time, video_time = self.encode_video(visual, self.max_frames_num)
-                    imgs.extend(frames)
+                    frames, frame_time, video_time = self.encode_video(
+                        visual, self.max_frames_num
+                    )
+                    imgs.extend(frames)
+                    time_instruction = (
+                        f"The video lasts for {video_time:.2f} seconds, and "
+                        f"{len(frames)} frames are uniformly sampled from it. "
+                        f"These frames are located at {frame_time}. "
+                        "Please answer the following questions related to this video."
+                    )
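
The wrapped string relies on Python joining adjacent string literals inside parentheses, which is also how the ".Please" spacing bug is avoided. A standalone sketch, assuming a hypothetical helper `build_time_instruction` whose parameters stand in for the values returned by encode_video:

```python
def build_time_instruction(
    video_time: float, num_frames: int, frame_time: str
) -> str:
    """Build the time instruction from adjacent string literals.

    Python joins adjacent literals inside parentheses at compile time, so
    each piece must carry its own trailing space.
    """
    return (
        f"The video lasts for {video_time:.2f} seconds, and "
        f"{num_frames} frames are uniformly sampled from it. "
        f"These frames are located at {frame_time}. "
        "Please answer the following questions related to this video."
    )
```

Keeping the final piece a plain literal, as in the suggestion above, avoids an f-string with no placeholders.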

246-251: Mirror the same string wrap/spacing and drop redundant else in generate_sync.

Consistent formatting, 88-col compliance, and cleaner control flow.

-                time_instruction = f"The video lasts for {video_time:.2f} seconds, and {len(frames)} frames are uniformly sampled from it. These frames are located at {frame_time}.Please answer the following questions related to this video."
-        if self.add_time_instruction and self.modality == "video" and imgs is not None:
-            contexts = f"{time_instruction}\n{contexts}"
-        else:
-            contexts = f"{contexts}"
+        if self.add_time_instruction and self.modality == "video" and imgs is not None:
+            contexts = f"{time_instruction}\n{contexts}"

And inside the try block above:

                 try:
-                    frames, frame_time, video_time = self.encode_video(visual, self.max_frames_num)
-                    imgs.extend(frames)
+                    frames, frame_time, video_time = self.encode_video(
+                        visual, self.max_frames_num
+                    )
+                    imgs.extend(frames)
+                    time_instruction = (
+                        f"The video lasts for {video_time:.2f} seconds, and "
+                        f"{len(frames)} frames are uniformly sampled from it. "
+                        f"These frames are located at {frame_time}. "
+                        "Please answer the following questions related to this video."
+                    )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea8a87b and 6911aaf.

📒 Files selected for processing (1)
  • lmms_eval/models/simple/srt_api.py (2 hunks)

@Luodian Luodian merged commit 52fc5fe into EvolvingLMMs-Lab:main Sep 18, 2025
1 of 2 checks passed
Luodian added a commit that referenced this pull request Feb 28, 2026
* fix bug

* fix: Correct time_instruction variable name and add modality check for video context

---------

Co-authored-by: Bo Li <drluodian@gmail.com>
stisiTT pushed a commit to bgoelTT/lmms-eval that referenced this pull request Mar 6, 2026
* fix bug

* fix: Correct time_instruction variable name and add modality check for video context

---------

Co-authored-by: Bo Li <drluodian@gmail.com>