[BUG] Litellm: max_tokens no supported #341
Conversation
Pull Request Overview
This PR updates model references, test expectations, and LLM token-handling logic to support new LLM variants and token limit parameters.
- Tests now import and use `get_gpt_o4_model`, `get_gpt_mini_model`, and `get_gemini_flash_model`, and adjust expected output values.
- The LLM class gains `DEFAULT_OUTPUT_TOKENS` and `thinking_token_limit`, and refactors thinking-mode parameter handling, replacing manual `asyncio` loop management with `asyncio.run`.
- Global model getters are extended, and the extractor handles dict sources via `DocumentLoaderData`.
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/test_process.py | Switched to direct model string in splitter and adjusted expected age. |
| tests/test_extractor.py | Expanded model getters, renamed budget fields, added thinking-mode tests, and updated assertions. |
| extract_thinker/llm.py | Introduced default output tokens, thinking-token-limit, and refactored async and thinking-parameter flows. |
| extract_thinker/global_models.py | Added new model getters for GPT and Gemini Flash models. |
| extract_thinker/extractor.py | Auto-loads DocumentLoaderData when source is a dict. |
| extract_thinker/concatenation_handler.py | Added guard against empty responses in JSON continuation check. |
Comments suppressed due to low confidence (4)
tests/test_extractor.py:310
- The assertion uses `threshold=0.8` (80%), but the error message says 90%. Update the message to match the actual threshold.
assert semantically_similar(
result_1.title,
result_2.title,
threshold=0.8
), "Titles are not semantically similar enough (threshold: 90%)"
extract_thinker/global_models.py:9
- [nitpick] Both `get_lite_model()` and `get_big_model()` return the same string; if they are intended to be different, adjust accordingly or clarify why they match.
return "gemini/gemini-2.5-flash-preview-05-20"
extract_thinker/extractor.py:219
- Ensure `DocumentLoaderData` is imported at the top of the module; otherwise, this will raise a `NameError` at runtime.
self.document_loader = DocumentLoaderData()
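The fix is a module-level import. The exact import path below is an assumption based on the package layout, not taken from the PR:

```python
# Top of extract_thinker/extractor.py (import path assumed):
from extract_thinker.document_loader.document_loader_data import DocumentLoaderData
```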
extract_thinker/llm.py:175
- Confirm that `import asyncio` exists at module scope. If it was removed, calls to `asyncio.run` will fail with `NameError`.
result = asyncio.run(
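For reference, the pattern the refactor relies on is a module-scope import plus `asyncio.run` in the synchronous wrapper. The method names below are placeholders for illustration, not the PR's actual code:

```python
import asyncio  # must stay at module scope in extract_thinker/llm.py


class LLM:
    def request(self, messages):
        # asyncio.run creates, runs, and closes the event loop itself, which is
        # what replaces the old manual loop management.
        return asyncio.run(self._arequest(messages))

    async def _arequest(self, messages):
        # Placeholder for the real async litellm / router call.
        return messages
```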
```diff
+#TODO handbook is 403 error now. So we need to refactor the test to use a different url.
+
 # Check handbook data
-assert "FCA Handbook" in result.title, f"Expected title to contain 'FCA Handbook', but got: {result.title}"
+#assert "FCA Handbook" in result.title, f"Expected title to contain 'FCA Handbook', but got: {result.title}"
+assert result.title == "OpenAI"
```
Copilot AI (Jun 9, 2025)
[nitpick] Remove or resolve this TODO before merging to avoid lingering technical debt in tests.
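One way to resolve the TODO instead of leaving commented-out assertions in the suite is to skip the handbook case explicitly until a stable URL is chosen. The test name and reason string below are illustrative, not from the PR:

```python
import pytest


@pytest.mark.skip(reason="FCA Handbook URL currently returns 403; swap in a stable test URL")
def test_handbook_extraction():
    ...
```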
```python
if __name__ == "__main__":
    test_chart_with_content()
```
Copilot AI (Jun 9, 2025)
[nitpick] This `__main__` block and manual test call are unnecessary for pytest; consider removing it.
Suggested change:
```diff
-if __name__ == "__main__":
-    test_chart_with_content()
+# Removed the __main__ block and manual test call.
```
```diff
 # Arrange
 process, classifications = setup_process_and_classifications()
-process.load_splitter(ImageSplitter(get_big_model()))
+process.load_splitter(ImageSplitter("gemini/gemini-2.5-flash-preview-05-20"))
```
Copilot AI (Jun 9, 2025)
[nitpick] Rather than hard-coding the model string, use `get_big_model()` for consistency and easier future updates.
Suggested change:
```diff
-process.load_splitter(ImageSplitter("gemini/gemini-2.5-flash-preview-05-20"))
+process.load_splitter(ImageSplitter(get_big_model()))
```
```python
def _request_with_router(self, messages: List[Dict[str, str]], response_model: Optional[str]) -> Any:
    """Handle request using router with or without thinking parameter"""
    max_tokens = self.DEFAULT_OUTPUT_TOKENS
```
Copilot AI (Jun 9, 2025)
[nitpick] The logic to compute `max_tokens` is duplicated across several methods; consider extracting it into a helper to reduce duplication.
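A possible shape for such a helper, under the assumption that thinking mode adds `thinking_token_limit` on top of `DEFAULT_OUTPUT_TOKENS`; the helper name and the exact budgeting rule are assumptions, not the PR's implementation:

```python
def _resolve_max_tokens(self, thinking_enabled: bool) -> int:
    # One place to compute the output budget, so _request_with_router and the
    # other request paths stop duplicating this logic.
    max_tokens = self.DEFAULT_OUTPUT_TOKENS
    if thinking_enabled and self.thinking_token_limit:
        max_tokens += self.thinking_token_limit
    return max_tokens
```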