fix: use max_new_tokens instead of max_length in set_model#54

Open
AjayBandiwaddar wants to merge 1 commit into sugarlabs:main from AjayBandiwaddar:fix/set-model-max-new-tokens

Conversation

@AjayBandiwaddar

Problem

set_model() in app/ai.py uses max_length=1024 when
initializing the pipeline, while __init__() correctly uses
max_new_tokens=1024.

These two parameters behave differently:

  • max_length limits the total tokens including the input prompt
  • max_new_tokens limits only the generated output tokens

This inconsistency means that after calling /change-model,
responses may be significantly shorter than expected because
input tokens consume part of the max_length budget.
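To make the difference concrete, here is a small illustrative sketch (not the project's code) of how the two parameters allocate the same 1024-token value:

```python
# Illustration only: why max_length and max_new_tokens give
# different output budgets for the same numeric limit.
LIMIT = 1024

def output_budget_max_length(prompt_tokens: int, limit: int = LIMIT) -> int:
    # max_length counts the prompt too, so the prompt eats into the budget
    return max(limit - prompt_tokens, 0)

def output_budget_max_new_tokens(prompt_tokens: int, limit: int = LIMIT) -> int:
    # max_new_tokens counts only generated tokens; the prompt is free
    return limit

# A 900-token prompt leaves only 124 output tokens under max_length,
# but the full 1024 under max_new_tokens.
print(output_budget_max_length(900))      # → 124
print(output_budget_max_new_tokens(900))  # → 1024
```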

This was missed in commit 659ff99, which updated __init__()
from max_length to max_new_tokens but did not update
set_model().

Fix

Changed max_length=1024 to max_new_tokens=1024 in
set_model() to match the behavior in __init__().
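A minimal sketch of the one-line change, using a stand-in for the transformers pipeline (the real set_model() in app/ai.py wraps the actual library call; names and shape here are illustrative, not copied from the repo):

```python
# Stand-in for transformers.pipeline: just records the generation
# kwargs that would be forwarded to the model.
def pipeline(task, **generation_kwargs):
    return generation_kwargs

# Before the fix: limit includes the prompt tokens
before = pipeline("text-generation", max_length=1024)

# After the fix: limit applies to generated tokens only, matching __init__()
after = pipeline("text-generation", max_new_tokens=1024)

print(after)  # {'max_new_tokens': 1024}
```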

Testing

The fix is verified by code inspection — set_model() now
matches the pipeline initialization parameters used in
__init__(). Both now use max_new_tokens=1024 consistently.

@AjayBandiwaddar
Author

Note: This PR was developed with AI assistance (Claude), which I'm disclosing per the Sugar Labs contributing guidelines. The AI helped me structure the fix, but I verified and tested each change myself.

