Problem
`set_model()` in `app/ai.py` uses `max_length=1024` when initializing the pipeline, while `__init__()` correctly uses `max_new_tokens=1024`. These two parameters behave differently:

- `max_length` limits the total number of tokens, including the input prompt
- `max_new_tokens` limits only the generated output tokens

This inconsistency means that after calling `/change-model`, responses may be significantly shorter than expected, because input tokens consume part of the `max_length` budget.
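To make the budget difference concrete, here is a small arithmetic sketch. The 600-token prompt is a hypothetical figure for illustration, not a measurement from the app:

```python
# Hypothetical prompt size, chosen only to illustrate the budget arithmetic.
prompt_tokens = 600

# max_length=1024 caps prompt + output combined, so the prompt
# eats into the generation budget.
max_length = 1024
output_budget_max_length = max(0, max_length - prompt_tokens)

# max_new_tokens=1024 caps only the generated output, regardless
# of how long the prompt is.
max_new_tokens = 1024
output_budget_max_new_tokens = max_new_tokens

print(output_budget_max_length)      # 424
print(output_budget_max_new_tokens)  # 1024
```

With a long enough prompt, `max_length=1024` can leave almost no room for the reply, which is exactly the truncation described above.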
This was missed in commit 659ff99, which updated `__init__` from `max_length` to `max_new_tokens` but did not update `set_model`.

Fix
Changed `max_length=1024` to `max_new_tokens=1024` in `set_model()` to match the behavior in `__init__()`.
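The change amounts to a one-parameter swap. Sketched as a diff against a hypothetical pipeline call (the surrounding arguments are illustrative, not copied from `app/ai.py`):

```diff
 # app/ai.py — set_model(); arguments other than the changed one are illustrative
-        self.pipeline = pipeline("text-generation", model=model_name, max_length=1024)
+        self.pipeline = pipeline("text-generation", model=model_name, max_new_tokens=1024)
```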
Testing
The fix is verified by code inspection: `set_model()` now matches the pipeline initialization parameters used in `__init__()`. Both now use `max_new_tokens=1024` consistently.
Note: This PR was developed with AI assistance (Claude). As per Sugar Labs contributing guidelines, I'm disclosing this. The AI helped me structure the fix, but I verified and tested each change myself.