By default, every request has an upper bound of 1000 tokens. Because of this high cap, the AI model returns extra-long responses even for easy prompts, padding them with extra content or multiple answers that the user would not expect.
Solution:
Add three flags indicating the expected response length: short, medium, and long. Users can then pick whichever flag matches their expectation of the content length. A sketch of one possible implementation follows.
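A minimal sketch of the idea in Python, assuming an OpenAI-style request that accepts a max-token cap. The flag names, token budgets, and the `generate` helper are hypothetical illustrations, not part of any existing API:

```python
from enum import Enum

class ResponseLength(Enum):
    """The three proposed length flags."""
    SHORT = "short"
    MEDIUM = "medium"
    LONG = "long"

# Hypothetical token budgets per flag; the exact values would need tuning.
TOKEN_BUDGETS = {
    ResponseLength.SHORT: 150,
    ResponseLength.MEDIUM: 500,
    ResponseLength.LONG: 1000,  # the current default upper bound
}

def generate(prompt: str, length: ResponseLength = ResponseLength.MEDIUM) -> dict:
    """Build a request whose token cap matches the user's expected length."""
    return {
        "prompt": prompt,
        "max_tokens": TOKEN_BUDGETS[length],
    }

# Usage: a simple prompt gets a short budget instead of the 1000-token default.
request = generate("What is the capital of France?", ResponseLength.SHORT)
print(request)  # {'prompt': 'What is the capital of France?', 'max_tokens': 150}
```

Mapping each flag to a concrete `max_tokens` budget keeps the change small: the default behavior is preserved for users who pass no flag, while short prompts no longer get padded up toward the 1000-token ceiling.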