Skip to content

feat: Use structured outputs for more control over response #5195

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: mealie-next
Choose a base branch
from

Conversation

dbz10
Copy link

@dbz10 dbz10 commented Mar 9, 2025

What this PR does / why we need it:

Hi,

This pull request leverages structured outputs to provide more granular control + guarantees over the responses from LLM providers. In a few of the usages in this project we expect a specific json object back. While mealie is already providing the expected json schema in the prompt this PR goes one step further and enforces that this schema is (allegedly) guaranteed to be respected in the response.

In theory this makes LLM calling more reliable and possibly makes prompting a bit more ergonomic in the future as well as pleading with LLMs to please for the love of all that is holy return json in the desired format or else my whole family will die should be less necessary.

Which issue(s) this PR fixes:

No specific related issue that I saw.

Special notes for your reviewer:

I tried to join the discord server but the invite was invalid. Happy to hear if I should have communicated prior to submitting this PR somewhere.

Testing

I ran unit test using task py:check in the devcontainer and got

1230 passed, 18 skipped, 153 warnings in 3069.26s (0:51:09)

I don't believe the tests actually executed any calls to OpenAI, especially since I didn't provide my API key, so I'm happy to take feedback or suggestions on additional testing I can do to validate this PR. I'm just getting started with mealie and don't have an extensive database of recipes nor URLs, etc, to test it on so would appreciate any suggestions how to go about that.

@michael-genson
Copy link
Collaborator

michael-genson commented Mar 13, 2025

Looks good! I will try to find some time this weekend to toy around with it and make sure it works as expected, but based on my understanding of how structured outputs works this should work great.

I don't believe the tests actually executed any calls to OpenAI

This is correct, we mock the call to OpenAI (since otherwise, as you said, you'd need to provide a key). I'll test manually with my key when I get the time (and welcome others to do the same!).

@michael-genson
Copy link
Collaborator

I tried to join the discord server but the invite was invalid

Where was the invalid invite? We should get that updated. Here's a new one: https://discord.gg/qA9zCWB5ay

@michael-genson michael-genson self-assigned this Mar 13, 2025
@dbz10
Copy link
Author

dbz10 commented Mar 16, 2025

I tried to join the discord server but the invite was invalid

Where was the invalid invite? We should get that updated. Here's a new one: https://discord.gg/qA9zCWB5ay

Ok sorry in retrospect I think this was my own issue - web browser version of Discord had gotten logged out and all it said was 'whoops unable to accept invite'. after logging in I was able to join via the link.

@dbz10
Copy link
Author

dbz10 commented Mar 16, 2025

Looks good! I will try to find some time this weekend to toy around with it and make sure it works as expected, but based on my understanding of how structured outputs works this should work great.

I don't believe the tests actually executed any calls to OpenAI

This is correct, we mock the call to OpenAI (since otherwise, as you said, you'd need to provide a key). I'll test manually with my key when I get the time (and welcome others to do the same!).

Yeah I think I can put in a bit more effort here and test it myself as well since I realized it's not a pre-requisite to already have user data in mealie to do so 😅

Just to confirm my understanding of where openai functionality is currently used so that I know what functionality to test out -

  • scrape a recipe from pointing at a url (scraping + parsing
  • generate a recipe from an image

is there anything else I should test? I see there's a prompt for parsing ingredients but it wasnt super obvious if this is called as part of the previous two, or if there's an independent path for calling that

@dbz10
Copy link
Author

dbz10 commented Mar 16, 2025

Ok I need to do some troubleshooting it seems. Sorry for the premature PR. Will work on it a bit
mealie | TypeError: You tried to pass a `BaseModel` class to `chat.completions.create()`; You must use `beta.chat.completions.parse()` instead

@dbz10 dbz10 marked this pull request as draft March 16, 2025 14:14
@dbz10
Copy link
Author

dbz10 commented Mar 16, 2025

Ok first of all I apologize for the half baked initial PR.
After digging into it a bit more I found that the structured output format is not compatible with optional arguments (see 'all fields must be required' here) meaning that if we want to use this, we need to remove some of the default values in the OpenAI{x} pydantic models.

That being said, with some additional changes + using the beta chat.completions.parse API, I was able to run recipe generation from a screenshot of a recipe.

@michael-genson lmk what you think, whether you think it's still worth it to try using structured outputs given that the blast radius of the changes is a bit larger than what I originally expected.

Screenshot 2025-03-16 at 9 56 06 AM

@dbz10 dbz10 marked this pull request as ready for review March 16, 2025 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants