
feat: Add experimental integration tests with cached providers #2176


Open — wants to merge 2 commits into base: main

Conversation

derekhiggins (Contributor)

What does this PR do?

Introduce a new workflow to run integration and agent tests
against the OpenAI provider, and a subset of inference tests
against Fireworks AI.

Responses from inference providers are cached and reused in
subsequent jobs, reducing API quota usage, speeding up test runs,
and improving reliability by avoiding server-side errors.
(see https://github.com/derekhiggins/cachemeifyoucan for caching code)

The cache is updated on successful job completion.
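The record-and-replay approach described above can be sketched roughly as follows. This is an illustrative standalone sketch, not the actual cachemeifyoucan code; the cache location, key derivation, and function names are all hypothetical.

```python
import hashlib
import json
import os

CACHE_DIR = "provider_cache"  # hypothetical on-disk cache location


def cache_key(request: dict) -> str:
    """Derive a stable key from the canonicalized request body."""
    canonical = json.dumps(request, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()


def cached_call(request: dict, send_to_provider) -> dict:
    """Replay a cached response if one exists; otherwise call the real
    provider (spending API quota) and record the response for later jobs."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, cache_key(request) + ".json")
    if os.path.exists(path):
        # Cache hit: the provider is never contacted.
        with open(path) as f:
            return json.load(f)
    response = send_to_provider(request)
    with open(path, "w") as f:
        json.dump(response, f)
    return response
```

With a warm cache, repeated test runs never reach the provider, which is what makes the replayed jobs both fast and immune to server-side errors.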

To prevent the cache from diverging or growing indefinitely,
periodically refresh it, or update it manually by running
this job without a cache (will do as a follow-on if this merges).

Any update to an integration test that changes the request sent to
the provider will consume provider quota and update the cache (if
the tests fail), so be careful not to repeatedly run failing PRs.
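The caveat above follows directly from keying the cache on the request contents: any change to the request, however small, produces a new key and therefore a cache miss. A hypothetical illustration (the key-derivation function here is an assumption, not the project's actual scheme):

```python
import hashlib
import json


def cache_key(request: dict) -> str:
    # Hypothetical key derivation: hash of the canonicalized request.
    return hashlib.sha256(json.dumps(request, sort_keys=True).encode()).hexdigest()


old = cache_key({"model": "gpt-4o", "prompt": "boiling point of water"})
new = cache_key({"model": "gpt-4o", "prompt": "boiling point of polyjuice"})
assert old != new  # the edited test misses the cache and spends real quota
```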

Related: #2017

@derekhiggins (Contributor, Author)

The experimental jobs failed as expected, since no cache or tokens were configured:
https://github.com/meta-llama/llama-stack/actions/runs/15042121708/job/42276084080?pr=2176

See https://github.com/derekhiggins/llama-stack/actions/runs/15042046690/job/42275834971
for an example of the jobs passing with the cache, without hitting the provider at all.

@derekhiggins derekhiggins force-pushed the agent-test branch 2 times, most recently from 4f1c7ec to c859469 Compare May 16, 2025 13:18
gpt-4o needs a hint to use get_boiling_point for fictional
liquids.

Signed-off-by: Derek Higgins <[email protected]>
Introduce a new workflow to run integration and agent tests
against the OpenAI provider, and a subset of inference tests
against Fireworks AI.

Responses from inference providers are cached and reused in
subsequent jobs, reducing API quota usage, speeding up test runs,
and improving reliability by avoiding server-side errors.
(see https://github.com/derekhiggins/cachemeifyoucan for caching code)

The cache is updated on successful job completion.

To prevent the cache from diverging or growing indefinitely,
periodically refresh it, or update it manually by running
this job without a cache (will do as a follow-on if this merges).

Any update to an integration test that changes the request sent to
the provider will consume provider quota and update the cache (if
the tests fail), so be careful not to repeatedly run failing PRs.

Signed-off-by: Derek Higgins <[email protected]>
Labels
CLA Signed This label is managed by the Meta Open Source bot.
2 participants