
feat: Add experimental integration tests with cached providers #2176


Open — wants to merge 2 commits into base: main

Conversation

derekhiggins (Contributor)

What does this PR do?

Introduce a new workflow to run integration and agent tests
against the OpenAI provider, and a subset of inference tests
against Fireworks AI.

Responses from inference providers are cached and reused in
subsequent jobs, reducing API quota usage, speeding up test runs,
and improving reliability by avoiding server-side errors.
(see https://github.com/derekhiggins/cachemeifyoucan for caching code)

The cache is updated on successful job completion.
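The record-and-replay approach described above can be sketched roughly as follows. This is an illustrative standalone sketch, not the actual cachemeifyoucan code; the cache location, key derivation, and function names are all hypothetical.

```python
import hashlib
import json
import os

CACHE_DIR = "provider_cache"  # hypothetical on-disk cache location


def cache_key(request: dict) -> str:
    """Derive a stable key from the canonicalized request body."""
    canonical = json.dumps(request, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()


def cached_call(request: dict, send_to_provider) -> dict:
    """Replay a cached response if one exists; otherwise call the real
    provider (spending API quota) and record the response for later jobs."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, cache_key(request) + ".json")
    if os.path.exists(path):
        # Cache hit: the provider is never contacted.
        with open(path) as f:
            return json.load(f)
    response = send_to_provider(request)
    with open(path, "w") as f:
        json.dump(response, f)
    return response
```

With a warm cache, repeated test runs never reach the provider, which is what makes the replayed jobs both fast and immune to server-side errors.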

To prevent the cache from diverging or growing indefinitely,
periodically refresh it, or update it manually by running
this job without a cache (will do as a follow-on if this merges).

Any update to an integration test that changes the request sent to
the provider will consume provider quota and update the cache (if
the tests fail), so be careful not to repeatedly run failing PRs.
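The caveat above follows directly from keying the cache on the request contents: any change to the request, however small, produces a new key and therefore a cache miss. A hypothetical illustration (the key-derivation function here is an assumption, not the project's actual scheme):

```python
import hashlib
import json


def cache_key(request: dict) -> str:
    # Hypothetical key derivation: hash of the canonicalized request.
    return hashlib.sha256(json.dumps(request, sort_keys=True).encode()).hexdigest()


old = cache_key({"model": "gpt-4o", "prompt": "boiling point of water"})
new = cache_key({"model": "gpt-4o", "prompt": "boiling point of polyjuice"})
assert old != new  # the edited test misses the cache and spends real quota
```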

Related: #2017

@derekhiggins (Contributor, Author)

The experimental jobs failed as expected, since no cache or tokens were configured:
https://github.com/meta-llama/llama-stack/actions/runs/15042121708/job/42276084080?pr=2176

See https://github.com/derekhiggins/llama-stack/actions/runs/15042046690/job/42275834971
for an example of the jobs passing with the cache, without hitting the provider at all.

@derekhiggins derekhiggins force-pushed the agent-test branch 2 times, most recently from 4f1c7ec to c859469 Compare May 16, 2025 13:18
gpt-4o needs a hint to use get_boiling_point for fictional
liquids.

Signed-off-by: Derek Higgins <[email protected]>
Introduce a new workflow to run integration and agent tests
against the OpenAI provider, and a subset of inference tests
against Fireworks AI.

Responses from inference providers are cached and reused in
subsequent jobs, reducing API quota usage, speeding up test runs,
and improving reliability by avoiding server-side errors.
(see https://github.com/derekhiggins/cachemeifyoucan for caching code)

The cache is updated on successful job completion.

To prevent the cache from diverging or growing indefinitely,
periodically refresh it, or update it manually by running
this job without a cache (will do as a follow-on if this merges).

Any update to an integration test that changes the request sent to
the provider will consume provider quota and update the cache (if
the tests fail), so be careful not to repeatedly run failing PRs.

Signed-off-by: Derek Higgins <[email protected]>
Labels
CLA Signed This label is managed by the Meta Open Source bot.
2 participants