
Add @gr.cache() decorator for caching deterministic functions, as well as a lower-level gr.Cache that uses dependency injection #13176

Merged

abidlabs merged 72 commits into main from cachee on Apr 8, 2026

Conversation

@abidlabs
Member

@abidlabs abidlabs commented Apr 1, 2026

Adds @gr.cache() decorator for caching all of the kinds of functions we support in Gradio: regular functions, generators, streaming media, async functions, and async generators. Internally, we now use this same logic when we cache examples, to reduce the duplication a bit. Usage:

@gr.cache()
def generate(prompt):
    ...

See demo/cache_demo/run.py for usage.
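The PR only shows the decorator's surface; as a rough analogy, the core behavior for regular functions is argument-keyed memoization. The sketch below is an illustrative stand-in, not Gradio's implementation (which also handles generators, streaming media, and async functions), and `memoize`, `calls`, and `generate` here are hypothetical names:

```python
import functools

def memoize(fn):
    """Illustrative stand-in for @gr.cache(): cache return values keyed
    by the call's arguments. Gradio's real decorator also covers
    generators, streaming media, and async functions; this sketch
    only shows the core idea for a plain function."""
    store = {}

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        if key not in store:
            store[key] = fn(*args, **kwargs)  # only runs on a cache miss
        return store[key]
    return wrapper

calls = []

@memoize
def generate(prompt):
    calls.append(prompt)  # track how many times the body actually runs
    return f"output for {prompt}"

generate("hello")
generate("hello")  # second call is served from the cache; the body does not re-run
```

With this semantics, repeated calls with identical arguments skip the expensive body entirely, which is what makes the decorator suitable only for deterministic functions.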

I also added a UI indicator when caching is used. I'd love to get thoughts on this, but the idea is to use it as an opportunity to increase the visibility of caching, and also to be transparent with users when a result is fetched from the cache instead of generated from scratch. The indicator shows for a second (in the same place as the "minimal" status tracker, to avoid obscuring the output) and then fades out.

[screenshot of the cache indicator]

We also expose a lower-level cache API based on dependency injection:

def generate(prompt, max_new_tokens, c=gr.Cache()):
    c.set(key, value)
    c.get(key)
    ...

The reasons to use gr.Cache (instead of, say, a global dictionary) are that it supports concurrent usage, supports common eviction policies, and allows keys to be non-hashable data types commonly used by Gradio users. See demo/cache_kv_demo/run.py for usage. Also, if a developer wants to be "safer", they can make a cache per-session instead of global simply by passing gr.Cache(per_session=True).
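The internals of gr.Cache aren't shown in this description; the three advertised properties (thread-safe access, eviction, non-hashable keys) can be sketched in plain Python as below. `MiniCache` is a hypothetical illustration, and normalizing keys via JSON serialization is just one possible way to admit lists and dicts as keys:

```python
import json
import threading
from collections import OrderedDict

class MiniCache:
    """Illustrative sketch of the properties described for gr.Cache:
    thread-safe set/get, LRU eviction, and non-hashable keys
    (normalized here by JSON-serializing them). Not Gradio's
    actual implementation."""

    def __init__(self, max_items=128):
        self._data = OrderedDict()
        self._lock = threading.Lock()
        self._max_items = max_items

    def _normalize(self, key):
        # Lists and dicts aren't hashable; a stable serialization
        # turns them into usable dictionary keys.
        return json.dumps(key, sort_keys=True, default=str)

    def set(self, key, value):
        with self._lock:
            k = self._normalize(key)
            self._data[k] = value
            self._data.move_to_end(k)
            if len(self._data) > self._max_items:
                self._data.popitem(last=False)  # evict least recently used

    def get(self, key, default=None):
        with self._lock:
            k = self._normalize(key)
            if k in self._data:
                self._data.move_to_end(k)  # mark as recently used
                return self._data[k]
            return default

c = MiniCache(max_items=2)
c.set(["a", {"temp": 0.7}], "first")  # non-hashable list key works
c.set("b", "second")
c.set("c", "third")  # exceeds max_items, so the oldest entry is evicted
```

A per-session variant would simply key the whole store by a session identifier as well, which is presumably what gr.Cache(per_session=True) arranges under the hood.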

One other nice thing is that we can keep track of whether there is a cache hit, and, if there is, still show the cache message, which is just slightly different ("used cache"). Open to feedback on the UI!

[screenshot of the "used cache" message]

Also adds tests, 3 demos, and a guide to demonstrate usage.

@gradio-pr-bot
Collaborator

gradio-pr-bot commented Apr 1, 2026

🪼 branch checks and previews

| Name | Status | URL |
| --- | --- | --- |
| Spaces | ready! | Spaces preview |
| Website | building... | |
| Storybook | ready! | Storybook preview |
| 🦄 Changes | failed! | Workflow log |

Install Gradio from this PR

pip install https://huggingface.co/buckets/gradio/pypi-previews/resolve/a8606d6622f8f4c8cec3c8b752443f5e9c5b8358/gradio-6.11.0-py3-none-any.whl

Install Gradio Python Client from this PR

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@a8606d6622f8f4c8cec3c8b752443f5e9c5b8358#subdirectory=client/python"

Install Gradio JS Client from this PR

npm install https://gradio-npm-previews.s3.amazonaws.com/a8606d6622f8f4c8cec3c8b752443f5e9c5b8358/gradio-client-2.1.0.tgz

@gradio-pr-bot
Collaborator

gradio-pr-bot commented Apr 1, 2026

🦄 change detected

This Pull Request includes changes to the following packages.

| Package | Version |
| --- | --- |
| @gradio/client | minor |
| @gradio/markdown-code | minor |
| @gradio/statustracker | minor |
| gradio | minor |

  • Add @gr.cache() decorator for caching deterministic functions, as well as a lower-level gr.Cache that uses dependency injection

‼️ Changeset not approved. Ensure the version bump is appropriate for all packages before approving.

  • Maintainers can approve the changeset by checking this checkbox.

Something isn't right?

  • Maintainers can change the version label to modify the version bump.
  • If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

@abidlabs changed the title from "Add (partial) caching with support for storing intermdiate caching to Gradio" to "Add (partial) caching with support for storing intermediate caching state" on Apr 1, 2026
@abidlabs changed the title to "Add @gr.cache() decorator for caching deterministic functions, generators, etc." on Apr 2, 2026
@abidlabs changed the title to "Add @gr.cache() decorator for caching deterministic functions, as well as a lower-level cache API based on dependency injection" on Apr 2, 2026
abidlabs and others added 2 commits April 6, 2026 10:50
Co-authored-by: hysts <hysts@users.noreply.github.com>
@abidlabs
Member Author

abidlabs commented Apr 6, 2026

Thanks so much @hysts @freddyaboulton for the great review! Will investigate and fix these issues

@abidlabs
Member Author

abidlabs commented Apr 6, 2026

> One thing I noticed is that cache hit durations seem to be included in the average execution time calculation. So if the first (uncached) call takes 100s, after a few cache hits the displayed average drops to ~50s, ~33s, etc., making it hard to tell what the actual computation cost is. It might be better to exclude cache hits from the average so the "saved time" comparison stays meaningful.

Nice catch, should be fixed now @hysts!
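The fix described in the quoted feedback amounts to filtering cache hits out of the duration average. The actual Gradio code isn't shown here; this is a minimal sketch with hypothetical sample data, where each sample pairs a duration with a cache-hit flag:

```python
def average_duration(samples):
    """Average only the real (uncached) runs: including near-zero
    cache-hit durations would drag the displayed average toward zero,
    as described in the quoted feedback. Each sample is a
    (duration_seconds, was_cache_hit) pair."""
    real = [duration for duration, was_hit in samples if not was_hit]
    return sum(real) / len(real) if real else 0.0

# One real 100 s run followed by two near-instant cache hits:
# a naive mean over all three samples would report roughly 33 s,
# while excluding hits keeps the true computation cost visible.
samples = [(100.0, False), (0.01, True), (0.01, True)]
```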

@abidlabs
Member Author

abidlabs commented Apr 7, 2026

Everything should be fixed now @freddyaboulton @hysts if you can kindly give it another pass!

@abidlabs changed the title to "Add @gr.cache() decorator for caching deterministic functions, as well as a lower-level gr.Cache that uses dependency injection" on Apr 7, 2026
Collaborator

@hysts hysts left a comment


Thanks for the update! LGTM!

Collaborator

@freddyaboulton freddyaboulton left a comment


Thanks for making the changes @abidlabs! Left some comments; good to merge once those are addressed. I think right now the chatbot won't show whether there's been a cache hit, because we manually suppress the status tracker, but it would be good to show the cache hit, since chatbot streaming behaves quite differently depending on whether there is one. Not blocking, though.

@abidlabs
Member Author

abidlabs commented Apr 8, 2026

Thanks so much for the careful review @freddyaboulton and @hysts! Will merge this in once CI passes.

@abidlabs abidlabs merged commit 45c4ecd into main Apr 8, 2026
1 check passed
@abidlabs abidlabs deleted the cachee branch April 8, 2026 18:41