Cache pre-processing of large documents fully embedded into context #66

Open
@neilmehta24

Description

mlx_lm has a cache_prompt and load_prompt feature that makes it easier to work with long prompts. When LM Studio injects an entire document into context, pre-processing that document can take a long time, and the work is lost whenever the cache is invalidated. If users had the option to save and load the cache, the document would not need to be pre-processed again after an invalidation.
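To make the request concrete, here is a minimal sketch of the save/load idea, assuming the pre-processed state can be serialized to disk. It does not use the mlx_lm or LM Studio APIs: `cache_key`, `load_or_preprocess`, and `fake_preprocess` are hypothetical names, and `fake_preprocess` only stands in for the expensive prompt-processing step.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cache_key(model_id: str, document: str) -> str:
    """Derive a stable key from the model identifier and the exact document text."""
    h = hashlib.sha256()
    h.update(model_id.encode("utf-8"))
    h.update(b"\0")  # separator so ("ab", "c") and ("a", "bc") hash differently
    h.update(document.encode("utf-8"))
    return h.hexdigest()


def load_or_preprocess(model_id, document, cache_dir, preprocess):
    """Return (state, was_cached), running `preprocess` only on a cache miss."""
    path = Path(cache_dir) / f"{cache_key(model_id, document)}.pkl"
    if path.exists():
        # A saved cache exists for this model and document: skip pre-processing.
        with path.open("rb") as f:
            return pickle.load(f), True
    state = preprocess(document)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(state, f)
    return state, False


def fake_preprocess(document):
    # Hypothetical placeholder for the slow prompt-processing step.
    return [len(word) for word in document.split()]


if __name__ == "__main__":
    cache_dir = tempfile.mkdtemp()
    doc = "a large document embedded into context"
    _, cached = load_or_preprocess("example-model", doc, cache_dir, fake_preprocess)
    print("first load cached:", cached)
    _, cached = load_or_preprocess("example-model", doc, cache_dir, fake_preprocess)
    print("second load cached:", cached)
```

Keying on both the model and the document text is a design assumption here: a saved cache is only reusable when both are unchanged, so editing the document or switching models falls back to a fresh pre-processing pass.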

Labels: enhancement (New feature or request)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions