Cache pre-processing of large documents fully embedded into context

`mlx_lm` has a `cache_prompt` and `load_prompt` feature that makes it easier to work with long prompts. When LM Studio injects an entire document into context, pre-processing that document can take a long time, and the work is lost whenever the cache is invalidated, so the document has to be processed again from scratch. If users had the option to save the cache and load it back later, they could skip this pre-processing time on subsequent runs.
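
To make the request concrete, here is a minimal sketch of the save/load behaviour being asked for. It does not use the real `mlx_lm` or LM Studio APIs: `expensive_preprocess`, `PromptCacheStore`, and `get_or_build` are hypothetical names, and the "pre-processed state" is a stand-in list rather than an actual KV cache. The idea is only that the result is keyed by model and document content and persisted to disk, so it survives an in-memory cache invalidation.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def expensive_preprocess(document: str) -> list[int]:
    """Hypothetical stand-in for prompt pre-processing.

    A real backend would run the model over the document and build a KV cache.
    """
    return [ord(ch) for ch in document]


class PromptCacheStore:
    """Hypothetical on-disk store for pre-processed documents."""

    def __init__(self, directory: str) -> None:
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.misses = 0  # how many times pre-processing actually ran

    def _path_for(self, model_id: str, document: str) -> Path:
        # Key on both the model and the document text: a cache built for one
        # model (or one version of the document) is not valid for another.
        digest = hashlib.sha256()
        digest.update(model_id.encode("utf-8"))
        digest.update(b"\0")
        digest.update(document.encode("utf-8"))
        return self.directory / f"{digest.hexdigest()}.pkl"

    def get_or_build(self, model_id: str, document: str) -> list[int]:
        path = self._path_for(model_id, document)
        if path.exists():
            # Load the saved state instead of pre-processing again.
            with path.open("rb") as f:
                return pickle.load(f)
        # Cache miss: do the slow work once, then save it for next time.
        self.misses += 1
        state = expensive_preprocess(document)
        with path.open("wb") as f:
            pickle.dump(state, f)
        return state


if __name__ == "__main__":
    store = PromptCacheStore(tempfile.mkdtemp())
    doc = "a long document " * 1000
    store.get_or_build("example-model", doc)
    store.get_or_build("example-model", doc)  # served from disk
    print("pre-processing runs:", store.misses)
```

In this sketch the second call for the same model and document reads the saved file, so the slow step runs only once. A real implementation would presumably store the backend's own cache format rather than a pickle, and would also need to decide when saved caches are stale or should be evicted.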