Leverage the HF cache for models

### 🚀 The feature, motivation and pitch

torchchat currently downloads models through the HF Hub, which has its own model cache. torchchat then copies each model into its own model directory, so you end up with two copies of the same model.

We should leverage the HF Hub cache, but not force users to use that location if they're using their own models.
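
To make the proposal concrete, here is a minimal sketch of the lookup order it implies: use a user-supplied directory if one is given, otherwise reuse the snapshot already in the HF Hub cache instead of copying it. `resolve_model_dir` and its parameters are hypothetical names, not existing torchchat code. The sketch uses only the standard library and assumes the standard Hub cache layout (`models--<org>--<name>/snapshots/<commit>`) and the `HF_HUB_CACHE` / `HF_HOME` environment variables. In practice, calling `huggingface_hub.snapshot_download(repo_id)` and using the path it returns would likely be simpler.

```python
import os
from pathlib import Path
from typing import Optional


def hf_hub_cache_dir() -> Path:
    """Locate the HF Hub cache from its environment variables, else the default."""
    if os.environ.get("HF_HUB_CACHE"):
        return Path(os.environ["HF_HUB_CACHE"]).expanduser()
    hf_home = os.environ.get("HF_HOME")
    if not hf_home:
        xdg = os.environ.get("XDG_CACHE_HOME") or "~/.cache"
        hf_home = os.path.join(xdg, "huggingface")
    return Path(hf_home).expanduser() / "hub"


def resolve_model_dir(
    repo_id: str,
    user_dir: Optional[str] = None,  # hypothetical override for users' own models
    revision: str = "main",
) -> Optional[Path]:
    """Hypothetical helper: prefer the user's directory, else a cached HF snapshot."""
    if user_dir is not None:
        # Users with their own models are never forced into the HF cache.
        return Path(user_dir).expanduser()

    repo_dir = hf_hub_cache_dir() / ("models--" + repo_id.replace("/", "--"))
    ref_file = repo_dir / "refs" / revision
    if not ref_file.is_file():
        return None  # not cached yet; the caller would trigger a download

    # refs/<revision> holds the commit hash that names the snapshot folder.
    snapshot = repo_dir / "snapshots" / ref_file.read_text().strip()
    return snapshot if snapshot.is_dir() else None
```

With this shape, torchchat would read weights directly from the returned path, so a model downloaded once through the Hub is not stored a second time.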


### Alternatives

_No response_

### Additional context


[From r/LocalLLaMA](https://www.reddit.com/r/LocalLLaMA/comments/1eh6xmq/comment/lfzxuer/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button):
"One annoying thing is that it uses huggingface_hub for downloading but doesn't use the HF cache - it uses it's own .torchtune folder to store models so you just end up having double of full models (grr). Just use the defaul HF cache location."

### RFC (Optional)

_No response_

Leverage the HF cache for models #992