- Added trust code params for HF models
- Added an LRU cache to HF model param calls to avoid redundant calls
- Fixed Pydantic type issue in HF model return
- Support for Python 3.10-3.11
- Azure model support (completion and chat)
- Google Vertex API model support (completion and chat)
- Streaming responses for LM Completions (set stream=True)
- run with batches now behaves the same as the async run, except that it is not async. Requests are batched into appropriate batch sizes.
- Refactored client to unify preprocessing and postprocessing of requests and responses, to better support model variants in request/response format.
- Fixed bug in _run_chat where kwargs were not passed in
- Unified run and run_chat methods so it's just run now.
- LLaMA HF models for eval
- Added chat input for chat models.
- Connection pools to swap between clients
- Chunksize param for async runs
- Determine cache and response by request type, not client name
- Refactor Response to use Pydantic types for Request and Response
- Async support in arun_batch
- Batched runs now cache individual items
- Score prompt does not truncate outside token
- Deprecated chatGPT in favor of openaichat, which uses OpenAI completions
- Deprecated Sessions
- Batched inference support in manifest.run. No more separate manifest.run_batch method.
- Standard request base model for all language inputs.
- ChatGPT client. Requires CHATGPT_SESSION_KEY to be passed in.
- Diffusion model support
- Together model support
- Prompt class
- OPT client - OPT is now available in HuggingFace
First major pip release of Manifest. Install via pip install manifest-ml.
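
Several entries above mention splitting requests into batches (batched run, the chunksize param for async runs). As a rough illustration of that idea only, here is a minimal standalone sketch. It does not use Manifest's code or API; `chunk_prompts` and `batch_size` are hypothetical names chosen for this example.

```python
from typing import Iterator, List


def chunk_prompts(prompts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `prompts`, each at most `batch_size` long.

    Hypothetical helper for illustration; not part of Manifest.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(prompts), batch_size):
        yield prompts[start:start + batch_size]


# Five prompts split into batches of two; the last batch holds the remainder.
prompts = ["a", "b", "c", "d", "e"]
batches = list(chunk_prompts(prompts, batch_size=2))
```

Each batch could then be sent as one request, with results flattened back into the original order.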