Unitxt 1.25.0 - Improved Error Messages
Main changes
- Error messages were simplified and improved. Each failure now produces a short stack trace, followed by the context in which the error occurred, a link to the help documentation, and then the detailed error message:
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 🦄 Unitxt Error Context │
│ -------------------------------------------------------------------------------------------------------------------- │
│ - Python: 3.10.17 │
│ - Unitxt: 1.25.0 │
│ - Stage: Metric Processing │
│ - Stream: all_data>> │
│ - Object: KeyValueExtraction (https://www.unitxt.ai/en/latest/unitxt.metrics.html#unitxt.metrics.KeyValueExtraction)│
│ - Help: https://www.unitxt.ai/en/latest/docs/adding_metric.html │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Each reference is expected to be of type 'Dict[str, str]' in metrics.key_value_extraction.accuracy metric. Received reference of type <class 'str'>: Austin
- Added Granite Thinking support, including an example.
- Added a flag in the format to determine whether to place the template instructions once in the system turn, or in the user turns (for each demo and for the final input). This is important because some models delete their default system prompt when they receive an external system prompt.
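The effect of this flag can be illustrated with a minimal sketch in plain Python. This is not the Unitxt API; `build_messages` and its parameters are hypothetical names used only to show the two placements of the template instruction:

```python
# Illustrative sketch (not the Unitxt API): shows placing the template
# instruction once in the system turn vs. prepending it to every user turn.
# `build_messages` and its parameters are hypothetical names.

def build_messages(instruction, demos, final_input, place_in_system=True):
    """Build a chat message list. With place_in_system=True the instruction
    appears once as a system turn; otherwise it is prepended to each user
    turn (every demo input and the final input) and no system turn is sent."""
    messages = []
    if place_in_system:
        messages.append({"role": "system", "content": instruction})
    for demo_input, demo_output in demos:
        user_text = demo_input if place_in_system else f"{instruction}\n{demo_input}"
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": demo_output})
    final_text = final_input if place_in_system else f"{instruction}\n{final_input}"
    messages.append({"role": "user", "content": final_text})
    return messages
```

With `place_in_system=False` no system turn is emitted at all, so a model that replaces its built-in system prompt whenever an external one is provided keeps its default behavior.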
- Added an option to get the generated text in the metadata when calling infer_log_prob(). Previously, only the separated tokens were returned. See the example code.
- Added support for multi-turn dialog metrics. See the tool-calling example.
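Before the generated text was added to the metadata, callers of infer_log_prob() had to reassemble it from the separated tokens themselves. A sketch of that reassembly, assuming a simple per-token record format (the actual Unitxt return schema may differ):

```python
# Hypothetical per-token records, roughly the shape a log-prob inference
# call returns: one entry per generated token. The exact Unitxt schema
# is an assumption here, not taken from the library.
tokens = [
    {"text": "The", "logprob": -0.12},
    {"text": " answer", "logprob": -0.48},
    {"text": " is", "logprob": -0.05},
    {"text": " 42", "logprob": -1.30},
]

# Manual reassembly of the full generated text from separated tokens --
# the step the new metadata field makes unnecessary:
generated_text = "".join(t["text"] for t in tokens)

# The sequence log-probability is the sum of per-token log-probs:
total_logprob = sum(t["logprob"] for t in tokens)

print(generated_text)  # -> The answer is 42
```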
What's Changed
- Add Multi Turn Metrics Support by @elronbandel in #1579
- add a test for faithfulness with an external client and fetch artifact by @matanor in #1824
- Fix rits model names and the judges that use them by @martinscooper in #1825
- Add option to store template instruction in user role and not system role and added granite thinking example by @yoavkatz in #1667
- Bluebench fixes by @bnayahu in #1828
- Fix huggingface auto model log probs evaluation by @elronbandel in #1829
- Add support for tool calling in HFAutoModelInferenceEngine by @elronbandel in #1827
- Changed artifact.to_yaml() to use standard dict to yaml API + added example to create a yaml representation of a data card by @yoavkatz in #1831
- Fix CLI issues by @bnayahu in #1832
- Allow changing default ollama api_base by @martinscooper in #1830
- fix bug when WML does not return any content or tool call by @yoavkatz in #1835
- Arena hard fix by @bnayahu in #1836
- Add full generated text when running infer_log_prob() with meta data enabled. by @yoavkatz in #1834
- Improved parsing of MT bench style scores by @yoavkatz in #1839
- Use os.path.join to create infer cache dir path by @martinscooper in #1840
- Add multi turn tool calling task and support multiple tools per call by @elronbandel in #1811
- Results summarization utility for the CLI by @bnayahu in #1842
- Improved error messages by @elronbandel in #1838
- Improve Text2SQL Metrics: Refactoring, New Execution Metric, and Bug Fixes by @oktie in #1841
- Update coverage exclusions by @elronbandel in #1843
- Use full artifact representation as the cache key for the dataset by @elronbandel in #1644
- Update version to 1.25.0 by @elronbandel in #1844
Full Changelog: 1.24.0...1.25.0