Unitxt 1.25.0 - Improved Error Messages
Main changes
- Error messages were simplified and improved. Each failure now produces a short stack trace, followed by the context in which the error occurred, a link to the help documentation, and then the detailed error message:
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 🦄 Unitxt Error Context │
│ -------------------------------------------------------------------------------------------------------------------- │
│ - Python: 3.10.17 │
│ - Unitxt: 1.25.0 │
│ - Stage: Metric Processing │
│ - Stream: all_data>> │
│ - Object: KeyValueExtraction (https://www.unitxt.ai/en/latest/unitxt.metrics.html#unitxt.metrics.KeyValueExtraction)│
│ - Help: https://www.unitxt.ai/en/latest/docs/adding_metric.html │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Each reference is expected to be of type 'Dict[str, str]' in metrics.key_value_extraction.accuracy metric. Received reference of type <class 'str'>: Austin
- Added Granite Thinking support, including an example.
- Added a flag in the format to determine whether to place the template instructions once in the system turn, or in the user turns (for each demo and for the final input). This is important because some models delete their default system prompt when they receive an external system prompt.
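The effect of this flag can be illustrated with a minimal sketch in plain Python. This is not the Unitxt API; `build_messages` and its parameters are hypothetical names used only to show the two placements of the template instruction:

```python
# Illustrative sketch (not the Unitxt API): shows placing the template
# instruction once in the system turn vs. prepending it to every user turn.
# `build_messages` and its parameters are hypothetical names.

def build_messages(instruction, demos, final_input, place_in_system=True):
    """Build a chat message list. With place_in_system=True the instruction
    appears once as a system turn; otherwise it is prepended to each user
    turn (every demo input and the final input) and no system turn is sent."""
    messages = []
    if place_in_system:
        messages.append({"role": "system", "content": instruction})
    for demo_input, demo_output in demos:
        user_text = demo_input if place_in_system else f"{instruction}\n{demo_input}"
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": demo_output})
    final_text = final_input if place_in_system else f"{instruction}\n{final_input}"
    messages.append({"role": "user", "content": final_text})
    return messages
```

With `place_in_system=False` no system turn is emitted at all, so a model that replaces its built-in system prompt whenever an external one is provided keeps its default behavior.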
- Added an option to get the generated text in the metadata when calling infer_log_prob(). Previously, only the separated tokens were returned. See the example code.
- Added support for multi-turn dialog metrics. See the tool-calling example.
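Before the generated text was added to the metadata, callers of infer_log_prob() had to reassemble it from the separated tokens themselves. A sketch of that reassembly, assuming a simple per-token record format (the actual Unitxt return schema may differ):

```python
# Hypothetical per-token records, roughly the shape a log-prob inference
# call returns: one entry per generated token. The exact Unitxt schema
# is an assumption here, not taken from the library.
tokens = [
    {"text": "The", "logprob": -0.12},
    {"text": " answer", "logprob": -0.48},
    {"text": " is", "logprob": -0.05},
    {"text": " 42", "logprob": -1.30},
]

# Manual reassembly of the full generated text from separated tokens --
# the step the new metadata field makes unnecessary:
generated_text = "".join(t["text"] for t in tokens)

# The sequence log-probability is the sum of per-token log-probs:
total_logprob = sum(t["logprob"] for t in tokens)

print(generated_text)  # -> The answer is 42
```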
What's Changed
- Add Multi Turn Metrics Support by @elronbandel in #1579
- add a test for faithfulness with an external client and fetch artifact by @matanor in #1824
- Fix rits model names and the judges that use them by @martinscooper in #1825
- Add option to store template instruction in user role and not system role and added granite thinking example by @yoavkatz in #1667
- Bluebench fixes by @bnayahu in #1828
- Fix huggingface auto model log probs evaluation by @elronbandel in #1829
- Add support for tool calling in HFAutoModelInferenceEngine by @elronbandel in #1827
- Changed artifact.to_yaml() to use standard dict to yaml API + added example to create a yaml representation of a data card by @yoavkatz in #1831
- Fix CLI issues by @bnayahu in #1832
- Allow changing default ollama api_base by @martinscooper in #1830
- fix bug when WML does not return any content or tool call by @yoavkatz in #1835
- Arena hard fix by @bnayahu in #1836
- Add full generated text when running infer_log_prob() with meta data enabled. by @yoavkatz in #1834
- Improved parsing of MT bench style scores by @yoavkatz in #1839
- Use os.path.join to create infer cache dir path by @martinscooper in #1840
- Add multi turn tool calling task and support multiple tools per call by @elronbandel in #1811
- Results summarization utility for the CLI by @bnayahu in #1842
- Improved error messages by @elronbandel in #1838
- Improve Text2SQL Metrics: Refactoring, New Execution Metric, and Bug Fixes by @oktie in #1841
- Update coverage exclusions by @elronbandel in #1843
- Use full artifact representation as the cache key for the dataset by @elronbandel in #1644
- Update version to 1.25.0 by @elronbandel in #1844
Full Changelog: 1.24.0...1.25.0