
Commit 4898970

Extend the 5th section with baseline info (#335)
1 parent dcb384e commit 4898970

File tree: 1 file changed, +3 -3 lines changed


tutorials/guide_evaluation.ipynb

+3 -3
@@ -73,11 +73,11 @@
     "\n",
     "## 5. Running Evaluation\n",
     "\n",
-    "Evaluate your pipeline with different parameters, change the `top_k` value, and try a different embedding model, play with the `temperature` to find what works best for your use case. If you need labeled data for evaluation, you can use some datasets that come with ground-truth documents and ground-truth answers. You can find some datasets on [Hugging Face datasets](https://huggingface.co/datasets) or in the [haystack-evaluation](https://github.com/deepset-ai/haystack-evaluation/tree/main/datasets) repository. \n",
+    "The objective of running evaluations is to measure your pipeline's performance and detect any regressions. To track progress, it is essential to establish baseline metrics using off-the-shelf approaches such as BM25 for keyword retrieval or \"sentence-transformers\" models for embeddings. Then, continue evaluating your pipeline with various parameters: adjust the `top_k` value, experiment with different embedding models, tweak the `temperature`, and benchmark the results to identify what works best for your use case. If labeled data is needed for evaluation, you can use datasets that include ground-truth documents and answers. Such datasets are available on [Hugging Face datasets](https://huggingface.co/datasets) or in the [haystack-evaluation](https://github.com/deepset-ai/haystack-evaluation/tree/main/datasets) repository.\n",
     "\n",
-    "Make sure to set up your evaluation environment so that it’s easy to evaluate using different parameters without much hassle. The [haystack-evaluation](https://github.com/deepset-ai/haystack-evaluation) repository provides examples with different architectures against various datasets. \n",
+    "Ensure your evaluation environment is set up to facilitate easy testing with different parameters. The [haystack-evaluation](https://github.com/deepset-ai/haystack-evaluation) repository provides examples with various architectures against different datasets.\n",
     "\n",
-    "Read more about how you can optimize your pipeline by trying different parameter combinations in 📚 [Article: Benchmarking Haystack Pipelines for Optimal Performance](https://haystack.deepset.ai/blog/benchmarking-haystack-pipelines)\n",
+    "For more information on optimizing your pipeline by experimenting with different parameter combinations, refer to 📚 [Article: Benchmarking Haystack Pipelines for Optimal Performance](https://haystack.deepset.ai/blog/benchmarking-haystack-pipelines).\n",
     "\n",
     "## 6. Analyzing Results\n",
     "\n",
