+ "The objective of running evaluations is to measure your pipeline's performance and detect any regressions. To track progress, it is essential to establish baseline metrics using off-the-shelf approaches such as BM25 for keyword retrieval or \"sentence-transformers\" models for embeddings. Then, continue evaluating your pipeline with various parameters: adjust the `top_k` value, experiment with different embedding models, tweak the `temperature`, and benchmark the results to identify what works best for your use case. If labeled data is needed for evaluation, you can use datasets that include ground-truth documents and answers. Such datasets are available on [Hugging Face datasets](https://huggingface.co/datasets) or in the [haystack-evaluation](https://github.com/deepset-ai/haystack-evaluation/tree/main/datasets) repository.\n",