In this module, we will learn how to evaluate and monitor our LLM and RAG system.
In the evaluation part, we assess the quality of our entire RAG system before it goes live.
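Offline retrieval evaluation is often done with metrics like Hit Rate and MRR over a ground-truth dataset of (query, relevant document) pairs. Here is a minimal sketch, assuming a hypothetical `search(query)` function that returns a ranked list of document ids — the actual search function and dataset come from your own RAG pipeline:

```python
# Minimal offline retrieval evaluation: Hit Rate and Mean Reciprocal Rank.
# `search` and `ground_truth` are assumptions standing in for your own system.

def hit_rate_and_mrr(ground_truth, search):
    """Compute Hit Rate and MRR over (query, relevant_doc_id) pairs."""
    hits, reciprocal_ranks = 0, 0.0
    for query, relevant_id in ground_truth:
        results = search(query)  # ranked list of document ids
        if relevant_id in results:
            hits += 1
            reciprocal_ranks += 1.0 / (results.index(relevant_id) + 1)
    n = len(ground_truth)
    return hits / n, reciprocal_ranks / n

# Toy usage with a stub search function:
ground_truth = [("q1", "doc-a"), ("q2", "doc-b")]
fake_search = lambda q: ["doc-a", "doc-c"] if q == "q1" else ["doc-c", "doc-b"]
hr, mrr = hit_rate_and_mrr(ground_truth, fake_search)
# hr == 1.0 (both relevant docs retrieved); mrr == 0.75 (ranks 1 and 2)
```

Running this over your full ground-truth set before deployment gives a baseline you can compare against when you change the retriever.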
In the monitoring part, we collect, store, and visualize metrics to assess the answer quality of a deployed LLM. We also collect the chat history and user feedback.
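Storing monitoring data can be as simple as logging each answered question together with a quality metric and optional user feedback into a database. A minimal sketch, assuming an illustrative schema (table and column names here are made up, not from the course code):

```python
# Sketch: log conversations with a quality metric and user feedback.
# Uses an in-memory SQLite database for illustration; in practice you'd
# point this at a persistent store (e.g. PostgreSQL) and plot from it.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE conversations (
        id INTEGER PRIMARY KEY,
        question TEXT,
        answer TEXT,
        relevance REAL,      -- e.g. similarity to a reference answer
        feedback INTEGER     -- +1 thumbs up, -1 thumbs down, NULL if none
    )
""")

def log_conversation(question, answer, relevance, feedback=None):
    conn.execute(
        "INSERT INTO conversations (question, answer, relevance, feedback)"
        " VALUES (?, ?, ?, ?)",
        (question, answer, relevance, feedback),
    )
    conn.commit()

log_conversation("What is RAG?", "Retrieval-Augmented Generation ...", 0.92, feedback=1)
avg_relevance, avg_feedback = conn.execute(
    "SELECT AVG(relevance), AVG(feedback) FROM conversations"
).fetchone()
```

Aggregates like average relevance and average feedback over time are what you would then visualize on a dashboard.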
- Phoenix (give it a star!)
- Pre-Built Evals
- Retrieval (RAG) Relevance
- Phoenix Community Slack
- Notes from 2024 edition
- Did you take notes? Add them above this line (Send a PR with links to your notes)