Whether you're building AI agents, RAG pipelines, or chatbots, and whether they're implemented with LangChain or OpenAI, DeepEval has you covered. With it, you can easily determine the optimal models, prompts, and architecture to improve your AI quality, prevent prompt drift, or even migrate from OpenAI to Claude with confidence.
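At its core, any evaluation boils down to the same loop: run your app on a set of test inputs, score each output against a metric, and aggregate the results. A minimal plain-Python sketch of that idea (illustrative only, with a stubbed app and an exact-match metric; this is not DeepEval's API):

```python
# Illustrative eval loop, NOT DeepEval's API: run a stubbed app over a
# dataset, score each output with a metric, and compute a pass rate.

def my_llm_app(prompt: str) -> str:
    """Stand-in for your chatbot / RAG pipeline / agent."""
    canned = {"What is 2+2?": "4", "Capital of France?": "Paris"}
    return canned.get(prompt, "I don't know")

def exact_match(actual: str, expected: str) -> float:
    """Simplest possible metric: 1.0 on an exact match, else 0.0."""
    return 1.0 if actual.strip() == expected.strip() else 0.0

dataset = [
    ("What is 2+2?", "4"),
    ("Capital of France?", "Paris"),
    ("Capital of Spain?", "Madrid"),  # the stub app fails this one
]

scores = [exact_match(my_llm_app(q), expected) for q, expected in dataset]
pass_rate = sum(scores) / len(scores)
print(f"pass rate: {pass_rate:.0%}")  # 2 of 3 cases pass
```

DeepEval replaces the stub metric with research-backed, LLM-judged metrics and handles the running, reporting, and regression tracking for you.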
> [!IMPORTANT]
> Need a place for your DeepEval testing data to live 🏡❤️? [Sign up to the DeepEval platform](https://www.confident-ai.com?utm_source=deepeval&utm_medium=github&utm_content=signup_callout) to compare iterations of your LLM app, generate & share testing reports, and more.
>
> 
## ☁️ Platform + Ecosystem
[Confident AI](https://www.confident-ai.com?utm_source=deepeval&utm_medium=github&utm_content=platform_section) is an all-in-one platform that integrates natively with DeepEval.
- Manage datasets, trace LLM applications, run evaluations, and monitor responses in production — all from one platform.
- Don't need a UI? Confident AI can also be your data persistence layer: run evals, pull datasets, and inspect traces straight from Claude Code or Cursor via Confident AI's [MCP server](https://github.com/confident-ai/confident-mcp-server).
Open `test_chatbot.py` and write your first test case to run an **end-to-end** evaluation:
```python
import pytest
from deepeval import assert_test
from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCase, SingleTurnParams

def test_case():
    correctness_metric = GEval(
        name="Correctness",
        criteria="Determine if the 'actual output' is correct based on the 'expected output'.",
```
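Under the hood, `GEval` asks an LLM judge to score the output against your stated criteria. As a rough illustration of that idea only, here is the same "correctness vs. expected output" check with a stubbed judge in place of the LLM (this is not DeepEval's actual implementation; `stub_judge` is hypothetical):

```python
import string

# Illustrative sketch of a criteria-based correctness check: a stand-in
# "judge" scores the actual output against the expected output. A real
# GEval metric would delegate this scoring to an LLM instead.

def _words(text: str) -> set[str]:
    """Lowercase word set with surrounding punctuation stripped."""
    return {w.strip(string.punctuation) for w in text.lower().split()}

def stub_judge(actual: str, expected: str) -> float:
    """Stand-in judge: fraction of expected words present in the actual output."""
    expected_words = _words(expected)
    if not expected_words:
        return 1.0
    return len(expected_words & _words(actual)) / len(expected_words)

def correctness(actual: str, expected: str, threshold: float = 0.5) -> bool:
    """Pass/fail decision, mirroring how metrics compare a score to a threshold."""
    return stub_judge(actual, expected) >= threshold
```

The key design point carried over from real metrics: the judge produces a continuous score, and the metric converts it to pass/fail via a threshold, which is what `assert_test` ultimately acts on.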
[Confident AI](https://www.confident-ai.com?utm_source=deepeval&utm_medium=github&utm_content=cli_login_section) is an all-in-one platform to manage datasets, trace LLM applications, and run evaluations in production. Log in from the CLI to get started:
```bash
deepeval login
```
Prefer to stay in your IDE? Use DeepEval via [Confident AI's MCP server](https://github.com/confident-ai/confident-mcp-server).
<img src="assets/confident-mcp-architecture.png" alt="Confident AI MCP Architecture" width="500">
</p>
Everything on Confident AI is available [here](https://www.confident-ai.com/docs?utm_source=deepeval&utm_medium=github&utm_content=cloud_docs).