Standardize RAG evaluation pipelines

Right now, some of the evaluation pipelines are in Python scripts and some are in notebooks.
It would be great to use Ragas (https://docs.ragas.io/en/stable/getstarted/index.html) so that the experiments are defined consistently. Ideally, the evaluation code should be reusable across models.
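
As a starting point for discussion, here is a minimal sketch of what a model-agnostic experiment definition could look like. Everything in it is an assumption rather than existing project code: the `Example` and `Experiment` names, the `RagSystem` callable signature, and the row field names (`question`, `answer`, `contexts`, `ground_truth`) are illustrative and would need to be mapped to whatever schema the installed Ragas version expects.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: any RAG system is a callable that takes a question
# and returns (answer, retrieved_contexts). Each model gets a thin adapter.
RagSystem = Callable[[str], Tuple[str, List[str]]]


@dataclass(frozen=True)
class Example:
    """One evaluation item: a question and its reference answer."""
    question: str
    ground_truth: str


@dataclass
class Experiment:
    """A named, fixed set of examples that every model is run against."""
    name: str
    examples: List[Example]

    def run(self, system: RagSystem) -> List[Dict[str, object]]:
        """Run one system over all examples and collect uniform rows."""
        rows: List[Dict[str, object]] = []
        for ex in self.examples:
            answer, contexts = system(ex.question)
            rows.append(
                {
                    "question": ex.question,
                    "answer": answer,
                    "contexts": list(contexts),
                    "ground_truth": ex.ground_truth,
                }
            )
        return rows


def dummy_system(question: str) -> Tuple[str, List[str]]:
    """Stand-in for a real model adapter; returns a canned answer."""
    return "Paris", ["Paris is the capital of France."]


experiment = Experiment(
    name="smoke-test",
    examples=[Example("What is the capital of France?", "Paris")],
)
rows = experiment.run(dummy_system)
```

Because the experiment only depends on the callable signature, the same `Experiment` instance can be reused for each model, and the resulting rows can be handed to whichever scoring step we settle on.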

It looks like we may be able to reuse the Ragas answer correctness metric, or we can add our own metrics if needed: https://docs.ragas.io/en/stable/concepts/metrics/answer_correctness.html
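
To make the "add our own metrics" option concrete, here is a hedged sketch of a pluggable metric interface. It does not call Ragas: `token_f1` is a simple token-overlap F1 used as a stand-in, not the Ragas answer correctness metric, and the `Metric` type, `score_rows` helper, and row field names are hypothetical. The idea is that a Ragas-backed metric could sit behind the same interface later.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Mapping

# Hypothetical interface: a metric maps (answer, ground_truth) to a score in [0, 1].
Metric = Callable[[str, str], float]


def token_f1(answer: str, ground_truth: str) -> float:
    """Token-overlap F1 between answer and reference (case-insensitive)."""
    pred = answer.lower().split()
    ref = ground_truth.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def score_rows(
    rows: Iterable[Mapping[str, object]], metrics: Dict[str, Metric]
) -> Dict[str, float]:
    """Average each metric over all rows."""
    rows = list(rows)
    if not rows:
        raise ValueError("no rows to score")
    return {
        name: sum(fn(str(r["answer"]), str(r["ground_truth"])) for r in rows) / len(rows)
        for name, fn in metrics.items()
    }


sample_rows = [
    {"answer": "Paris", "ground_truth": "paris"},
    {"answer": "Berlin", "ground_truth": "Madrid"},
]
scores = score_rows(sample_rows, {"token_f1": token_f1})
```

Keeping metrics as plain callables means notebooks and scripts can share one registry of metrics instead of each defining its own scoring code.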

Standardize RAG evaluation pipelines #11
