AI Evaluation Guide

Running Evaluations

The project uses Evalite for testing AI behavior. Eval files are located in apps/nextjs/evals and follow this pattern:

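import { evalite } from "evalite";
import { Factuality } from "autoevals";

evalite("Test description", {
  // Test data generator: each entry pairs an input with its expected output
  data: async () => [
    {
      input: "test input",
      expected: "expected output",
    },
  ],

  // The task to evaluate: called once per input, returns the model's output
  task: async (input) => {
    // generateResponse is a hypothetical placeholder for the project's AI model call
    return generateResponse(input);
  },

  // Scoring methods from autoevals
  scorers: [Factuality],
});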
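
To make the roles of data, task, and scorers concrete, here is a minimal sketch of the flow those three fields describe. It is not Evalite's implementation and does not import Evalite or autoevals: `runEval`, `exactMatch`, `EvalCase`, `Scorer`, `stubTask`, and `sampleData` are all hypothetical names introduced for illustration, and the task is a stub standing in for a real model call.

```typescript
// Conceptual sketch only: mimics the data -> task -> scorers flow.
// None of these names come from Evalite or autoevals.
type EvalCase = { input: string; expected: string };
type Scorer = (args: { input: string; output: string; expected: string }) => number;

// Hypothetical scorer: 1 when the output equals the expected text exactly, else 0.
const exactMatch: Scorer = ({ output, expected }) => (output === expected ? 1 : 0);

// Runs the task on every case and returns the mean scorer value per case.
async function runEval(
  data: () => Promise<EvalCase[]>,
  task: (input: string) => Promise<string>,
  scorers: Scorer[],
): Promise<number[]> {
  const averages: number[] = [];
  for (const { input, expected } of await data()) {
    const output = await task(input);
    const scores = scorers.map((score) => score({ input, output, expected }));
    averages.push(scores.reduce((a, b) => a + b, 0) / scores.length);
  }
  return averages;
}

// Stub task standing in for a model call: upper-cases the input.
const stubTask = async (input: string): Promise<string> => input.toUpperCase();

// Two sample cases: the first should pass exactMatch, the second should fail.
const sampleData = async (): Promise<EvalCase[]> => [
  { input: "hello", expected: "HELLO" },
  { input: "world", expected: "World" },
];
```

Calling `runEval(sampleData, stubTask, [exactMatch])` resolves to one averaged score per case. A real eval replaces the stub with the model call and typically uses a graded scorer such as Factuality rather than exact matching, since model output rarely matches the expected text character for character.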

To run eval tests:

npm run eval:dev