The GEval API evaluates model-generated outputs using the G-Eval framework. It assesses how well a model's output matches expected results based on custom criteria, evaluation steps, and other optional parameters.
G-Eval paper: https://arxiv.org/abs/2303.16634
G-Eval Python Implementation: https://github.com/confident-ai/deepeval
`POST /api/geval`
This endpoint evaluates a test case based on an input, the actual output, and optional criteria, evaluation steps, expected output, context, and retrieval context. It uses OpenAI models to calculate a score for the evaluation and to generate an explanation for that score.
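Under the hood, the endpoint's scoring presumably wraps the deepeval implementation linked above; the exact wiring inside this server is an assumption. A minimal sketch of that underlying usage, assuming deepeval is installed and an OpenAI API key is configured:

```python
from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCase, LLMTestCaseParams

# G-Eval metric built from a name, criteria, and the test-case fields to judge.
metric = GEval(
    name="order_relevance",
    criteria="Check if the course has the correct order for the intended audience",
    evaluation_params=[LLMTestCaseParams.INPUT, LLMTestCaseParams.ACTUAL_OUTPUT],
)

test_case = LLMTestCase(
    input="Python course roadmap for beginners first module",
    actual_output="- module 1: python basics",
)

metric.measure(test_case)  # calls an OpenAI model to score the test case
print(metric.score)        # float between 0 and 1
print(metric.reason)       # explanation for the assigned score
```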
The API expects a JSON payload with the following fields (a complete request sketch follows the list):

- name: (String) The name of the evaluation or test case. Example: "order_relevance".
- input: (String) The input or question given to the model. Example: "Python course roadmap for beginners first module".
- actualOutput: (String) The actual output generated by the model. Example: "- module 1: python basics".
- criteria: (String) (Optional) The criteria against which the output is evaluated. Example: "check if the course has the correct order for the intended audience".
- expectedOutput: (String) (Optional) The expected output for comparison. Example: "module 1: python basics".
- evaluationSteps: (Array of Strings) (Optional) A list of step-by-step evaluation criteria. Example: ["Verify the order of modules", "Check if topics follow a logical progression for beginners"].
- context: (String) (Optional) Additional context or background information. Example: "Python is a fundamental programming language, and a roadmap for beginners should start with basics."
- retrievalContext: (String) (Optional) Additional information from a retrieval context. Example: "Python is often taught with a clear progression from basics to advanced topics."
Example request with criteria and evaluation steps:

```bash
curl -X POST http://localhost:3001/api/geval \
  -H "Content-Type: application/json" \
  -d '{
    "name": "order_relevance",
    "input": "Python course roadmap for beginners first module",
    "actualOutput": "- module 1: python basics",
    "criteria": "check if the course has the correct order for the intended audience",
    "evaluationSteps": [
      "Ensure the order of topics follows a logical progression",
      "Check if the content is appropriate for beginners"
    ]
  }'
```

Example request with an expected output and context:

```bash
curl -X POST http://localhost:3001/api/geval \
  -H "Content-Type: application/json" \
  -d '{
    "name": "output_accuracy",
    "input": "What is the capital of France?",
    "actualOutput": "The capital of France is Paris.",
    "expectedOutput": "Paris",
    "evaluationSteps": [
      "Verify if the actual output matches the expected output",
      "Check if the response provides accurate information"
    ],
    "context": "Paris is the capital of France."
  }'
```

The API responds with a JSON object containing the evaluation results. The fields include:
- score: (Number) The evaluation score, ranging from 0 to 1.
- reason: (String) A concise explanation for the assigned score.
Example response:

```json
{
  "score": 0.9,
  "reason": "The output is relevant and follows the expected order for beginners."
}
```

Request parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| name | String | Yes | Name of the evaluation or test case. Example: "order_relevance". |
| input | String | Yes | The input given to the model. Example: "Python course roadmap for beginners first module". |
| actualOutput | String | Yes | The actual output generated by the model. Example: "- module 1: python basics". |
| criteria | String | No | Specific criteria for evaluation. Example: "check if the course has correct order for beginners". |
| expectedOutput | String | No | The expected output for comparison. Example: "module 1: python basics". |
| evaluationSteps | Array of Strings | No | Step-by-step criteria for evaluation. Example: ["Verify the order of modules", "Check progression"]. |
| context | String | No | Additional background information for evaluation. Example: "Python is a fundamental language". |
| retrievalContext | String | No | Retrieval context information. Example: "Python is often taught progressively from basics". |

Response fields:

| Field | Type | Description |
|---|---|---|
| score | Number | The final evaluation score, between 0 and 1. |
| reason | String | A concise explanation of the evaluation score and key observations. |
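
In an automated test suite, a typical pattern is to gate on `score` and log `reason`. A minimal sketch follows; the helper name and the 0.5 threshold are illustrative choices, not part of the API:

```python
import requests

def assert_geval_passes(payload: dict, threshold: float = 0.5) -> None:
    """Post a test case to the GEval endpoint and fail if the score is too low.

    The 0.5 threshold is an arbitrary example, not defined by the API.
    """
    response = requests.post("http://localhost:3001/api/geval", json=payload)
    response.raise_for_status()
    result = response.json()
    if result["score"] < threshold:
        raise AssertionError(
            f"G-Eval score {result['score']:.2f} below {threshold}: {result['reason']}"
        )
```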
With this API, you can automate the evaluation of model-generated outputs against specific criteria and custom evaluation steps.
Founder of Pype, [email protected]
Licensed under the Apache License, Version 2.0; you may not use this file except in compliance with the License. You may obtain a copy of the License in the LICENSE file. Portions of this project are derived from [Deepeval](https://github.com/confident-ai/deepeval/), licensed under the Apache License, Version 2.0.