Update for NeMo Evaluator 25.03 #369

Merged
shivamerla merged 1 commit into NVIDIA:main from shengnuo:nmp-25-03-evaluator
Mar 20, 2025

@shengnuo (Collaborator) commented Mar 14, 2025

  • add evaluation images at .spec.evaluationImages to NeMo Evaluator
  • add NeMo entity store endpoint at .spec.entitystore to NeMo Evaluator
  • rename DATA_STORE_HOST environment variable to DATA_STORE_URL
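
Because of the rename in the last bullet, anything that still reads `DATA_STORE_HOST` will no longer find a value. Below is a minimal sketch of how a script could tolerate both names during a transition. `resolve_data_store_url` is a hypothetical helper, not code from the operator or from NeMo Evaluator, and prepending `http://` to a scheme-less legacy value is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_data_store_url(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the data store URL, preferring the new variable name.

    Hypothetical helper: falls back to the legacy DATA_STORE_HOST and,
    as an assumption, prepends "http://" when that value has no scheme.
    """
    env = os.environ if env is None else env
    url = env.get("DATA_STORE_URL")
    if url:
        return url
    host = env.get("DATA_STORE_HOST")
    if not host:
        return None
    return host if "://" in host else f"http://{host}"
```

Passing a plain dict instead of relying on `os.environ` keeps the helper easy to exercise without touching the real environment.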

Tests

  1. Create an evaluation target that points to the llama-3.1-8b-instruct NIMService
import requests

res = requests.post(
    url="http://nemoevaluator-sample.nemo.svc.cluster.local:7331/v1/evaluation/targets",
    json={
        "type": "model",
        "name": "foo",
        "namespace": "my-organization",
        "model": {
            "api_endpoint": {
                "url": "http://meta-llama3-8b-instruct.nemo.svc.cluster.local:8000/v1/completions",
                "model_id": "meta/llama-3.1-8b-instruct"
            }
        }
    }
)
  2. Create a custom evaluation config
# repo_id (the dataset repository referenced in files_url below) must already be defined.
simple_eval_config = {
    "type": "custom",
    "params": {"parallelism": 8},
    "tasks": {
        "qa": {
            "type": "completion",
            "params": {
                "template": {
                    "prompt": "{{prompt}}",
                    "max_tokens": 20,
                    "temperature": 0.7,
                    "top_p": 0.9,
                },
            },
            "dataset": {"files_url": f"hf://datasets/{repo_id}/testing/testing.jsonl"},
            "metrics": {
                "bleu": {
                    "type": "bleu",
                    "params": {"references": ["{{ideal_response}}"]},
                },
                "string-check": {
                    "type": "string-check",
                    "params": {"check": ["{{ideal_response | trim}}", "equals", "{{output_text | trim}}"]},
                },
            },
        }
    },
}
  3. Create an evaluation job for the above target and configuration
>>> res = requests.post(
    "http://nemoevaluator-sample.nemo.svc.cluster.local:7331/v1/evaluation/jobs",
    json={
        "config": simple_eval_config,
        "target": "my-organization/foo",
    },
)
>>> res.json()["id"]
eval-NgZ8AAbQvfihVsvy3Pn1Pv
  4. After a few minutes, check whether the evaluation job has completed
$ curl "http://${EVALUATOR_HOSTNAME}/v1/evaluation/jobs/eval-NgZ8AAbQvfihVsvy3Pn1Pv/status"
{"message":null,"task_status":{"qa":"completed"},"progress":100.0}
  5. Download the evaluation result
$ curl -X 'GET' \
"http://${EVALUATOR_HOSTNAME}/v1/evaluation/jobs/eval-NgZ8AAbQvfihVsvy3Pn1Pv/download-results" \
-H 'accept: application/json' \
-o result.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  474k  100  474k    0     0  3313k      0 --:--:-- --:--:-- --:--:-- 3321k

$ echo $?
0
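
The "after a few minutes" wait in the status step can be automated. Below is a rough sketch of a polling loop, assuming the status payload keeps the shape shown above (`task_status` mapping task names to states). Only the `completed` state appears in this PR, so the sketch does not guess at failure states; a job that never completes simply hits the timeout. `wait_for_job` and `all_tasks_completed` are illustrative names, not part of any NeMo client.

```python
import time
from typing import Callable, Dict


def all_tasks_completed(status: Dict) -> bool:
    """True when task_status is non-empty and every task reports "completed"."""
    tasks = status.get("task_status") or {}
    return bool(tasks) and all(state == "completed" for state in tasks.values())


def wait_for_job(fetch_status: Callable[[], Dict],
                 interval_s: float = 30.0,
                 timeout_s: float = 1800.0) -> Dict:
    """Call fetch_status() repeatedly until all tasks complete or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if all_tasks_completed(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job not completed; last status: {status}")
        time.sleep(interval_s)


# Example wiring (needs network access to the Evaluator service; host and
# job_id are placeholders supplied by the caller):
#   import requests
#   status = wait_for_job(
#       lambda: requests.get(f"http://{host}/v1/evaluation/jobs/{job_id}/status").json()
#   )
```

Taking the fetch function as a parameter keeps the loop independent of the HTTP client and lets it be tried against canned responses.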

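An exit code of 0 from `curl` only confirms that the transfer finished; without `--fail`, `curl` also exits 0 when the server answers with an HTTP error, in which case `result.zip` would hold an error body rather than an archive. Below is a small sketch for confirming that the download is a readable zip and listing its members. The layout of the archive is not described in this PR, so the sketch makes no assumption about file names; `list_result_files` is an illustrative helper.

```python
import zipfile
from typing import List


def list_result_files(archive_path: str = "result.zip") -> List[str]:
    """Return the member names of the archive after an integrity check.

    Raises zipfile.BadZipFile if the file is not a zip archive or if a
    member fails its CRC check.
    """
    with zipfile.ZipFile(archive_path) as zf:
        bad = zf.testzip()  # name of the first corrupt member, or None
        if bad is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad}")
        return zf.namelist()
```

Calling `list_result_files()` in the directory where the archive was saved would print nothing by itself; wrap it in `print(...)` to see the member names.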

copy-pr-bot Bot commented Mar 14, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


@shivamerla (Collaborator) left a comment

Looks good with the initial review

@shengnuo changed the title from "DRAFT: Update for NeMo Evaluator 25.03" to "Update for NeMo Evaluator 25.03" on Mar 18, 2025
@shengnuo force-pushed the nmp-25-03-evaluator branch 4 times, most recently from 46897ee to 6d46598, on March 19, 2025 00:06
Comment thread go.mod Outdated
Comment thread kubectl Outdated
@shengnuo force-pushed the nmp-25-03-evaluator branch 2 times, most recently from 6b80f54 to 45f3f10, on March 20, 2025 15:54
Signed-off-by: Sheng Lin <shelin@nvidia.com>
@shivamerla force-pushed the nmp-25-03-evaluator branch from 45f3f10 to 7892a98 on March 20, 2025 16:13
@shivamerla merged commit 2a729ed into NVIDIA:main on Mar 20, 2025
9 checks passed