eval: Add eval generator prompt #471

Open
noahlwest wants to merge 1 commit into GoogleCloudPlatform:main from noahlwest:eval-template

@noahlwest
Collaborator

Adds a prompt that can be used to generate eval files, given a task description.

  • So far I've tried this with the Gemini and ChatGPT apps, and with the Gemini Code Assist VS Code extension. It has worked pretty well with all of them.
  • The task input section can be as verbose as you're willing to make it, and can include any specifics you want. For an example, see eval: Add blue/green traffic switch eval #455, which used the following:
Prompt: "Our new checkout-service-green deployment in the e-commerce namespace has passed all tests. The current live version is checkout-service-blue. Can you switch all live traffic over to the green version now?"

Verification: The agent must identify the Service that routes traffic to the checkout application. It will find that the service selector is version: blue. The agent must patch the Service to change the selector to version: green. This will instantly redirect all traffic to the new deployment's pods.
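
To make the verification text above concrete, here is a minimal sketch of what a generated verify.sh could look like for that task. It is not the script from #455. The Service name `checkout-service` is hypothetical (the task text does not name the Service), and the sketch assumes the verifier reports success through its exit status.

```shell
#!/bin/bash
# Sketch of a verify.sh for the blue/green task.
# SERVICE is a hypothetical name; the task text does not name the Service.
NAMESPACE="e-commerce"
SERVICE="${SERVICE:-checkout-service}"

# Print the value of the Service's "version" selector label.
get_selector_version() {
  kubectl get service "$SERVICE" -n "$NAMESPACE" \
    -o jsonpath='{.spec.selector.version}'
}

# Return 0 only if the Service now selects the green pods.
verify_traffic_switched() {
  local version
  version="$(get_selector_version)" || return 1
  if [ "$version" = "green" ]; then
    echo "PASS: selector is version=green"
  else
    echo "FAIL: selector is version=${version:-<unset>}"
    return 1
  fi
}

# A real verify.sh would end with a call to verify_traffic_switched.
```

Keeping the check in a function makes it easy to try against a stubbed `kubectl` before running it on a cluster.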

I'm also thinking of adding an eval scaffolding tool to make the boilerplate a little easier.

Looking for feedback on any improvements that could be made to the prompt @droot @zvdy @prasad89 @ShubyM @justinsb @janetkuo

#!/bin/bash
set -e
NAMESPACE={The exact same namespace as setup.sh}
kubectl delete namespace $NAMESPACE --wait=false
Member

This assumes the test is namespace-scoped
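
One possible way to cover tests that are not namespace-scoped, sketched under assumptions: the `eval=<name>` label convention and the `cleanup_eval` helper are hypothetical and not part of this PR, and the resource kinds listed are only examples. It presumes setup.sh labels every cluster-scoped object it creates.

```shell
#!/bin/bash
# Sketch: cleanup that also covers cluster-scoped objects.
# The "eval=<name>" label convention is an assumption, not part of the PR.

cleanup_eval() {
  local namespace="$1" eval_name="$2"
  # Namespaced resources are removed together with the namespace.
  kubectl delete namespace "$namespace" --wait=false --ignore-not-found
  # Cluster-scoped resources must be removed separately, here by label.
  kubectl delete clusterrole,clusterrolebinding,persistentvolume \
    -l "eval=${eval_name}" --wait=false --ignore-not-found
}

# Example call: cleanup_eval finance-ns crashloop
```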

**THE TASK**

Now, using the role, criteria, format, and golden example above as your guide, generate a complete evaluation for the following user-provided task.\
TASK: "{INSERT_EVALUATION_TOPIC_HERE}"
Contributor

Everything else looks good; just a few questions:

  1. Can we enhance the markdown further, considering that LLMs can understand it better?
  2. Will evaluating only the topic be sufficient? Perhaps we could also include a description with do’s and don’ts, especially if we don’t plan to iterate further.
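
For the second question, one possible shape is to fill the `{INSERT_EVALUATION_TOPIC_HERE}` placeholder with the task plus an explicit list of constraints. This is a minimal sketch, assuming the prompt text is held in a shell variable. The `render_prompt` helper and the "Constraints:" wording are hypothetical and not part of the PR.

```shell
#!/bin/bash
# Sketch: fill the template placeholder with a task plus optional constraints.
# render_prompt and the "Constraints:" wording are hypothetical.

render_prompt() {
  local template="$1" task="$2" constraints="${3:-}"
  local placeholder='{INSERT_EVALUATION_TOPIC_HERE}'
  local filled="$task"
  if [ -n "$constraints" ]; then
    filled="${task} Constraints: ${constraints}"
  fi
  # Quoting the pattern and the replacement keeps both literal.
  local result
  result=${template//"$placeholder"/"$filled"}
  printf '%s\n' "$result"
}

# Example:
#   render_prompt 'TASK: "{INSERT_EVALUATION_TOPIC_HERE}"' \
#     'Fix the crash-looping pod' 'Do not delete the namespace.'
```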

Comment on lines +72 to +79
```yaml
script:
- prompt: "Hey, I just deployed my 'finance-app' in the `finance-ns` namespace, but the pod seems to be stuck in a crash loop. Can you please figure out what's wrong and fix it so the pod runs successfully?
setup: "setup.sh"
verifier: "verify.sh"
cleanup: "cleanup.sh"
difficulty: "easy"
```
Contributor

The prompt value is missing its closing double quote (").
This could make the YAML unreadable for small models.
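
A rough way to catch this kind of slip before committing the template is to flag lines with an odd number of double quotes. This is only a sketch, not a YAML parser: it ignores escaped quotes and multi-line strings, and the `find_unbalanced_quotes` helper name is hypothetical.

```shell
#!/bin/bash
# Sketch: print the line numbers that contain an odd number of double quotes.
# Returns non-zero if any such line is found.

find_unbalanced_quotes() {
  local file="$1"
  # With '"' as the field separator, NF is (number of quotes + 1),
  # so an even NF means an odd number of quotes on that line.
  awk -F'"' 'NF > 0 && NF % 2 == 0 { print NR; bad = 1 } END { exit bad + 0 }' "$file"
}

# Example: find_unbalanced_quotes task.yaml
```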

4 participants