Merged
36 changes: 36 additions & 0 deletions README.md
@@ -64,6 +64,42 @@ steps:
- Summarize the changes made to {{File.basename(file)}}.
```

## Try it

If you don't already have one, get an OpenAI API key [here](https://platform.openai.com/settings/organization/api-keys). You will need an account with a credit card on file. Before continuing, make sure that a basic completion works:

```bash
export OPENAI_API_KEY=sk-proj-....

curl -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model":"gpt-4.1-mini","messages":[{"role":"user","content":"What is 1+1?"}]}' \
  https://api.openai.com/v1/chat/completions
```

The [test grading workflow](examples/grading/workflow.md) in this repository acts as a senior software engineer and testing expert that evaluates the quality of a test against a set of guidelines.

Try the workflow on one of the repository's own test files:

```bash
./exe/roast execute examples/grading/workflow.yml test/roast/resources_test.rb

🔥🔥🔥 Everyone loves a good roast 🔥🔥🔥
...
```

The run ends with a test grade report:

```
========== TEST GRADE REPORT ==========
Test file: test/roast/resources_test.rb

FINAL GRADE:
Score: 80/100
Letter Grade: B
```

Note that you may also need `shadowenv` and `rg`. On macOS, run `brew install shadowenv` and `brew install rg`.
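
Before running the workflow, it can help to confirm that everything mentioned above is in place. The following is a minimal sketch, not part of Roast: `missing_prereqs` is a hypothetical helper name, and it only checks that `OPENAI_API_KEY` is non-empty and that the named tools are on `PATH`. It does not check that the key is valid.

```shell
# Hypothetical helper (not part of Roast): list whatever the "Try it" steps
# need that is currently missing. Returns 0 only when nothing is missing.
missing_prereqs() {
  missing=""
  # The API key must be exported and non-empty.
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    missing="$missing OPENAI_API_KEY"
  fi
  # Every tool passed as an argument must be found on PATH.
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Print the space-separated list without its leading space.
  printf '%s\n' "${missing# }"
  [ -z "$missing" ]
}

# Usage: check the key plus the tools named in this section.
missing_prereqs curl shadowenv rg ||
  echo "Set or install the items listed above before running roast." >&2
```

An empty line of output means nothing is missing; otherwise the names printed are the ones to fix.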

## How to use Roast

1. Create a workflow YAML file defining your steps and tools
10 changes: 6 additions & 4 deletions examples/grading/workflow.yml
@@ -1,5 +1,7 @@
 name: Test Grading
-model: anthropic:claude-opus-4
+api_token: $(echo $OPENAI_API_KEY)
+# model: anthropic:claude-opus-4
+model: gpt-4.1-mini

 tools:
   - Roast::Tools::Grep
@@ -23,16 +25,16 @@ steps:

 # set non-default attributes for steps below
 analyze_coverage:
-  model: gpt-4.1-mini
+  # model: gpt-4.1-mini
   auto_loop: false
   json: true

 generate_grades:
-  model: o3
+  # model: o3
   json: true

 generate_recommendations:
-  model: o3
+  # model: o3
   auto_loop: false
   json: true
   params: