Merged
Changes from 9 commits
2 changes: 1 addition & 1 deletion README.md
@@ -30,7 +30,7 @@ For the image scope, the program takes up to two files, depending on the prompt
|----------------------|-------------------------------------------------------------------|----------|
| `--submission_type` | Type of submission (from `arg_options.FileType`) | ❌ |
| `--prompt` | Pre-defined prompt name or file path to custom prompt file | ❌ **|
-| `--prompt_text` | Additional string text prompt that can be fed to model. | ❌ ** |
+| `--prompt_text` | Additional string text prompt that can be fed to model or standalone prompt. | ❌ ** |
Contributor

It's good to update the documentation here, but let's split this up. (1) here, just say "String prompt"; (2) expanding on the ** note below, say that if both --prompt and --prompt_text are provided, the prompt text argument is appended to the file contents (or something to that effect).

Similar comment for the help message as well.

| `--scope` | Processing scope (`image` or `code` or `text`) | ✅ |
| `--submission` | Submission file path | ✅ |
| `--question` | Specific question to evaluate | ❌ |
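
The behaviour the reviewer asks to document can be sketched roughly as follows. This is a minimal illustration rather than the project's code: `compose_prompt` is a hypothetical helper, and it assumes the contents of the `--prompt` file come first with `--prompt_text` appended directly after them, with no separator.

```python
from typing import Optional


def compose_prompt(file_contents: Optional[str], prompt_text: Optional[str]) -> str:
    """Combine the --prompt file contents with the --prompt_text string.

    Hypothetical helper: assumes plain concatenation, file contents first.
    """
    prompt_content = ""
    if file_contents:
        prompt_content += file_contents
    if prompt_text:
        # With no --prompt given, --prompt_text acts as a standalone prompt.
        prompt_content += prompt_text
    return prompt_content


print(compose_prompt("Base prompt. ", "Extra text."))
print(compose_prompt(None, "Only text."))
```

Under these assumptions, passing only `--prompt_text` yields that string unchanged, which is the "standalone prompt" case mentioned in the updated table row.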
3 changes: 2 additions & 1 deletion ai_feedback/__main__.py
@@ -215,7 +215,7 @@ def main() -> int:
args.submission_type = detect_submission_type(args.submission)

prompt_content = ""

+prompt = {}
Contributor

I don't think these changes are necessary. The only place the variable prompt is used is in the image scope, and that branch already defines prompt explicitly.

system_instructions = load_system_prompt_content(args.system_prompt)

if args.prompt:
@@ -236,6 +236,7 @@ def main() -> int:

if args.prompt_text:
prompt_content += args.prompt_text
+prompt["prompt_content"] = prompt_content

if args.scope == "image":
prompt = {"prompt_content": prompt_content}
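
A reduced sketch of the reviewer's point, with a hypothetical function name and arguments: because the image branch builds its own dictionary, an earlier `prompt = {}` and `prompt["prompt_content"] = ...` should not change what that branch produces. The sketch is simplified and ignores whether the added assignment sits inside the `if args.prompt_text:` block.

```python
from typing import Optional


def image_prompt(scope: str, prompt_content: str, preinitialize: bool) -> Optional[dict]:
    """Return the prompt dict as the image scope would see it (illustrative only)."""
    prompt = None
    if preinitialize:
        # Stand-in for the two lines added in this diff.
        prompt = {}
        prompt["prompt_content"] = prompt_content
    if scope == "image":
        # Existing branch: rebinds prompt regardless of what came before.
        prompt = {"prompt_content": prompt_content}
    return prompt


with_init = image_prompt("image", "Describe the plot.", preinitialize=True)
without_init = image_prompt("image", "Describe the plot.", preinitialize=False)
print(with_init == without_init)
```

If the image scope is indeed the only consumer of `prompt`, as the comment states, the two variants are indistinguishable there.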
2 changes: 1 addition & 1 deletion ai_feedback/helpers/constants.py
@@ -2,7 +2,7 @@
HELP_MESSAGES = {
"submission_type": "The format of the submission file (e.g., Jupyter notebook, Python script).",
"prompt": "Pre-defined prompt name (from ai_feedback/data/prompts/user/) or file path to custom prompt file.",
-"prompt_text": "Additional messages to concatenate to the prompt.",
+"prompt_text": "Optional standalone prompt or additional text to append to the base prompt.",
"scope": "The section of the assignment the model should analyze (e.g., code or image).",
"submission": "The file path for the submission file.",
"solution": "The file path for the solution file.",
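
As a sketch of how these help strings might surface on the command line, the snippet below wires two of them into `argparse`. The parser setup, the `prog` name, and the `build_parser` helper are assumptions made for illustration; only the flag names and the help strings come from the diff.

```python
import argparse

# Two entries copied from the diff; the real dictionary has more keys.
HELP_MESSAGES = {
    "prompt": "Pre-defined prompt name (from ai_feedback/data/prompts/user/) or file path to custom prompt file.",
    "prompt_text": "Optional standalone prompt or additional text to append to the base prompt.",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser exposing only the two prompt flags (hypothetical)."""
    parser = argparse.ArgumentParser(prog="ai_feedback")
    parser.add_argument("--prompt", type=str, default=None, help=HELP_MESSAGES["prompt"])
    parser.add_argument("--prompt_text", type=str, default=None, help=HELP_MESSAGES["prompt_text"])
    return parser


args = build_parser().parse_args(["--prompt_text", "Focus on style."])
print(args.prompt, args.prompt_text)
```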
4 changes: 2 additions & 2 deletions promptfoo/promptfoo_test_runner.py
@@ -34,8 +34,8 @@ def call_api(prompt: str, context: dict, metadata: dict) -> dict:
options["scope"],
"--model",
options["model"],
-"--prompt",
-options['prompt'],
+"--prompt_text",
+prompt,
"--llama_mode",
"server",
"--output_template",
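
The effect of this change on the assembled command line can be sketched as below. `build_command` is a hypothetical helper, the `python -m ai_feedback` entry point is an assumption based on the `ai_feedback/__main__.py` path, and flags not visible in this hunk (including the `--output_template` value) are left out.

```python
import sys
from typing import Dict, List


def build_command(prompt: str, options: Dict[str, str]) -> List[str]:
    """Assemble an argv list that passes the rendered prompt as text.

    Hypothetical reduction of the runner; only flags shown in the hunk appear.
    """
    return [
        sys.executable, "-m", "ai_feedback",  # assumed entry point
        "--scope", options["scope"],
        "--model", options["model"],
        # Previously: "--prompt", options["prompt"] (a prompt name or file path).
        "--prompt_text", prompt,
        "--llama_mode", "server",
    ]


cmd = build_command("Review this code.", {"scope": "code", "model": "codellama:latest"})
print(cmd[3:])
```

Passing the prompt as a single list element means a multi-line prompt needs no shell quoting, provided the list is handed to something like `subprocess.run` without `shell=True`.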
112 changes: 56 additions & 56 deletions promptfoo/tests/codellama_tests/codellama_code_tests.yaml
@@ -6,59 +6,59 @@ defaultTest:
model: codellama:latest
scope: code

scenarios:
- config:
- vars: { prompt: code_overall }
tests:
- vars:
submission_file: test_submissions/csc263_opt_connected/correct_submission/correct_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc263_opt_connected/fail_submission/fail_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py


- vars:
submission_file: test_submissions/csc263_opt_connected/incorrect_algo_submission/incorrect_algo_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc263_opt_connected/style_issues_submission/style_issues_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc108/correct_submission/correct_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/correctness_1_submission/correctness_1_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/correctness_2_submission/correctness_2_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/efficiency_submission/efficiency_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/style_submission/style_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/gac_example/correct_submission/correct_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/fail_submission/fail_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/inefficient_submission/inefficient_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/partial_correct_submission/partial_correct_submission.py
solution_file: test_submissions/gac_example/solution.py
prompts:
- file://../../../ai_feedback/data/prompts/user/code_overall.md

tests:
- vars:
submission_file: test_submissions/csc263_opt_connected/correct_submission/correct_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc263_opt_connected/fail_submission/fail_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py


- vars:
submission_file: test_submissions/csc263_opt_connected/incorrect_algo_submission/incorrect_algo_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc263_opt_connected/style_issues_submission/style_issues_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc108/correct_submission/correct_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/correctness_1_submission/correctness_1_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/correctness_2_submission/correctness_2_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/efficiency_submission/efficiency_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/style_submission/style_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/gac_example/correct_submission/correct_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/fail_submission/fail_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/inefficient_submission/inefficient_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/partial_correct_submission/partial_correct_submission.py
solution_file: test_submissions/gac_example/solution.py
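
As a quick check of the new `prompts:` entry, the sketch below resolves the `file://` reference against the directory of the test file. It assumes the reference is resolved relative to the config file's location and that the repository layout matches the paths in the diff headers; `resolve_prompt_ref` is a hypothetical helper, not part of promptfoo.

```python
import posixpath


def resolve_prompt_ref(config_path: str, ref: str) -> str:
    """Resolve a file:// prompt reference against the config file's directory."""
    prefix = "file://"
    if not ref.startswith(prefix):
        raise ValueError(f"not a file reference: {ref}")
    relative = ref[len(prefix):]
    config_dir = posixpath.dirname(config_path)
    # normpath collapses the "../" segments into a repository-relative path.
    return posixpath.normpath(posixpath.join(config_dir, relative))


resolved = resolve_prompt_ref(
    "promptfoo/tests/codellama_tests/codellama_code_tests.yaml",
    "file://../../../ai_feedback/data/prompts/user/code_overall.md",
)
print(resolved)
```

Under that assumption the reference lands in `ai_feedback/data/prompts/user/`, the directory named in the `prompt` help message. The same pattern applies to the two DeepSeek test files, which sit at the same depth.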
110 changes: 55 additions & 55 deletions promptfoo/tests/deepseek_r1_tests/deepseek_r1_code_tests.yaml
@@ -6,58 +6,58 @@ defaultTest:
model: deepSeek-R1:70B
scope: code

scenarios:
- config:
- vars: { prompt: code_feedback_r1 }
tests:
- vars:
submission_file: test_submissions/csc263_opt_connected/correct_submission/correct_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc263_opt_connected/fail_submission/fail_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc263_opt_connected/incorrect_algo_submission/incorrect_algo_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc263_opt_connected/style_issues_submission/style_issues_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc108/correct_submission/correct_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/correctness_1_submission/correctness_1_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/correctness_2_submission/correctness_2_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/efficiency_submission/efficiency_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/style_submission/style_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/gac_example/correct_submission/correct_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/fail_submission/fail_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/inefficient_submission/inefficient_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/partial_correct_submission/partial_correct_submission.py
solution_file: test_submissions/gac_example/solution.py
prompts:
- file://../../../ai_feedback/data/prompts/user/code_feedback_r1.md

tests:
- vars:
submission_file: test_submissions/csc263_opt_connected/correct_submission/correct_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc263_opt_connected/fail_submission/fail_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc263_opt_connected/incorrect_algo_submission/incorrect_algo_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc263_opt_connected/style_issues_submission/style_issues_submission.py
solution_file: test_submissions/csc263_opt_connected/solution.py

- vars:
submission_file: test_submissions/csc108/correct_submission/correct_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/correctness_1_submission/correctness_1_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/correctness_2_submission/correctness_2_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/efficiency_submission/efficiency_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/csc108/style_submission/style_submission.py
solution_file: test_submissions/csc108/solution.py

- vars:
submission_file: test_submissions/gac_example/correct_submission/correct_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/fail_submission/fail_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/inefficient_submission/inefficient_submission.py
solution_file: test_submissions/gac_example/solution.py

- vars:
submission_file: test_submissions/gac_example/partial_correct_submission/partial_correct_submission.py
solution_file: test_submissions/gac_example/solution.py
62 changes: 31 additions & 31 deletions promptfoo/tests/deepseek_r1_tests/deepseek_r1_text_tests.yaml
@@ -6,34 +6,34 @@ defaultTest:
model: deepSeek-R1:70B
scope: text

scenarios:
- config:
- vars: { prompt: text_analyze_r1 }
tests:
- vars:
submission_file: test_submissions/data_collection_ethics_module/average_submission/average_submission.txt
solution_file: test_submissions/data_collection_ethics_module/solution.txt

- vars:
submission_file: test_submissions/data_collection_ethics_module/excellent_submission/excellent_submission.txt
solution_file: test_submissions/data_collection_ethics_module/solution.txt

- vars:
submission_file: test_submissions/data_collection_ethics_module/off_topic_submission/off_topic_submission.txt
solution_file: test_submissions/data_collection_ethics_module/solution.txt

- vars:
submission_file: test_submissions/data_collection_ethics_module/weak_submission/weak_submission.txt
solution_file: test_submissions/data_collection_ethics_module/solution.txt

- vars:
submission_file: test_submissions/csc373_eft_optimality_proof/fail_submission/fail_submission.pdf
solution_file: test_submissions/csc373_eft_optimality_proof/solution.pdf

- vars:
submission_file: test_submissions/csc373_eft_optimality_proof/incomplete_submission/incomplete_submission.pdf
solution_file: test_submissions/csc373_eft_optimality_proof/solution.pdf

- vars:
submission_file: test_submissions/csc373_eft_optimality_proof/induction_submission/induction_submission.pdf
solution_file: test_submissions/csc373_eft_optimality_proof/solution.pdf
prompts:
- file://../../../ai_feedback/data/prompts/user/text_analyze_r1.md

tests:
- vars:
submission_file: test_submissions/data_collection_ethics_module/average_submission/average_submission.txt
solution_file: test_submissions/data_collection_ethics_module/solution.txt

- vars:
submission_file: test_submissions/data_collection_ethics_module/excellent_submission/excellent_submission.txt
solution_file: test_submissions/data_collection_ethics_module/solution.txt

- vars:
submission_file: test_submissions/data_collection_ethics_module/off_topic_submission/off_topic_submission.txt
solution_file: test_submissions/data_collection_ethics_module/solution.txt

- vars:
submission_file: test_submissions/data_collection_ethics_module/weak_submission/weak_submission.txt
solution_file: test_submissions/data_collection_ethics_module/solution.txt

- vars:
submission_file: test_submissions/csc373_eft_optimality_proof/fail_submission/fail_submission.pdf
solution_file: test_submissions/csc373_eft_optimality_proof/solution.pdf

- vars:
submission_file: test_submissions/csc373_eft_optimality_proof/incomplete_submission/incomplete_submission.pdf
solution_file: test_submissions/csc373_eft_optimality_proof/solution.pdf

- vars:
submission_file: test_submissions/csc373_eft_optimality_proof/induction_submission/induction_submission.pdf
solution_file: test_submissions/csc373_eft_optimality_proof/solution.pdf