Commit f24a1e0

Author: Will Kukkamalla (committed)

    Merge branch 'add-json-schema' of github.com:wkukka1/ai-autograding-feedback into add-json-schema

2 parents (1fa1146 + 64df9d5), commit f24a1e0
15 files changed: +172 −76 lines

README.md

Lines changed: 30 additions & 13 deletions

@@ -29,9 +29,8 @@ For the image scope, the program takes up to two files, depending on the prompt
 | Argument | Description | Required |
 |----------------------|-------------------------------------------------------------------|----------|
 | `--submission_type` | Type of submission (from `arg_options.FileType`) ||
-| `--prompt` | The name of a preddefined prompt file (from `arg_options.Prompt`) |**|
+| `--prompt` | Pre-defined prompt name or file path to custom prompt file |**|
 | `--prompt_text` | Additional string text prompt that can be fed to model. |** |
-| `--prompt_custom` | The name of prompt file uploaded to be used by model. |** |
 | `--scope` | Processing scope (`image` or `code` or `text`) ||
 | `--submission` | Submission file path ||
 | `--question` | Specific question to evaluate ||

@@ -41,11 +40,11 @@ For the image scope, the program takes up to two files, depending on the prompt
 | `--test_output` | File path for the file containing the results from tests ||
 | `--submission_image` | File path for the submission image file ||
 | `--solution_image` | File path for the solution image file ||
-| `--system_prompt` | File path for the system instructions prompt ||
+| `--system_prompt` | Pre-defined system prompt name or file path to custom system prompt ||
 | `--llama_mode` | How to invoke deepSeek-v3 (choices in `arg_options.LlamaMode`) ||
 | `--output_template` | Output template file (from `arg_options.OutputTemplate`) ||
 | `--json_schema` | File path to json file for schema for structured output ||
-** One of either prompt, prompt_custom, or prompt_text must be selected.
+** One of either `--prompt` or `--prompt_text` must be selected.
 
 ## Scope
 The program supports three scopes: code, text, or image. Depending on which is selected, the program supports different models and prompts tailored for each option.

@@ -67,8 +66,15 @@ The user can also explicitly specify the submission type using the `--submission
 Currently, jupyter notebook, pdf, and python assignments are supported.
 
 ## Prompts
-The user can use this argument to specify which predefined prompt they wish the model to use.
-To view the predefined prompts, navigate to the ai_feedback/data/prompts/user folder. Each prompt is stored as a markdown file that can contain template placeholders with the following structure:
+The `--prompt` argument accepts either pre-defined prompt names or custom file paths:
+
+### Pre-defined Prompts
+To use pre-defined prompts, specify the prompt name (without extension). Pre-defined prompts are stored as markdown (.md) files in the `ai_feedback/data/prompts/user/` directory.
+
+### Custom Prompt Files
+To use custom prompt files, specify the file path to your custom prompt. The file should be a markdown (.md) file.
+
+Prompt files can contain template placeholders with the following structure:
 
 ```markdown
 Consider this question:

@@ -86,7 +92,7 @@ Prompt Naming Conventions:
 - Prompts to be used when --scope image is selected are prefixed with image_{}.md
 - Prompts to be used when --scope text is selected are prefixed with text_{}.md
 
-If the --scope argument is provided and its value does not match the prefix of the selected --prompt, an error message will be displayed.
+Scope validation (prefix matching) only applies to pre-defined prompts. Custom prompt files can be used with any scope.
 
 All prompts are treated as templates that can contain special placeholder blocks; the following template placeholders are automatically replaced:
 - `{context}` - Question context

@@ -123,8 +129,16 @@ All prompts are treated as templates that can contain special placeholder blocks
 ## Prompt_text
 Additionally, the user can pass in a string through the --prompt_text argument. This will either be concatenated to the prompt if --prompt is used or fed in as the only prompt if --prompt is not used.
 
-## Prompt_custom
-The user can pass in their own custom prompt file and use the --prompt_custom argument to flag that the model should use the custom prompt. This can be used instead of choosing one of the predefined prompts.
+## System Prompts
+The `--system_prompt` argument accepts either pre-defined system prompt names or custom file paths:
+
+### Pre-defined System Prompts
+To use pre-defined system prompts, specify the system prompt name (without extension). Pre-defined system prompts are stored as markdown (.md) files in the `ai_feedback/data/prompts/system/` directory.
+
+### Custom System Prompt Files
+To use custom system prompt files, specify the file path to your custom system prompt. The file should be a markdown (.md) file.
+
+System prompts define the AI model's behavior, tone, and approach to providing feedback. They are used to set the context and personality of the AI assistant.
 
 ## Models
 The models used can be seen under the ai_feedback/models folder.

@@ -304,11 +318,17 @@ python3 -m ai_feedback --prompt code_table --scope code \
 --model deepSeek-v3 --llama_mode cli
 ```
 
+
 #### Get annotations for cnn_example test using openAI model
 ```bash
 python -m ai_feedback --prompt code_annotations --scope code --submission test_submissions/cnn_example/cnn_submission --solution test_submissions/cnn_example/cnn_solution.py --model openai --json_schema ai_feedback/data/schema/code_annotation_schema.json
 ```
 
+#### Evaluate using custom prompt file path
+```bash
+python -m ai_feedback --prompt ai_feedback/data/prompts/user/code_overall.md --scope code --submission test_submissions/csc108/correct_submission/correct_submission.py --solution test_submissions/csc108/solution.py --model codellama:latest
+```
+
 #### Using Ollama
 In order to run this project on Bigmouth:
 1. SSH into teach.cs

@@ -346,8 +366,6 @@ Files:
 - python_tester_llm_pdf.py: Runs LLM on any pdf assignment (solution file and submission file) uploaded to the autotester. Creates general feedback about whether the student's written responses match the instructor's feedback. Displayed in test outputs and overall comments.
 - custom_tester_llm_code.sh: Runs LLM on assignments (solution file, submission file, test output file) uploaded to the custom autotester. Currently, supports jupyter notebook files uploaded. Can specify prompt and model used in the script. Displays in overall comments and in test outputs. Can optionally uncomment the annotations section to display annotations; however, the annotations will display on the .txt version of the file uploaded by the student, not the .ipynb file.
 
-<<<<<<< Updated upstream
-
 #### Python AutoTester Usage
 ##### Code Scope
 1. Ensure the student has submitted a submission file (_submission suffixed).

@@ -412,7 +430,7 @@ Also pip install other packages that the submission or solution file uses.
 - Student uploads: test1_submission.ipynb, test1_submission.txt
 
 NOTE: if the LLM Test Group appears to be blank/does not turn green, try increasing the timeout.
-=======
+
 #### Custom Tester
 - custom_tester_llm_code.sh: Runs LLM on any assignment (solution file, submission file, test output file) uploaded to the autotester. Can specify prompt and model used in the script. Displays in overall comments and in test outputs.

@@ -435,4 +453,3 @@ To run the test suite:
 ```console
 $ pytest
 ```
->>>>>>> Stashed changes
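The name-or-path resolution the README describes for `--prompt` can be sketched as follows. This is a minimal illustration, not the project's actual code: `PREDEFINED` and `resolve_prompt_source` are hypothetical names standing in for `arg_options.Prompt` and the real loader.

```python
import os

# Hypothetical stand-in for the values of arg_options.Prompt
PREDEFINED = ["code_overall", "code_annotations"]

def resolve_prompt_source(prompt_arg: str, base_dir: str) -> str:
    """Return the path to read: a packaged .md file for pre-defined
    names, otherwise the argument itself treated as a file path."""
    if prompt_arg in PREDEFINED:
        # Pre-defined name: look it up in the packaged prompts directory
        return os.path.join(base_dir, "data/prompts/user", f"{prompt_arg}.md")
    # Anything else is treated as a user-supplied file path
    return prompt_arg

# A pre-defined name maps into the packaged directory...
assert resolve_prompt_source("code_overall", "/pkg") == "/pkg/data/prompts/user/code_overall.md"
# ...while any other value is passed through as a path
assert resolve_prompt_source("./my_prompt.md", "/pkg") == "./my_prompt.md"
```

The same fallback applies to `--system_prompt`, with `data/prompts/system/` as the packaged directory.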

ai_feedback/__main__.py

Lines changed: 74 additions & 33 deletions

@@ -7,7 +7,7 @@
 
 from . import code_processing, image_processing, text_processing
 from .helpers import arg_options
-from .helpers.constants import HELP_MESSAGES, TEST_OUTPUTS_DIRECTORY
+from .helpers.constants import HELP_MESSAGES
 
 
 def detect_submission_type(filename: str) -> str:

@@ -52,26 +52,77 @@ def load_markdown_template(template: str) -> str:
         sys.exit(1)
 
 
-def load_markdown_prompt(prompt_name: str) -> dict:
-    """Loads a markdown prompt file.
+def _load_content_with_fallback(
+    content_arg: str, predefined_values: list[str], predefined_subdir: str, content_type: str
+) -> str:
+    """Generic function to load content by trying pre-defined names first, then treating as file path.
 
     Args:
-        prompt_name (str): Name of the prompt file (without extension)
+        content_arg (str): Either a pre-defined name or a file path
+        predefined_values (list[str]): List of valid pre-defined names
+        predefined_subdir (str): Subdirectory for pre-defined files (e.g., "user", "system")
+        content_type (str): Type of content for error messages (e.g., "prompt", "system prompt")
 
     Returns:
-        dict: Dictionary containing prompt_content
+        str: The content
 
     Raises:
-        SystemExit: If the prompt file is not found
+        SystemExit: If the content cannot be loaded
     """
-    try:
-        prompt_file = os.path.join(os.path.dirname(__file__), f"data/prompts/user/{prompt_name}.md")
-        with open(prompt_file, "r") as file:
-            prompt_content = file.read()
-        return {"prompt_content": prompt_content}
-    except FileNotFoundError:
-        print(f"Error: Prompt file '{prompt_name}.md' not found in user subfolder.")
-        sys.exit(1)
+    # First, check if it's a pre-defined name
+    if content_arg in predefined_values:
+        try:
+            file_path = os.path.join(os.path.dirname(__file__), f"data/prompts/{predefined_subdir}/{content_arg}.md")
+            with open(file_path, "r", encoding='utf-8') as file:
+                return file.read()
+        except FileNotFoundError:
+            print(
+                f"Error: Pre-defined {content_type} file '{content_arg}.md' not found in {predefined_subdir} subfolder."
+            )
+            sys.exit(1)
+    else:
+        # Treat as a file path
+        try:
+            with open(content_arg, "r", encoding='utf-8') as file:
+                return file.read()
+        except FileNotFoundError:
+            print(f"Error: {content_type.title()} file '{content_arg}' not found.")
+            sys.exit(1)
+        except Exception as e:
+            print(f"Error reading {content_type} file '{content_arg}': {e}")
+            sys.exit(1)
+
+
+def load_prompt_content(prompt_arg: str) -> str:
+    """Loads prompt content by trying pre-defined names first, then treating as file path.
+
+    Args:
+        prompt_arg (str): Either a pre-defined prompt name or a file path
+
+    Returns:
+        str: The prompt content
+
+    Raises:
+        SystemExit: If the prompt cannot be loaded
+    """
+    return _load_content_with_fallback(prompt_arg, arg_options.get_enum_values(arg_options.Prompt), "user", "prompt")
+
+
+def load_system_prompt_content(system_prompt_arg: str) -> str:
+    """Loads system prompt content by trying pre-defined names first, then treating as file path.
+
+    Args:
+        system_prompt_arg (str): Either a pre-defined system prompt name or a file path
+
+    Returns:
+        str: The system prompt content
+
+    Raises:
+        SystemExit: If the system prompt cannot be loaded
+    """
+    return _load_content_with_fallback(
+        system_prompt_arg, arg_options.get_enum_values(arg_options.SystemPrompt), "system", "system prompt"
+    )
 
 
 def main() -> int:

@@ -97,12 +148,10 @@ def main() -> int:
     parser.add_argument(
         "--prompt",
         type=str,
-        choices=arg_options.get_enum_values(arg_options.Prompt),
         required=False,
         help=HELP_MESSAGES["prompt"],
     )
     parser.add_argument("--prompt_text", type=str, required=False, help=HELP_MESSAGES["prompt_text"])
-    parser.add_argument("--prompt_custom", type=str, required=False, help=HELP_MESSAGES["prompt_custom"])
     parser.add_argument(
         "--scope",
         type=str,

@@ -147,7 +196,6 @@ def main() -> int:
         "--system_prompt",
         type=str,
         required=False,
-        choices=arg_options.get_enum_values(arg_options.SystemPrompt),
         help=HELP_MESSAGES["system_prompt"],
         default="student_test_feedback",
     )

@@ -175,18 +223,12 @@ def main() -> int:
 
     prompt_content = ""
 
-    system_prompt_path = os.path.join(
-        os.path.dirname(os.path.abspath(__file__)), f"data/prompts/system/{args.system_prompt}.md"
-    )
-    with open(system_prompt_path, encoding='utf-8') as file:
-        system_instructions = file.read()
+    system_instructions = load_system_prompt_content(args.system_prompt)
 
-    if args.prompt_custom:
-        prompt_filename = os.path.join("./", args.prompt_custom)
-        with open(prompt_filename, encoding='utf-8') as prompt_file:
-            prompt_content += prompt_file.read()
-    else:
-        if args.prompt:
+    if args.prompt:
+        # Only validate scope for pre-defined prompts (not for arbitrary file paths)
+        predefined_prompts = arg_options.get_enum_values(arg_options.Prompt)
+        if args.prompt in predefined_prompts:
             if not args.prompt.startswith("image") and args.scope == "image":
                 print("Error: The prompt must start with 'image'. Please re-run the command with a valid prompt.")
                 sys.exit(1)

@@ -197,14 +239,13 @@ def main() -> int:
                 print("Error: The prompt must start with 'text'. Please re-run the command with a valid prompt.")
                 sys.exit(1)
 
-            prompt = load_markdown_prompt(args.prompt)
-            prompt_content += prompt["prompt_content"]
+        prompt_content += load_prompt_content(args.prompt)
 
-        if args.prompt_text:
-            prompt_content += args.prompt_text
+    if args.prompt_text:
+        prompt_content += args.prompt_text
 
     if args.scope == "image":
-        prompt["prompt_content"] = prompt_content
+        prompt = {"prompt_content": prompt_content}
         request, response = image_processing.process_image(args, prompt, system_instructions)
     elif args.scope == "text":
         request, response = text_processing.process_text(args, prompt_content, system_instructions)
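The scope check that `main()` now applies only to pre-defined prompts reduces to a prefix match against the selected scope. A standalone sketch of that rule, assuming the `image_`/`code_`/`text_` naming convention from the README; `validate_prompt_scope` and the sample prompt names are hypothetical, not part of the codebase:

```python
def validate_prompt_scope(prompt: str, scope: str, predefined: set[str]) -> bool:
    """Return True if the prompt may be used with the given scope.
    Custom file paths (anything not in `predefined`) always pass."""
    if prompt not in predefined:
        return True  # scope validation only applies to pre-defined prompts
    # Pre-defined prompts must carry the matching prefix, e.g. image_*.md
    return prompt.startswith(scope)

predefined = {"code_overall", "image_analyze", "text_summary"}  # hypothetical names
assert validate_prompt_scope("code_overall", "code", predefined)
assert not validate_prompt_scope("code_overall", "image", predefined)
# A custom file path is accepted under any scope
assert validate_prompt_scope("./prompts/mine.md", "image", predefined)
```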

ai_feedback/helpers/constants.py

Lines changed: 2 additions & 3 deletions

@@ -1,9 +1,8 @@
 TEST_OUTPUTS_DIRECTORY = "test_responses_md"
 HELP_MESSAGES = {
     "submission_type": "The format of the submission file (e.g., Jupyter notebook, Python script).",
-    "prompt": "The specific prompt to use for evaluating the assignment.",
+    "prompt": "Pre-defined prompt name (from ai_feedback/data/prompts/user/) or file path to custom prompt file.",
     "prompt_text": "Additional messages to concatenate to the prompt.",
-    "prompt_custom": "The path to a prompt to use.",
     "scope": "The section of the assignment the model should analyze (e.g., code or image).",
     "submission": "The file path for the submission file.",
     "solution": "The file path for the solution file.",

@@ -15,6 +14,6 @@
     "test_output": "The output of tests from evaluating the assignment.",
     "submission_image": "The file path for the image file.",
     "solution_image": "The file path to the solution image.",
-    "system_prompt": "The specific system instructions to send to the AI Model.",
     "json_schema": "file path to a json file that contains the schema for ai output",
+    "system_prompt": "Pre-defined system prompt name (from ai_feedback/data/prompts/system/) or file path to custom system prompt file.",
 }

ai_feedback/helpers/template_utils.py

Lines changed: 9 additions & 16 deletions

@@ -104,28 +104,21 @@ def gather_file_contents(assignment_files: List[Optional[Path]]) -> str:
             # Handle PDF files separately
             if filename.lower().endswith('.pdf'):
                 text_content = extract_pdf_text(file_path)
-                file_contents += f"=== (unknown) ===\n"
                 lines = text_content.split('\n')
-                for i, line in enumerate(lines, start=1):
-                    stripped_line = line.rstrip()
-                    if stripped_line.strip():
-                        file_contents += f"(Line {i}) {stripped_line}\n"
-                    else:
-                        file_contents += f"(Line {i}) \n"
-                file_contents += "\n"
             else:
                 # Handle regular text files
                 with open(file_path, "r", encoding="utf-8") as file:
                     lines = file.readlines()
 
-                file_contents += f"=== (unknown) ===\n"
-                for i, line in enumerate(lines, start=1):
-                    stripped_line = line.rstrip("\n")
-                    if stripped_line.strip():
-                        file_contents += f"(Line {i}) {stripped_line}\n"
-                    else:
-                        file_contents += f"(Line {i}) {line}"
-                file_contents += "\n"
+            # Common processing for both file types
+            file_contents += f"=== (unknown) ===\n"
+            for i, line in enumerate(lines, start=1):
+                stripped_line = line.rstrip('\n').rstrip()
+                if stripped_line.strip():
+                    file_contents += f"(Line {i}) {stripped_line}\n"
+                else:
+                    file_contents += f"(Line {i}) \n"
+            file_contents += "\n"
 
         except Exception as e:
             print(f"Error reading file (unknown): {e}")
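The unified loop above renders every file the same way regardless of source. Its effect on a small input can be sketched like this; `render_numbered` is a hypothetical helper name used purely for illustration of the shared formatting logic:

```python
def render_numbered(text: str) -> str:
    """Prefix each line with '(Line i)', trimming trailing whitespace,
    mirroring the refactored common loop in gather_file_contents."""
    out = ""
    for i, line in enumerate(text.split('\n'), start=1):
        stripped = line.rstrip('\n').rstrip()
        if stripped.strip():
            out += f"(Line {i}) {stripped}\n"
        else:
            # Blank lines still get a number so the model can cite them
            out += f"(Line {i}) \n"
    out += "\n"
    return out

assert render_numbered("def f():\n    return 1\n") == (
    "(Line 1) def f():\n(Line 2)     return 1\n(Line 3) \n\n"
)
```

Leading indentation is preserved (only trailing whitespace is stripped), so the line-numbered view stays faithful to Python source.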

promptfoo/promptfoo_test_runner.py

Lines changed: 0 additions & 2 deletions

@@ -38,8 +38,6 @@ def call_api(prompt: str, context: dict, metadata: dict) -> dict:
         options['prompt'],
         "--llama_mode",
         "server",
-        '--submission_type',
-        submission_type,
         "--output_template",
         "response_and_prompt",
     ]

promptfoo/tests/codellama_tests/codellama_code_tests.yaml

Lines changed: 0 additions & 1 deletion

@@ -5,7 +5,6 @@ defaultTest:
   vars:
     model: codellama:latest
     scope: code
-    submission_type: python
 
 scenarios:
   - config:

promptfoo/tests/deepseek_r1_tests/deepseek_r1_code_tests.yaml

Lines changed: 0 additions & 1 deletion

@@ -5,7 +5,6 @@ defaultTest:
   vars:
     model: deepSeek-R1:70B
     scope: code
-    submission_type: python
 
 scenarios:
   - config:

promptfoo/tests/deepseek_r1_tests/deepseek_r1_text_tests.yaml

Lines changed: 0 additions & 1 deletion

@@ -5,7 +5,6 @@ defaultTest:
   vars:
     model: deepSeek-R1:70B
     scope: text
-    submission_type: pdf
 
 scenarios:
   - config:

promptfoo/tests/deepseek_v3_tests/deepseek_v3_code_tests.yaml

Lines changed: 0 additions & 1 deletion

@@ -5,7 +5,6 @@ defaultTest:
   vars:
     model: deepSeek-v3
    scope: code
-    submission_type: python
 
 scenarios:
   - config:

promptfoo/tests/deepseek_v3_tests/deepseek_v3_text_tests.yaml

Lines changed: 0 additions & 1 deletion

@@ -5,7 +5,6 @@ defaultTest:
   vars:
     model: deepSeek-v3
     scope: text
-    submission_type: pdf
 
 scenarios:
   - config:
