llm-order-eval is a benchmark study comparing 11 large language models (LLMs) on parsing natural-language shopping-cart commands into structured JSON. The evaluation explores how prompt design and model architecture influence performance on structured information extraction.
The goal is to evaluate LLMs on their ability to convert free-text shopping-cart instructions into a structured JSON object with three fields:
```
{
  "action": "add" | "remove",
  "product": "string",
  "quantity": integer (default = 1)
}
```
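
For illustration, the sketch below shows how a model response might be parsed and validated against this schema. It is a minimal Python example under assumed conventions; the function name `parse_cart_command` and its error handling are hypothetical and not part of llm-order-eval itself.

```python
import json

VALID_ACTIONS = {"add", "remove"}

def parse_cart_command(raw: str) -> dict:
    """Parse a model's raw JSON output and validate it against the cart-command schema.

    Hypothetical helper for illustration only.
    """
    obj = json.loads(raw)  # raises json.JSONDecodeError on malformed output

    action = obj.get("action")
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be 'add' or 'remove', got {action!r}")

    product = obj.get("product")
    if not isinstance(product, str) or not product:
        raise ValueError("product must be a non-empty string")

    quantity = obj.get("quantity", 1)  # schema default: 1
    # Exclude bool explicitly, since bool is a subclass of int in Python.
    if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity < 1:
        raise ValueError(f"quantity must be a positive integer, got {quantity!r}")

    return {"action": action, "product": product, "quantity": quantity}


if __name__ == "__main__":
    # e.g. the instruction "add two bananas" should yield this JSON
    print(parse_cart_command('{"action": "add", "product": "bananas", "quantity": 2}'))
```

For example, the instruction "add two bananas" should map to `{"action": "add", "product": "bananas", "quantity": 2}`, which the validator accepts; an output that omits `quantity` falls back to the schema default of 1.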