Commit d750f4a

Added mmlu accuracy test command format (#293)
* Added mmlu accuracy test command format

Signed-off-by: Victoria Godsoe <victoria.godsoe@amd.com>

* Update docs/lemonade/mmlu_accuracy.md

Co-authored-by: Jeremy Fowers <80718789+jeremyfowers@users.noreply.github.com>
Signed-off-by: Victoria Godsoe <victoria.godsoe@amd.com>

* Update docs/lemonade/mmlu_accuracy.md

Co-authored-by: Jeremy Fowers <80718789+jeremyfowers@users.noreply.github.com>
Signed-off-by: Victoria Godsoe <victoria.godsoe@amd.com>

---------

Signed-off-by: Victoria Godsoe <victoria.godsoe@amd.com>
Co-authored-by: Jeremy Fowers <80718789+jeremyfowers@users.noreply.github.com>

1 parent adbc520 commit d750f4a

docs/lemonade/mmlu_accuracy.md

Lines changed: 61 additions & 59 deletions
@@ -39,62 +39,64 @@ The model is expected to generate an answer to the test question based on the co

 ## Detailed list of subjects/ categories tested

-| Test Subject | Category |
-|----------------------------------|-------------------|
-| Abstract Algebra | Math |
-| Anatomy | Health |
-| Astronomy | Physics |
-| Business Ethics | Business |
-| Clinical Knowledge | Health |
-| College Biology | Biology |
-| College Chemistry | Chemistry |
-| College Computer Science | Computer Science |
-| College Mathematics | Math |
-| College Medicine | Health |
-| College Physics | Physics |
-| Computer Security | Computer Science |
-| Conceptual Physics | Physics |
-| Econometrics | Economics |
-| Electrical Engineering | Engineering |
-| Elementary Mathematics | Math |
-| Formal Logic | Philosophy |
-| Global Facts | Other |
-| High School Biology | Biology |
-| High School Chemistry | Chemistry |
-| High School Computer Science | Computer Science |
-| High School European History | History |
-| High School Geography | Geography |
-| High School Government and Politics | Politics |
-| High School Macroeconomics | Economics |
-| High School Mathematics | Math |
-| High School Microeconomics | Economics |
-| High School Physics | Physics |
-| High School Psychology | Psychology |
-| High School Statistics | Math |
-| High School US History | History |
-| High School World History | History |
-| Human Aging | Health |
-| Human Sexuality | Culture |
-| International Law | Law |
-| Jurisprudence | Law |
-| Logical Fallacies | Philosophy |
-| Machine Learning | Computer Science |
-| Management | Business |
-| Marketing | Business |
-| Medical Genetics | Health |
-| Miscellaneous | Other |
-| Moral Disputes | Philosophy |
-| Moral Scenarios | Philosophy |
-| Nutrition | Health |
-| Philosophy | Philosophy |
-| Prehistory | History |
-| Professional Accounting | Other |
-| Professional Law | Law |
-| Professional Medicine | Health |
-| Professional Psychology | Psychology |
-| Public Relations | Politics |
-| Security Studies | Politics |
-| Sociology | Culture |
-| US Foreign Policy | Politics |
-| Virology | Health |
-| World Religions | Philosophy |
+Use the syntax provided in the table to run that test subject with the `accuracy-mmlu` tool. For example, to run the "Abstract Algebra" subject, use `accuracy-mmlu --tests abstract_algebra`.
+
| Test Subject | Category | `--tests` syntax |
45+
|-------------------------------------|-------------------|-------------------------------------|
46+
| Abstract Algebra | Math | abstract_algebra |
47+
| Anatomy | Health | anatomy |
48+
| Astronomy | Physics | astronomy |
49+
| Business Ethics | Business | business_ethics |
50+
| Clinical Knowledge | Health | clinical_knowledge |
51+
| College Biology | Biology | college_biology |
52+
| College Chemistry | Chemistry | college_chemistry |
53+
| College Computer Science | Computer Science | college_computer_science |
54+
| College Mathematics | Math | college_mathematics |
55+
| College Medicine | Health | college_medicine |
56+
| College Physics | Physics | college_physics |
57+
| Computer Security | Computer Science | computer_security |
58+
| Conceptual Physics | Physics | conceptual_physics |
59+
| Econometrics | Economics | econometrics |
60+
| Electrical Engineering | Engineering | electrical_engineering |
61+
| Elementary Mathematics | Math | elementary_mathematics |
62+
| Formal Logic | Philosophy | formal_logic |
63+
| Global Facts | Other | global_facts |
64+
| High School Biology | Biology | high_school_biology |
65+
| High School Chemistry | Chemistry | high_school_chemistry |
66+
| High School Computer Science | Computer Science | high_school_computer_science |
67+
| High School European History | History | high_school_european_history |
68+
| High School Geography | Geography | high_school_geography |
69+
| High School Government and Politics | Politics | high_school_government_and_politics |
70+
| High School Macroeconomics | Economics | high_school_macroeconomics |
71+
| High School Mathematics | Math | high_school_mathematics |
72+
| High School Microeconomics | Economics | high_school_microeconomics |
73+
| High School Physics | Physics | high_school_physics |
74+
| High School Psychology | Psychology | high_school_psychology |
75+
| High School Statistics | Math | high_school_statistics |
76+
| High School US History | History | high_school_us_history |
77+
| High School World History | History | high_school_world_history |
78+
| Human Aging | Health | human_aging |
79+
| Human Sexuality | Culture | human_sexuality |
80+
| International Law | Law | international_law |
81+
| Jurisprudence | Law | jurisprudence |
82+
| Logical Fallacies | Philosophy | logical_fallacies |
83+
| Machine Learning | Computer Science | machine_learning |
84+
| Management | Business | management |
85+
| Marketing | Business | marketing |
86+
| Medical Genetics | Health | medical_genetics |
87+
| Miscellaneous | Other | miscellaneous |
88+
| Moral Disputes | Philosophy | moral_disputes |
89+
| Moral Scenarios | Philosophy | moral_scenarios |
90+
| Nutrition | Health | nutrition |
91+
| Philosophy | Philosophy | philosophy |
92+
| Prehistory | History | prehistory |
93+
| Professional Accounting | Other | professional_accounting |
94+
| Professional Law | Law | professional_law |
95+
| Professional Medicine | Health | professional_medicine |
96+
| Professional Psychology | Psychology | professional_psychology |
97+
| Public Relations | Politics | public_relations |
98+
| Security Studies | Politics | security_studies |
99+
| Sociology | Culture | sociology |
100+
| US Foreign Policy | Politics | us_foreign_policy |
101+
| Virology | Health | virology |
102+
| World Religions | Philosophy | world_religions |
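
Every `--tests` value in the added column appears to be the subject's display name lowercased, with spaces replaced by underscores. The sketch below illustrates that observed pattern only; it is not part of the commit. The helper names `to_tests_arg` and `accuracy_mmlu_args` are hypothetical, and the code builds just the `accuracy-mmlu --tests <value>` fragment quoted in the doc, not a full command line.

```python
def to_tests_arg(subject: str) -> str:
    """Convert a display name such as "Abstract Algebra" to its --tests value.

    Assumption: the value is the lowercased name with words joined by
    underscores, which matches every row of the table in this diff.
    """
    return "_".join(subject.lower().split())


def accuracy_mmlu_args(subject: str) -> list[str]:
    """Build the `accuracy-mmlu --tests <value>` fragment for one subject.

    Hypothetical helper: it only assembles the tokens shown in the doc.
    """
    return ["accuracy-mmlu", "--tests", to_tests_arg(subject)]


if __name__ == "__main__":
    # Reproduces the example given in the added doc line.
    print(" ".join(accuracy_mmlu_args("Abstract Algebra")))
```

The table remains the authoritative list of accepted values; a derived name should be checked against it before use.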
