[Exam - Lab] Pintos - PKU Labs and CS162 Exam #106
...exam_bench/data/raw/cs162_operating_systems_and_system_programming_summer_2022_final/exam.md
benchmarks/courselab_bench/data/pku_pintos/task_1_threads/config.json
xuafeng
left a comment
Overall, this looks great.
I left a few comments on exam.md.
Thanks a lot.
The lab is hard to review manually. How can I manually start the CI tests for it?
ee9800e to 2d43ddb
@tareknaser please merge it.
Tested against Claude models as a sanity check. Sonnet scored 0.8, and the failures are reasonable given the reference solution.
9c3787b to 654f380
This PR adds OS course material to the benchmark:
Reference solutions that pass the provided tests are also included for all labs.
I chose PKU’s Pintos version of the labs because it is fully open source and well documented. Similar courses that use Pintos for labs exist at Stanford, Berkeley, and JHU.