Qian-Cheng-nju (Collaborator) commented Nov 14, 2025

Summary

Add the SysMoBench benchmark, which evaluates AI models' ability to generate correct TLA+ formal specifications for real-world concurrent and distributed systems.

Qian-Cheng-nju changed the title from "Sysmobench" to "Integrate SysMoBench benchmark via Git Subtree" on Nov 14, 2025
xuafeng requested review from tareknaser and tianyin, then removed the request for tareknaser, on November 14, 2025 22:26
xuafeng (Collaborator) left a comment

Thanks a lot for your great contribution to this project. The integration is great. I left a few suggestions.

### Contribute to existing Benchmarks
The easiest way to contribute is to add more tasks to existing benchmarks. For example, you can add more questions to the course exam benchmark, more projects to the course project benchmark, or more system algorithm design problems to the algorithm design benchmark. Please follow the existing format and structure when adding new tasks. You can also improve existing benchmarks by adding more advanced evaluators with improved metrics.

### Creating New Benchmarks
Collaborator

As discussed on Slack, could you please add a section called "### Porting your benchmark", using SysMoBench as an example? Thanks a lot.

Collaborator Author

I’ve written a draft of the porting benchmark guide. When you have time, I would greatly appreciate your review. Please feel free to let me know if anything should be improved. Thank you very much! See commit 72b6916

xuafeng (Collaborator) commented Nov 14, 2025

Resolves #7

xuafeng merged commit faa1ff3 into main on Nov 17, 2025
1 check passed
xuafeng deleted the sysmobench branch on November 17, 2025 18:10
Couen pushed a commit to Couen/system-intelligence-benchmark that referenced this pull request Jan 22, 2026
…ench

Integrate SysMoBench benchmark via Git Subtree
tareknaser pushed a commit that referenced this pull request Feb 5, 2026
Integrate SysMoBench benchmark via Git Subtree