
Conversation

@tareknaser
Collaborator

@tareknaser tareknaser commented Nov 11, 2025

Adding some GitHub repo infrastructure and doc updates:

  • Add ArtEval to the README and link the benchmark directories
  • Add a CI workflow for benchmark tests (pytest, run in parallel)
  • Add GitHub issue/PR templates
  • Add a test for example_bench
  • Make scripts executable

Each change is in a separate commit. If any change is not needed right now, let me know and I can remove that commit and keep the others.

@tareknaser tareknaser marked this pull request as draft November 11, 2025 16:08
@tareknaser
Collaborator Author

Added a CI workflow for running benchmark tests. It took some debugging: I first wrote the workflow to run tests for each benchmark in parallel using a matrix, but the existing test_sdk.py files were empty and broke the CI. They were also importing from the wrong directory and testing the wrong Evaluator class, so I cleaned that up and removed the broken tests.

I also added a lightweight test for example_bench. It’s pretty minimal and doesn’t check much; it’s just meant to show how to structure tests for benchmarks.
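A minimal test along these lines might look like the sketch below. Note that `ExampleEvaluator` and its `evaluate` method are stand-in names for illustration, not the repo’s actual API; swap in the real import from example_bench:

```python
# test_example_bench.py -- a minimal smoke-test sketch.
# ExampleEvaluator is a stand-in for the benchmark's real evaluator class.

class ExampleEvaluator:
    """Stand-in evaluator: scores a prediction against a reference."""

    def evaluate(self, prediction: str, reference: str) -> float:
        # Trivial exact-match scoring, just to make the sketch runnable.
        return 1.0 if prediction == reference else 0.0


def test_evaluator_returns_score_in_range():
    evaluator = ExampleEvaluator()
    score = evaluator.evaluate("hello", "hello")
    # A benchmark score should be a float in [0, 1].
    assert isinstance(score, float)
    assert 0.0 <= score <= 1.0


def test_evaluator_distinguishes_match_from_mismatch():
    evaluator = ExampleEvaluator()
    assert evaluator.evaluate("a", "a") == 1.0
    assert evaluator.evaluate("a", "b") == 0.0
```

Even smoke tests this small are enough to catch the import and wrong-Evaluator problems mentioned above, since pytest fails at collection time if the module can’t be imported.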

For now, the CI only runs tests for example_bench since that’s the only one with actual tests. The other benchmarks are commented out in the matrix with a TODO note. Once we add tests for them, they’ll automatically run in parallel because the matrix setup is already in place.
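For reference, a matrix setup of this shape might look like the following sketch. The workflow path, benchmark directory names, and Python version here are assumptions, not the PR’s exact workflow:

```yaml
# .github/workflows/benchmark-tests.yml (hypothetical path)
name: Benchmark tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false  # keep running other benchmarks if one fails
      matrix:
        benchmark:
          - example_bench
          # TODO: enable once these benchmarks have tests
          # - other_bench
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install dependencies
        run: pip install pytest
      - name: Run tests for ${{ matrix.benchmark }}
        run: pytest benchmarks/${{ matrix.benchmark }}/tests
```

Each matrix entry becomes its own parallel job, so uncommenting a benchmark is all it takes to add it to CI.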

@tareknaser tareknaser marked this pull request as ready for review November 11, 2025 16:32
@xuafeng
Collaborator

xuafeng commented Nov 11, 2025


Thanks a lot, Tarek, this is very helpful. I think we can require contributors to add tests later.

@xuafeng xuafeng closed this Nov 11, 2025
@xuafeng xuafeng reopened this Nov 11, 2025
@xuafeng xuafeng requested review from guozhongxin and xuafeng and removed request for xuafeng November 11, 2025 21:44
@xuafeng
Collaborator

xuafeng commented Nov 11, 2025

@guozhongxin I already review this PR and all changes make sense to me. Can you please also take a look? Thanks.

@guozhongxin
Collaborator


Reviewed; approved with one comment.

@tareknaser tareknaser merged commit ea9b54d into main Nov 13, 2025
1 check passed
@tareknaser tareknaser deleted the readme_arteval branch November 13, 2025 19:59
