LiveSWEBench: New benchmark from creators of LiveBench #140
bogdan0083
started this conversation in
1. Feature requests
Replies: 1 comment 1 reply
-
That’s a great idea, @bogdan0083! I believe it's very important to run evaluations frequently. We’ll begin working on this once we feel more comfortable with our product and have fixed some stuff we're working on right now. What other evals do you think might be relevant for us? I'm thinking about Aider Polyglot too; I talked a bit about it in our blog. Let me know if you have any other eval ideas!
-
Basically it's a benchmark for code agents: https://liveswebench.ai/
Might be interesting to test Kilo Code on that one.