Cannot reproduce Lite using Agentless 1.5 with GPT-4o #68

Open
@yeliu918

Hi @brutalsavage,

Thanks for the amazing work and for open-sourcing the repo. It has been incredibly helpful for reproducing SWE-Bench results.
I recently ran Agentless experiments on SWE-Bench-Lite following your instructions, using all four localization files.
Here are my results:

All instances run.
Cleaning cached images...
Removed 0 images.
Total instances: 300
Instances submitted: 300
Instances completed: 297
Instances incomplete: 0
Instances resolved: 82
Instances unresolved: 215
Instances with empty patches: 3
Instances with errors: 0
Unstopped containers: 0
Unremoved images: 300

I got 82/300 resolved, which is lower than the 96/300 reported in the paper. I'm wondering:

  1. What's the key difference between Agentless 1.5 and Agentless 1.0?

  2. Does Agentless 1.5 use gpt-4o-2024-05-13, or a more recent version of GPT-4o?
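To put the gap in concrete terms, here is a small sketch that recomputes the resolve rates from the summary above. The counts are copied from my run and from the 96/300 figure I quoted from the paper; the variable and function names are only illustrative and are not part of Agentless or the SWE-Bench harness.

```python
# Counts copied from the evaluation summary in this issue.
TOTAL_INSTANCES = 300
COMPLETED = 297
RESOLVED = 82
UNRESOLVED = 215
EMPTY_PATCHES = 3

# Figure quoted from the paper in this issue (not re-verified here).
PAPER_RESOLVED = 96


def resolve_rate(resolved: int, total: int) -> float:
    """Return the resolve rate as a percentage of all instances."""
    return 100.0 * resolved / total


# Sanity checks: the summary should account for every instance.
assert RESOLVED + UNRESOLVED == COMPLETED
assert COMPLETED + EMPTY_PATCHES == TOTAL_INSTANCES

my_rate = resolve_rate(RESOLVED, TOTAL_INSTANCES)
paper_rate = resolve_rate(PAPER_RESOLVED, TOTAL_INSTANCES)
gap_instances = PAPER_RESOLVED - RESOLVED
gap_points = paper_rate - my_rate

print(f"My run:    {RESOLVED}/{TOTAL_INSTANCES} = {my_rate:.1f}%")
print(f"Paper:     {PAPER_RESOLVED}/{TOTAL_INSTANCES} = {paper_rate:.1f}%")
print(f"Shortfall: {gap_instances} instances ({gap_points:.1f} percentage points)")
```

The 3 empty patches can account for at most 3 of the missing instances, so they do not explain the shortfall on their own.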

Thanks again for the great work!
