Subject: Question on Experimental Data Collection in Table 2 (Win Rate across Map Scenarios)

Dear Authors,

First, thank you for sharing this excellent paper and the accompanying code. I am currently working on reproducing the experiments and have some detailed questions regarding the results in **Table 2** (comparing Win Rates across different Map Scenarios).

To ensure I am correctly replicating the experimental setup, I would be grateful if you could clarify the following details about the evaluation protocol:

1.  **Evaluation Scripts & Repetition:** Which evaluation scripts were used to generate the results in Table 2, and how many times was each script run per scenario? (For example, was each map scenario evaluated by running `multiprocess_run_env.py` 10 times and then aggregating the results?)

2.  **Win Rate Calculation:** What is the precise formula used to calculate the Win Rate? For instance, is it:
    *   `Win Rate = (Number of Wins / Total Number of Episodes) * 100%`
    *   Or does it involve a different approach, such as calculating the rate against multiple specific opponents or under certain conditions?
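
To make both questions concrete, here is a minimal sketch of the first interpretation combined with aggregation over repeated runs. The function names and the list-of-booleans outcome format are hypothetical and not taken from the repository. It also shows that averaging per-run win rates and pooling all episodes can give different numbers when runs contain different numbers of episodes, which is part of what I would like to confirm.

```python
from statistics import mean, stdev


def win_rate(outcomes):
    """Win rate in percent for one run; `outcomes` is a list of booleans (True = win)."""
    if not outcomes:
        raise ValueError("cannot compute a win rate over zero episodes")
    return 100.0 * sum(outcomes) / len(outcomes)


def aggregate_runs(runs):
    """Mean and sample standard deviation of the per-run win rates."""
    rates = [win_rate(run) for run in runs]
    spread = stdev(rates) if len(rates) > 1 else 0.0
    return mean(rates), spread


def pooled_win_rate(runs):
    """Win rate over all episodes from all runs, treated as one pool."""
    return win_rate([outcome for run in runs for outcome in run])


if __name__ == "__main__":
    # Made-up outcomes for illustration only, not results from the paper.
    runs = [[True, True], [False, False, False, True]]
    avg, spread = aggregate_runs(runs)
    print(f"mean of per-run win rates: {avg:.1f}% (std {spread:.1f})")
    print(f"pooled win rate: {pooled_win_rate(runs):.1f}%")
```

If Table 2 reports one of these two aggregates (or something else, such as the best run), knowing which one would help me compare numbers directly.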

I am using the Qwen model in my reproduction efforts and want to make sure my methodology aligns with the paper's. These specifics would greatly help me in matching the experimental conditions and understanding the results.

Thank you for your time and consideration.

Best regards,
Zhicheng LI
