Subject: Question on Experimental Data Collection in Table 2 (Win Rate across Map Scenarios) #2

@lzc-ust

Dear Authors,

First, thank you for sharing this excellent paper and the accompanying code. I am currently working on reproducing the experiments and have some detailed questions regarding the results in Table 2 (comparing Win Rates across different Map Scenarios).

To ensure I am correctly replicating the experimental setup, I would be grateful if you could clarify the following details about the evaluation protocol:

  1. Evaluation Scripts & Repetition: Could you specify exactly which evaluation scripts were used to generate the results in Table 2? Also, how many runs were performed for each script and scenario? (For example, was each map scenario evaluated by running multiprocess_run_env.py 10 times and then aggregating the results?)

  2. Win Rate Calculation: What is the precise formula used to calculate the Win Rate? For instance, is it:

    • Win Rate = (Number of Wins / Total Number of Episodes) * 100%
    • Or does it involve a different approach, such as calculating the rate against multiple specific opponents or under certain conditions?
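
To make the question concrete, here is a minimal sketch of the simple formula above, together with two ways the results of repeated runs could be aggregated. It assumes each episode yields a boolean win flag. The function names and the sample data are hypothetical and are not taken from the repository or the paper.

```python
from statistics import mean


def win_rate(outcomes):
    """Percentage of episodes won; `outcomes` is a sequence of booleans."""
    if not outcomes:
        raise ValueError("no episodes recorded")
    return 100.0 * sum(outcomes) / len(outcomes)


def pooled_win_rate(runs):
    """Pool every episode from every run, then compute a single win rate."""
    return win_rate([won for run in runs for won in run])


def mean_of_run_win_rates(runs):
    """Compute one win rate per run, then average the per-run rates."""
    return mean(win_rate(run) for run in runs)


# Hypothetical data: two runs with different episode counts.
runs = [[True, False, True, True], [False, False]]
print(pooled_win_rate(runs))        # 3 wins out of 6 episodes
print(mean_of_run_win_rates(runs))  # average of the two per-run rates
```

The two aggregation schemes give different numbers whenever runs contain different episode counts, which is why I would like to confirm which one (if either) was used for Table 2.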

I am using the Qwen model in my reproduction efforts and want to make sure my methodology aligns with the paper's. These specifics would greatly help me in matching the experimental conditions and understanding the results.

Thank you for your time and consideration.

Best regards,
Zhicheng LI
