Merged
Logging more to wandb:
Other:
Experiments in Hypercloud
I used an 8xA6000 machine to see how it performs on the base B5W3 experiment.

I ran it with different numbers of workers and parallel games and got this:
The runs are:
The one with 240 games took more than 3 minutes to start completing games, and its slope is about the same as that of the 120-worker run, so there seems to be no advantage in using that many workers. I can see that the GPUs are at 99%, so that's the bottleneck.
The best-performing run is the one with 120 workers and 16 parallel games, but it is only slightly better than the one with 60 workers. I'll try increasing the number of parallel games anyway. In any case, I'd expect the best setup to depend heavily on the board size, mcts_n, network config, etc., so I just want a rough idea; I don't think these numbers would translate to other runs.
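To compare the slopes without the start-up delay getting in the way, something like the following could work. This is only a rough sketch and not code from this PR: the function names `steady_state` and `completion_rate` are hypothetical, and the sample numbers are synthetic, not the measurements from the runs above. It drops the samples taken before the first game completes and then fits a least-squares line to games completed versus elapsed time.

```python
from typing import List, Sequence, Tuple


def steady_state(
    elapsed_s: Sequence[float], games_done: Sequence[int]
) -> Tuple[List[float], List[int]]:
    """Keep only the samples from the first one where a game has completed."""
    for i, done in enumerate(games_done):
        if done > 0:
            return list(elapsed_s[i:]), list(games_done[i:])
    return [], []


def completion_rate(elapsed_s: Sequence[float], games_done: Sequence[int]) -> float:
    """Least-squares slope of games completed vs. elapsed seconds (games/s)."""
    n = len(elapsed_s)
    if n != len(games_done) or n < 2:
        raise ValueError("need at least two matching samples")
    mean_t = sum(elapsed_s) / n
    mean_g = sum(games_done) / n
    var_t = sum((t - mean_t) ** 2 for t in elapsed_s)
    if var_t == 0:
        raise ValueError("all samples have the same timestamp")
    cov = sum((t - mean_t) * (g - mean_g) for t, g in zip(elapsed_s, games_done))
    return cov / var_t


if __name__ == "__main__":
    # Synthetic samples (made up for illustration): run B starts completing
    # games later than run A but finishes them at the same rate afterwards.
    run_a = ([0, 60, 120, 180], [0, 20, 40, 60])
    run_b = ([0, 60, 120, 180, 240, 300], [0, 0, 0, 0, 20, 40])
    for name, (t, g) in (("A", run_a), ("B", run_b)):
        rate = completion_rate(*steady_state(t, g))
        print(f"run {name}: {rate * 60:.1f} games/min")
```

Fitting only the steady-state part keeps a long warm-up from making a run look slower than it is once games start completing.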
For the longer run, I can see the P2 wins:

So, it's winning 100% of the time, even against simple with a large branching factor and depth 6.