Commit e1d15ea

update readme
1 parent 2f33e20 commit e1d15ea

File tree

1 file changed

+62
-2
lines changed

README.md

Lines changed: 62 additions & 2 deletions
@@ -1,7 +1,67 @@
11
# CS Word Cloud
22

3-
TODO: Explain methods
3+
# Requirements
44

5-
TLDR: All this word to make this word cloud:
5+
* Python 3.x
6+
* numpy, wordcloud, and any other misc pip/conda packages
7+
* Golang
8+
* GNU make
9+
* GNU coreutils
10+
* Bash or Zsh
11+
12+
# How To Build
13+
14+
0. Install the prerequisites
15+
1. Set up your environment by setting the variables in `Makefile`. The main variables to set are `APIKEY`, `INITMATCH`, and `MATCH_COUNT`.
16+
17+
For example, before:
18+
```
19+
APIKEY?=TODO # Add server api key here from https://developers.faceit.com/
20+
INITMATCH?=TODO # Add any recent faceit match ID here
21+
MATCH_COUNT=1000 # Number of demos to download
22+
SHELL=/bin/bash # Need this just so I can use pipefail :/
23+
```
24+
25+
After:
26+
```
27+
APIKEY?=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx # Add server api key here from https://developers.faceit.com/
28+
INITMATCH?=1-a993a412-8987-4d11-a682-dbe2fae3a761 # Add any recent faceit match ID here
29+
MATCH_COUNT=5 # Let's only do 5 demos for a short test
30+
SHELL=/bin/zsh # Say I have a MacBook, so let's use zsh
31+
```
32+
33+
2. Run `make all`
34+
3. Get a cool word cloud like this:
635

736
![Drag Racing](rev0.png)
37+
38+
# How It Works
39+
40+
## Step 1: Traverse through FACEIT API for some matches
41+
42+
The basic traversal goes like this: given some initial match id, choose a random player in that match. Then choose a random match in their recent match history. And so on. This gave me a decently "random" sample of demos from a variety of regions and skill levels.
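
The traversal described above can be sketched roughly as follows. This is a minimal illustration, not the code in `scrapeGames.py`: the `get_players` and `get_recent_matches` callables are hypothetical stand-ins for whatever FACEIT API requests the real script makes, and skipping already-seen matches is an assumption.

```python
import random
from typing import Callable, List, Set


def random_walk_matches(
    init_match: str,
    match_count: int,
    get_players: Callable[[str], List[str]],         # hypothetical: match ID -> player IDs
    get_recent_matches: Callable[[str], List[str]],  # hypothetical: player ID -> recent match IDs
    rng: random.Random,
    max_steps: int = 10_000,
) -> List[str]:
    """Collect up to match_count distinct match IDs by hopping match -> player -> match."""
    seen: Set[str] = {init_match}
    collected = [init_match]
    current = init_match
    for _ in range(max_steps):
        if len(collected) >= match_count:
            break
        players = get_players(current)
        if not players:
            break  # dead end: no players to hop through
        player = rng.choice(players)
        history = get_recent_matches(player)
        if not history:
            continue  # stay on the same match and pick a player again
        current = rng.choice(history)
        if current not in seen:  # assumption: duplicates are skipped
            seen.add(current)
            collected.append(current)
    return collected
```

Passing the two lookups in as functions keeps the walk itself independent of the API, so it can be tried against a small in-memory graph before spending any API quota.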
43+
44+
## Step 2: Download the demos
45+
46+
I'm sure there are other ways to do this. However, I was able to download all 1000 demos in my dataset through these 4 URLs in `cdns.txt`:
47+
48+
```
49+
https://storage.googleapis.com/demos-us-central1.faceit-cdn.net
50+
https://storage.googleapis.com/demos-europe-west1.faceit-cdn.net
51+
https://storage.googleapis.com/demos-europe-west2.faceit-cdn.net
52+
https://storage.googleapis.com/demos-asia-southeast1.faceit-cdn.net
53+
```
54+
55+
The `download.sh` script already handles the demo requests automatically, provided `cdns.txt` is present.
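
One plausible way to use a list of CDN base URLs is to try each one in turn until a request succeeds. The sketch below shows that idea in Python. It is an assumption about the approach, not a description of what `download.sh` does, and `demo_path` is a hypothetical parameter for the part of the URL that identifies a demo.

```python
import urllib.request
from typing import Callable, Iterable, List, Optional, Tuple


def read_cdn_bases(lines: Iterable[str]) -> List[str]:
    """Return the non-empty lines of a cdns.txt-style file, without trailing slashes."""
    return [line.strip().rstrip("/") for line in lines if line.strip()]


def http_get(url: str, timeout: float = 30.0) -> Optional[bytes]:
    """Fetch a URL and return its body, or None on any network or HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except OSError:  # URLError, HTTPError, and timeouts all derive from OSError
        return None


def fetch_from_first_cdn(
    bases: List[str],
    demo_path: str,  # hypothetical: path of the demo relative to a CDN base
    fetch: Callable[[str], Optional[bytes]] = http_get,
) -> Optional[Tuple[str, bytes]]:
    """Try each CDN base in order and return (url, data) for the first that works."""
    for base in bases:
        url = f"{base}/{demo_path.lstrip('/')}"
        data = fetch(url)
        if data is not None:
            return url, data
    return None
```

With `cdns.txt` in the working directory, the bases could be loaded with `read_cdn_bases(open("cdns.txt"))`.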
56+
57+
## Step 3: Parse the words
58+
59+
All I had to do was write a small function in Go using the API provided by https://github.com/markus-wa/demoinfocs-golang. It dumps all the chat text to `stdout`. Then I just `cat` the outputs together for the word cloud generator.
60+
61+
## Step 4: Generate the word cloud
62+
63+
I mainly followed this example: https://github.com/amueller/word_cloud/blob/main/examples/masked.py. I made my own stencil with GIMP and played around with the parameters.
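
For reference, a masked word cloud in the style of that example looks roughly like the sketch below. The file names and the parameter values (`background_color`, `max_words`) are placeholders, not the settings used for `rev0.png`, and `combine_chat_dumps` is a hypothetical helper standing in for the `cat` step.

```python
from pathlib import Path
from typing import Iterable


def combine_chat_dumps(paths: Iterable[Path]) -> str:
    """Join several chat dump files into one text blob, separated by newlines."""
    return "\n".join(
        Path(p).read_text(encoding="utf-8", errors="replace") for p in paths
    )


def render_word_cloud(text: str, mask_path: str, out_path: str) -> None:
    """Draw a word cloud constrained to the shape of a stencil image."""
    # Imported here so the text helper above works without these packages.
    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud

    # Pure-white areas of the stencil are treated as masked out.
    mask = np.array(Image.open(mask_path))
    cloud = WordCloud(background_color="white", max_words=2000, mask=mask)
    cloud.generate(text)
    cloud.to_file(out_path)
```

A call such as `render_word_cloud(combine_chat_dumps(Path("chat").glob("*.txt")), "stencil.png", "cloud.png")` would tie the two together, assuming the chat dumps live in a `chat/` directory.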
64+
65+
# Using this work to download large collections of demos
66+
67+
The scripts in this repo may be of interest to those doing data science / statistics on CSGO games across the general population. Just use the `scrapeGames.py` and `download.sh` scripts, and you should be able to get pretty large datasets in no time. I was able to get 1000 demos, using 150GB, in only a handful of hours.
