Commit 53f8b71

Switch scraper season to 2023 (#183)
1 parent 14c3abf commit 53f8b71

8 files changed: 14 additions and 18 deletions

.github/workflows/on-schedule.yml

Lines changed: 2 additions & 2 deletions
@@ -11,7 +11,7 @@ on:
 description: Commit message for the 'add-and-commit' step
 acquire_args:
 required: false
-default: "--asset all --seasons 2022"
+default: "--asset all --seasons 2023"
 description: Arguments to be passed to the acquiring script
 
 jobs:
@@ -50,7 +50,7 @@ jobs:
 run: |
 # trigger prep job in aws batch
 
-DEFAULT_ARGS="--asset all --seasons 2022"
+DEFAULT_ARGS="--asset all --seasons 2023"
 EFFECTIVE_ARGS=${ARGS:-${DEFAULT_ARGS}}
 
 DAY=$(date +"%Y-%m-%d")
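Reviewer note: the `EFFECTIVE_ARGS=${ARGS:-${DEFAULT_ARGS}}` line in the hunk above uses the shell's "use default if unset or empty" expansion, so the new 2023 default only applies when no `ARGS` value is supplied. A minimal Python sketch of the same fallback rule follows; the function name `effective_args` and its `environ` parameter are hypothetical and are not part of this repository.

```python
import os

# Mirrors the default introduced by this commit.
DEFAULT_ARGS = "--asset all --seasons 2023"


def effective_args(environ=None):
    """Return ARGS when it is set and non-empty, otherwise DEFAULT_ARGS.

    `environ` is a hypothetical injection point for testing; it defaults
    to the real process environment.
    """
    env = os.environ if environ is None else environ
    # `${ARGS:-...}` falls back for both unset and empty values,
    # which is what `or` does for a missing key or an empty string.
    return env.get("ARGS") or DEFAULT_ARGS
```

Under this rule an explicitly supplied `ARGS` always wins, so a caller that still passes `--seasons 2022` would not be affected by the changed default.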

Makefile

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@
 PLATFORM = linux/arm64
 BRANCH = $(shell git rev-parse --abbrev-ref HEAD)
 JOB_NAME = on-cli
-ARGS = --asset all --seasons 2022
+ARGS = --asset all --seasons 2023
 MESSAGE = some message
 TAG = dev
 
@@ -94,7 +94,7 @@ sync: ## run the sync process (refreshes data frontends)
 sync: MESSAGE = Manual sync
 sync:
 gunzip -r data/prep/*.csv.gz && \
-PYTHONPATH=$(PYTHONPATH):`pwd`/. python scripts/sync.py --message "$(MESSAGE)" --season 2022 && \
+PYTHONPATH=$(PYTHONPATH):`pwd`/. python scripts/sync.py --message "$(MESSAGE)" --season 2023 && \
 gzip -r data/prep/*.csv
 
 streamlit_local: ## run streamlit app locally

README.md

Lines changed: 1 addition & 1 deletion
@@ -119,7 +119,7 @@ path | description
 In the scope of this project, "acquiring" is the process of collecting "raw data", as it is produced by [trasfermarkt-scraper](https://github.com/dcaribou/transfermarkt-scraper). Acquired data lives in the `data/raw` folder and it can be created or updated for a particular season by running `make acquire_local`
 
 ```console
-make acquire_local ARGS="--asset all --season 2022"
+make acquire_local ARGS="--asset all --season 2023"
 ```
 This runs the scraper with a set of parameters and collects the output in `data/raw`.
 

config.yml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 defintions:
 source_path: data/raw
-seasons: [2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022]
+seasons: [2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023]
 console_handler: &console_handler
 class: logging.StreamHandler
 level: INFO

data/prep.dvc

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 outs:
-- md5: 469684632bc6badc7afbfd2a4233d14a.dir
-size: 44784882
+- md5: 23e4b9082d0397e81f47c404acf87cca.dir
+size: 44034795
 nfiles: 9
 path: prep

data/raw.dvc

Lines changed: 3 additions & 3 deletions
@@ -1,5 +1,5 @@
 outs:
-- md5: b2b8a732ee1b78e4713aadc343647e1c.dir
-size: 145761038
-nfiles: 44
+- md5: cb0235e51acaff8584c1948fab9a0b73.dir
+size: 163701852
+nfiles: 48
 path: raw

dbt/models/curated/models.yml

Lines changed: 2 additions & 6 deletions
@@ -57,7 +57,7 @@ models:
 - club_id
 - dbt_expectations.expect_table_row_count_to_be_between:
 min_value: 400
-max_value: 420
+max_value: 430
 columns:
 - name: club_code
 tests:
@@ -141,7 +141,7 @@ models:
 - name: market_value_in_eur
 tests:
 - too_many_missings:
-tolerance: 0.37
+tolerance: 0.38
 - name: contract_expiration_date
 tests:
 - too_many_missings:
@@ -167,10 +167,6 @@ models:
 - dbt_expectations.expect_table_row_count_to_be_between:
 min_value: 400000
 max_value: 440000
-columns:
-- name: market_value_in_eur
-tests:
-- not_null
 columns:
 - name: player_id
 tests:
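Reviewer note: the `too_many_missings` tolerance for `market_value_in_eur` moves from 0.37 to 0.38 in the second hunk above. The test's implementation is not part of this diff, so the following is only a sketch of the kind of check such a tolerance usually expresses: the share of missing values in a column must not exceed the threshold. The function names and the non-strict `<=` comparison are assumptions, not the project's actual logic.

```python
def missing_fraction(values):
    """Return the share of entries that are None (0.0 for an empty column)."""
    values = list(values)
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def within_tolerance(values, tolerance):
    """Assumed rule: pass when the missing share does not exceed `tolerance`."""
    return missing_fraction(values) <= tolerance


# Hypothetical column: 38 missing market values out of 100 rows.
column = [None] * 38 + [1_000_000] * 62
print(within_tolerance(column, 0.37))  # fails under the old tolerance
print(within_tolerance(column, 0.38))  # passes under the new tolerance
```

If the real test behaves this way, raising the tolerance by 0.01 suggests that adding the 2023 season pushed the missing share of `market_value_in_eur` slightly above 37%.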

streamlit/pages/04_📈_analysis:_manager_performance.py

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@
 # define the set of leagues to be used in the analysis
 
 DEFAULT_COMPETITIONS = ["GB1", "L1", "ES1", "IT1"]
-DEFAULT_SEASONS = [2021, 2022]
+DEFAULT_SEASONS = [2021, 2022, 2023]
 DEFAULT_MANAGERS = ["Jürgen Klopp", "Pep Guardiola"]
 DEFAULT_COMPETITION_TYPES = [
 "domestic_cup", "domestic_league", "international_cup"
