Commit 900ae0e

David committed
add devcontainer confg for codespaces
1 parent a94d3d7 commit 900ae0e

5 files changed: 34 additions and 4 deletions

.devcontainer/devcontainer.json

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+// For format details, see https://aka.ms/devcontainer.json. For config options, see the
+// README at: https://github.com/devcontainers/templates/tree/main/src/debian
+{
+    "name": "Debian",
+    // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
+    "image": "mcr.microsoft.com/devcontainers/base:bullseye",
+    "features": {
+        "ghcr.io/devcontainers-contrib/features/poetry:2": {}
+    },
+
+    // Features to add to the dev container. More info: https://containers.dev/features.
+    // "features": {},
+
+    // Use 'forwardPorts' to make a list of ports inside the container available locally.
+    // "forwardPorts": [],
+
+    // Configure tool-specific properties.
+    // "customizations": {},
+
+    // Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
+    // "remoteUser": "root"
+
+    "hostRequirements": {
+        "cpus": 2,
+        "memory": "2gb",
+        "storage": "4gb"
+    },
+    "postCreateCommand": "poetry install"
+}

README.md

Lines changed: 2 additions & 0 deletions
@@ -1,6 +1,7 @@
 
 ![checks status](https://github.com/dcaribou/transfermarkt-scraper/workflows/Scrapy%20Contracts%20Checks/badge.svg)
 ![docker build status](https://github.com/dcaribou/transfermarkt-scraper/workflows/Dockerhub%20Image/badge.svg)
+
 # transfermarkt-scraper
 
 A web scraper for collecting data from [Transfermarkt](https://www.transfermarkt.co.uk/) website. It recurses into the Transfermarkt hierarchy to find
@@ -16,6 +17,7 @@ A web scraper for collecting data from [Transfermarkt](https://www.transfermarkt
 Each one of these entities can be discovered and refreshed separately by invoking the corresponding crawler.
 
 ## Installation
+> [![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/dcaribou/transfermarkt-scraper/tree/main?quickstart=1)
 
 This is a [scrapy](https://scrapy.org/) project, so it needs to be run with the
 `scrapy` command line util. This and all other required dependencies can be installed using [poetry](https://python-poetry.org/docs/).

tfmkt/spiders/common.py

Lines changed: 1 addition & 2 deletions
@@ -63,7 +63,7 @@ def __init__(self, base_url=None, parents=None, season=None):
         if season:
             self.season = season
         else:
-            self.season = 2022
+            self.season = 2024
 
         self.entrypoints = parents
 

@@ -84,7 +84,6 @@ def start_requests(self):
             item['seasoned_href'] = self.seasonize_entrypoin_href(item)
             applicable_items.append(item)
 
-
         return [
             Request(
                 item['seasoned_href'],
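
The first hunk above bumps a hard-coded default season from 2022 to 2024. As a rough sketch of an alternative, and not something this commit does, the default could be derived from the current date. The helper name `default_season` and the July rollover month are assumptions made for illustration; they do not appear in the repository.

```python
from datetime import date
from typing import Optional

# Assumption: a season is labelled by the year in which it starts,
# and a new season begins in July. Adjust if the site uses another cutoff.
SEASON_START_MONTH = 7


def default_season(today: Optional[date] = None) -> int:
    """Return the start year of the season that contains `today`."""
    today = today or date.today()
    if today.month >= SEASON_START_MONTH:
        return today.year
    return today.year - 1


# Fixed dates keep the example deterministic.
print(default_season(date(2024, 8, 1)))
print(default_season(date(2025, 3, 1)))
```

With such a helper, the `else` branch could read `self.season = default_season()`, at the cost of making the default depend on when the crawler runs.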

tfmkt/spiders/competitions.py

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ def parse(self, response, parent):
 
             total_value = row.css('td:nth-of-type(8)::text').get()
 
-            matches = re.search('([0-9]+)\.png', country_image_url, re.IGNORECASE)
+            matches = re.search(r'([0-9]+)\.png', country_image_url, re.IGNORECASE)
             country_id = matches.group(1)
 
             href = "/wettbewerbe/national/wettbewerbe/" + country_id
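
The hunk above only adds the `r` prefix, and `matches.group(1)` still raises `AttributeError` when the URL contains no numeric `.png` name. A minimal sketch of the same extraction with a guard for the no-match case follows. The function name `extract_country_id` and the sample URLs are hypothetical and are not part of the repository.

```python
import re
from typing import Optional

# Same pattern as in the diff; the r prefix keeps the backslash literal.
FLAG_ID_PATTERN = re.compile(r'([0-9]+)\.png', re.IGNORECASE)


def extract_country_id(country_image_url: str) -> Optional[str]:
    """Return the numeric id embedded in a flag image URL, or None if absent."""
    matches = FLAG_ID_PATTERN.search(country_image_url)
    return matches.group(1) if matches else None


# Hypothetical URLs, for illustration only.
print(extract_country_id("https://example.invalid/flags/189.PNG"))
print(extract_country_id("https://example.invalid/flags/unknown.svg"))
```

Returning `None` lets the caller decide whether to skip the row or log it, instead of failing the whole parse.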

tfmkt/spiders/players.py

Lines changed: 1 addition & 1 deletion
@@ -121,7 +121,7 @@ def parse_market_history(self, response: Response):
         """
         Parse player's market history from the graph
         """
-        pattern = re.compile('\'data\'\:.*\}\}]')
+        pattern = re.compile(r'\'data\'\:.*\}\}]')
 
         try:
             parsed_script = json.loads(
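
The change to a raw string presumably targets the invalid escape sequence warning that Python emits for `\:` and `\}` in an ordinary string literal. The sketch below compiles both literals from source text and checks that they warn differently yet match the same text. The helper `compile_literal` and the sample script fragment are hypothetical, made up for illustration.

```python
import re
import warnings


def compile_literal(source: str):
    """Evaluate a string literal given as source text.

    Returns the resulting string and whether Python warned about it.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        value = eval(compile(source, "<demo>", "eval"))
    warned = any(
        issubclass(w.category, (DeprecationWarning, SyntaxWarning))
        for w in caught
    )
    return value, warned


# Source text of the old and new literals from the diff.
old_value, old_warned = compile_literal(r"'\'data\'\:.*\}\}]'")
new_value, new_warned = compile_literal(r"r'\'data\'\:.*\}\}]'")

# Hypothetical script fragment, shaped only to exercise the pattern.
sample = "'series':[{'data':[{'y':1000,'marker':{'symbol':'circle'}}]}]"

old_match = re.search(old_value, sample).group(0)
new_match = re.search(new_value, sample).group(0)

print(old_warned, new_warned)
print(old_match == new_match)
```

The two strings are not identical, since the raw form keeps the backslash before each quote, but the regular expression engine treats `\'` as a plain quote, so both patterns match the same span.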
