This program fetches all car offers from otomoto.pl. To run it, install the dependencies and start the spider:
pip install -r requirements.txt
scrapy crawl -L WARNING otomoto -o otomoto.json
This generates an otomoto.json file with all cars that are currently available. You can then inspect the offers or run your own analysis on them.
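As a starting point for such analysis, here is a minimal sketch that loads the output file and counts how often each field appears. It assumes otomoto.json is the JSON array of objects that Scrapy writes for a .json feed; the helper names load_offers and field_coverage are illustrative and not part of this project, and the sketch makes no assumption about which fields the spider actually emits.

```python
import json
from collections import Counter
from pathlib import Path


def load_offers(path="otomoto.json"):
    """Load the scraped items (assumed to be a JSON array of objects)."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def field_coverage(offers):
    """Count how many items contain each field name."""
    counts = Counter()
    for offer in offers:
        counts.update(offer.keys())
    return counts
```

Calling field_coverage(load_offers()) after a crawl shows which fields are present and how complete they are, which is a useful check before any deeper analysis.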
This fork works around the pagination limit so that more offers can be downloaded.
You can build your own list of URLs with car models (start_urls in \otomoto\spiders\otomoto.py) as follows:
1. Make a GET request to the otomoto API: https://www.otomoto.pl/api/open/categories/
2. Find the part of the response that contains the models and copy it into any text editor (e.g. Notepad++).
3. Clean up the copied text with regex replacements:
   3.1. Replace pattern [.+(\s)+.+] with nothing
   3.2. Replace pattern :\{.*?\}, with \1,\n
   3.3. Replace pattern " with '
4. Using multi-line editing (ALT+SHIFT), add the base URL (https://www.otomoto.pl/osobowe/) before each model name.
5. Paste the resulting list into start_urls in \otomoto\spiders\otomoto.py and enjoy!
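The last two steps can also be scripted instead of done by hand in the editor. The sketch below assumes you already have the cleaned model names, one per entry; the helper names build_start_urls and format_start_urls and the example model names are hypothetical. Only the base URL comes from the steps above.

```python
BASE_URL = "https://www.otomoto.pl/osobowe/"


def build_start_urls(model_names, base_url=BASE_URL):
    """Prefix each model name with the base URL, skipping blanks and duplicates."""
    urls = []
    seen = set()
    for name in model_names:
        # Drop surrounding whitespace, quotes and trailing commas left over
        # from the regex clean-up.
        name = name.strip().strip("'\",")
        if not name or name in seen:
            continue
        seen.add(name)
        urls.append(base_url + name)
    return urls


def format_start_urls(urls):
    """Render the URLs as a Python list literal ready to paste into the spider."""
    lines = ["start_urls = ["]
    lines += [f"    '{url}'," for url in urls]
    lines.append("]")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical model names, for illustration only.
    names = ["'example-model-a',", "example-model-b", "", "example-model-a"]
    print(format_start_urls(build_start_urls(names)))
```

The printed output can be pasted over the existing start_urls assignment in the spider file.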