prosperity-solutions/vermittler-scraper

Vermittler-Scraper

Scrape broker registration data from the IHK Vermittlerregister by registration number.

What it does

Takes an IHK broker registration number (e.g. D-21RP-R1O5O-37) and returns structured broker details as JSON. The site uses Friendly Captcha (proof-of-work based), which this scraper bypasses automatically.

Two approaches

| Script | Method | Speed | Dependencies |
|---|---|---|---|
| scrape.py | Direct PDF endpoint fetch → parse PDF | Fast (~2s) | requests, pypdf, scrapling, playwright |
| scrape.js | Stealth Playwright browser + auto captcha solve | Slower (~10s) | playwright-extra, puppeteer-extra-plugin-stealth |

The Python script (scrape.py) tries the direct PDF download first. If that's captcha-blocked, it falls back to browser-based extraction via Scrapling/Playwright.
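The try-then-fall-back control flow described above can be sketched roughly as follows. This is only an illustration of the pattern, not the code in scrape.py: the `CaptchaBlocked` exception and the two fetcher callables are hypothetical names introduced here, and the real script may detect a blocked request differently.

```python
from typing import Callable


class CaptchaBlocked(Exception):
    """Hypothetical signal that the fast path hit a captcha challenge."""


def scrape_with_fallback(
    reg_number: str,
    fetch_direct: Callable[[str], dict],
    fetch_browser: Callable[[str], dict],
) -> dict:
    """Try the fast path first; use the slower path only if it is blocked.

    Both fetchers are placeholders supplied by the caller. Each takes a
    registration number and returns the broker details as a dict.
    """
    try:
        return fetch_direct(reg_number)
    except CaptchaBlocked:
        # Only a captcha block triggers the fallback; other errors propagate.
        return fetch_browser(reg_number)
```

Passing the two strategies in as callables keeps the fallback logic independent of how each one talks to the site, which also makes it easy to exercise with stubs.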

Usage

Python (recommended)

pip install -r requirements.txt
playwright install chromium

python scrape.py D-21RP-R1O5O-37

Node.js (fallback)

npm install
npx playwright install chromium

node scrape.js D-21RP-R1O5O-37

Both output JSON to stdout.
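Because the result is written to stdout, another program can call the scraper as a subprocess and parse the output. The sketch below is one way to do that from Python. The `run_scraper` helper is a hypothetical name, and it assumes that stdout contains nothing but the JSON document and that a failed run exits with a non-zero status. The fields of the returned JSON are not listed here, so the example does not assume any.

```python
import json
import subprocess
import sys
from typing import Sequence


def run_scraper(
    reg_number: str,
    command: Sequence[str] = (sys.executable, "scrape.py"),
) -> dict:
    """Run the scraper for one registration number and parse its JSON output.

    `command` is the program to invoke without the registration number,
    e.g. ("node", "scrape.js") for the Node.js variant.
    """
    result = subprocess.run(
        [*command, reg_number],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode output to str
        check=True,           # raise CalledProcessError on non-zero exit
    )
    return json.loads(result.stdout)
```

For example, `run_scraper("D-21RP-R1O5O-37")` would run the Python script from the current directory, and `run_scraper("D-21RP-R1O5O-37", command=("node", "scrape.js"))` would run the Node.js one.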

License

ISC
