Scrape broker registration data from the IHK Vermittlerregister by registration number.
Takes an IHK broker registration number (e.g. D-21RP-R1O5O-37) and returns structured broker details as JSON. The site uses Friendly Captcha (PoW-based) — this scraper bypasses it automatically.
| Script | Method | Speed | Dependencies |
|---|---|---|---|
| scrape.py | Direct PDF endpoint fetch → parse PDF | Fast (~2s) | requests, pypdf, scrapling, playwright |
| scrape.js | Stealth Playwright browser + auto captcha solve | Slower (~10s) | playwright-extra, puppeteer-extra-plugin-stealth |

The Python script (scrape.py) tries the direct PDF download first. If that's captcha-blocked, it falls back to browser-based extraction via Scrapling/Playwright.
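
The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code in scrape.py: `fetch_with_fallback`, `looks_like_pdf`, and the two fetcher callables are hypothetical names, and detecting a captcha block by checking for a missing PDF signature is an assumption about how a blocked response might look.

```python
from typing import Any, Callable, Tuple

PDF_MAGIC = b"%PDF-"  # PDF files begin with this signature


def looks_like_pdf(payload: bytes) -> bool:
    """Return True if the payload starts with the PDF file signature."""
    return payload.startswith(PDF_MAGIC)


def fetch_with_fallback(
    reg_number: str,
    fetch_direct: Callable[[str], bytes],    # hypothetical: fast direct PDF download
    fetch_via_browser: Callable[[str], Any], # hypothetical: slower browser extraction
) -> Tuple[str, Any]:
    """Try the direct path first; use the browser path if it fails.

    Returns a (path_name, result) pair so the caller can see which
    path produced the result.
    """
    try:
        payload = fetch_direct(reg_number)
        if looks_like_pdf(payload):
            return "direct", payload
        # Assumption: a captcha-blocked response is an HTML page, not a PDF.
    except OSError:
        # Network-level failure; requests' exceptions derive from OSError.
        pass
    return "browser", fetch_via_browser(reg_number)


if __name__ == "__main__":
    # Stub fetchers stand in for the real network and browser calls.
    blocked = lambda _reg: b"<html>captcha challenge</html>"
    browser = lambda _reg: {"source": "browser"}
    print(fetch_with_fallback("D-21RP-R1O5O-37", blocked, browser))
```

Passing the two fetchers in as callables keeps the ordering logic testable without network access or a browser.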
Python:

    pip install -r requirements.txt
    playwright install chromium
    python scrape.py D-21RP-R1O5O-37

Node.js:

    npm install
    npx playwright install chromium
    node scrape.js D-21RP-R1O5O-37

Both output JSON to stdout.

License: ISC