https://www.vermittlerregister.info/recherche?a=pdf&registernummer=D-21RP-R1O5O-37
Build a Python script that:
- Accepts an IHK registernummer (e.g. D-21RP-R1O5O-37) as input
- Returns broker details from the Vermittlerregister
- Works consistently and fast (seconds, not minutes)
- Does NOT require manual captcha solving
The site uses Friendly Captcha (https://friendlycaptcha.com). This is a proof-of-work (PoW) captcha, not an image-based one, so:
- It's a puzzle solved client-side in JS before form submit
- The server validates a puzzle solution token submitted with the form
The URL has an a=pdf parameter. Test whether a direct HTTP GET or POST returns data without captcha validation.
Try:
curl -L "https://www.vermittlerregister.info/recherche?a=pdf&registernummer=D-21RP-R1O5O-37" -o test.pdf
Also try POST:
curl -X POST "https://www.vermittlerregister.info/recherche" -d "a=pdf&registernummer=D-21RP-R1O5O-37" -o test2.pdf
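To judge the curl tests, it helps to build the query string in code and to check whether the saved file is really a PDF or just an HTML page (for example a captcha form). A minimal sketch, assuming the a and registernummer parameter names from the URL above; build_url and looks_like_pdf are hypothetical helper names, not part of any existing tool:

```python
from urllib.parse import urlencode

# Endpoint taken from the URL in this brief.
BASE_URL = "https://www.vermittlerregister.info/recherche"


def build_url(registernummer: str) -> str:
    """Build the a=pdf lookup URL; urlencode joins the parameters with a literal '&'."""
    query = urlencode({"a": "pdf", "registernummer": registernummer})
    return f"{BASE_URL}?{query}"


def looks_like_pdf(body: bytes) -> bool:
    """PDF files start with the magic bytes '%PDF-'; an HTML error or captcha page does not."""
    return body.startswith(b"%PDF-")
```

For example, reading test.pdf in binary mode and passing the bytes to looks_like_pdf shows whether the GET attempt returned a document or a web page.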
Use Playwright or Scrapling to intercept the actual form submission and find what tokens/headers are sent. Replay those.
Check if Scrapling (https://github.com/D4Vinci/Scrapling) can handle this. Install and test it.
pip install scrapling
Friendly Captcha uses a hashcash-style PoW puzzle. Research if there's an open-source solver that can complete the puzzle programmatically (no human needed).
- scrape.py: main script, takes registernummer as CLI arg, prints JSON with broker details
- README.md: how it works, what was found
- Commit everything
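The command-line shape of scrape.py could look roughly like the sketch below. It covers only argument handling and JSON output. The registernummer pattern is an assumption inferred from the single example D-21RP-R1O5O-37, and lookup is a hypothetical placeholder for whatever retrieval method the investigation above settles on.

```python
import argparse
import json
import re

# Assumed format, inferred only from the example D-21RP-R1O5O-37.
REGISTERNUMMER_RE = re.compile(r"^D-[A-Z0-9]{4}-[A-Z0-9]{5}-[0-9]{2}$")


def normalize(raw: str) -> str:
    """Trim and upper-case the input, then check it against the assumed pattern."""
    value = raw.strip().upper()
    if not REGISTERNUMMER_RE.match(value):
        raise ValueError(f"unexpected registernummer format: {raw!r}")
    return value


def lookup(registernummer: str) -> dict:
    """Hypothetical placeholder: the real retrieval logic is not defined yet."""
    return {"registernummer": registernummer, "status": "not_implemented"}


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Look up a broker by IHK registernummer")
    parser.add_argument("registernummer", help="e.g. D-21RP-R1O5O-37")
    args = parser.parse_args(argv)
    try:
        number = normalize(args.registernummer)
    except ValueError as exc:
        # Report bad input as JSON too, so callers can always parse stdout.
        print(json.dumps({"error": str(exc)}))
        return 2
    print(json.dumps(lookup(number), ensure_ascii=False, indent=2))
    return 0

# In scrape.py, end the file with:
#   if __name__ == "__main__":
#       raise SystemExit(main())
```

Returning an exit code from main instead of calling sys.exit inside it keeps the function easy to call from tests.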
When done, run: openclaw system event --text "Done: vermittler-scraper ready. Check ~/Code/vermittler-scraper/" --mode now