Initial release - CTI Summit 2022
Most of this project was already running in production as part of Lookyloo, and we decided to extract it into a standalone web service so that other systems can use it more easily.
In short, you submit a URL or a web-enabled document (optionally with some parameters), Lacus captures the URL (or renders the document), and you get the capture back to use in your own system. If all goes well, the response contains, among other things, a HAR file, a screenshot of the page, the rendered HTML content, and the complete cookie jar of the browser.
The capture is done using Playwright via PlaywrightCapture.
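
The submit, capture, and fetch cycle described above can be sketched as a small polling helper. This is a minimal illustration, not the Lacus API: `capture_and_wait`, the three callables it takes, and the `FakeBackend` class are hypothetical names introduced here, and the in-memory backend only stands in for a real capture service.

```python
import time
from typing import Any, Callable, Dict


def capture_and_wait(
    url: str,
    submit: Callable[[str], str],
    is_done: Callable[[str], bool],
    fetch: Callable[[str], Dict[str, Any]],
    poll_interval: float = 1.0,
    timeout: float = 60.0,
) -> Dict[str, Any]:
    """Submit a URL, poll until the capture is finished, then return the result.

    The three callables are assumptions: they represent whatever client
    is used to talk to the capture service.
    """
    uuid = submit(url)
    deadline = time.monotonic() + timeout
    while not is_done(uuid):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"capture {uuid} not finished after {timeout}s")
        time.sleep(poll_interval)
    return fetch(uuid)


class FakeBackend:
    """Hypothetical in-memory stand-in for a capture service (demo only)."""

    def __init__(self, polls_until_done: int = 2) -> None:
        self._polls_until_done = polls_until_done
        self._remaining: Dict[str, int] = {}
        self._urls: Dict[str, str] = {}

    def submit(self, url: str) -> str:
        # Hand out a sequential identifier and remember the submitted URL.
        uuid = f"capture-{len(self._urls) + 1}"
        self._urls[uuid] = url
        self._remaining[uuid] = self._polls_until_done
        return uuid

    def is_done(self, uuid: str) -> bool:
        # Report "not done" for the configured number of polls, then "done".
        if self._remaining[uuid] > 0:
            self._remaining[uuid] -= 1
            return False
        return True

    def fetch(self, uuid: str) -> Dict[str, Any]:
        # A real response would carry the HAR, screenshot, HTML, and cookies.
        return {"uuid": uuid, "url": self._urls[uuid]}
```

Passing the client operations in as callables keeps the polling logic independent of any particular client library, so the same loop can be exercised against a fake backend or a real one.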
Features
- Uses up-to-date browsers (Chromium, Firefox, and WebKit) as provided by Playwright
- Webpage instrumentation provided by PlaywrightCapture, including scrolling the page once it is completely loaded, attempting to bypass ReCaptcha, granting browser permissions...
- If the URL points to a downloadable file, the file is downloaded and included in the response.
- Supports SOCKS5 proxies passed along with the capture, and .onion URLs are automatically captured via Tor (as long as the Tor proxy is reachable). It is also possible to force a non-onion capture to go through Tor.
- Configurable capture depth based on the URLs found on the rendered page (same hostname or not). Use this feature with caution, as it might trigger a lot of subsequent captures and may take a very long time.
- De-duplication of the captures with LacusCore: if the same capture settings are submitted multiple times within a specific (configurable) interval, the user receives the UUID of the previous submission. The de-duplication can also be bypassed.
- On submission, a capture can be given a priority so that it is triggered before the others.
- PyLacus: Python module to submit captures, check their status and fetch the response
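
The de-duplication and priority behaviour listed above can be illustrated with a small in-memory queue. This is a sketch under stated assumptions and not how LacusCore is implemented: the `CaptureQueue` class, its `dedup_interval`, `priority`, and `force` parameters, and the choice of hashing the JSON-serialized settings are all hypothetical and chosen only to show the idea.

```python
import hashlib
import heapq
import itertools
import json
import time
import uuid as uuid_lib
from typing import Any, Callable, Dict, List, Optional, Tuple


class CaptureQueue:
    """Hypothetical queue with settings-based de-duplication and priorities."""

    def __init__(
        self,
        dedup_interval: float = 600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._dedup_interval = dedup_interval
        self._clock = clock
        # settings hash -> (uuid, submission time)
        self._recent: Dict[str, Tuple[str, float]] = {}
        # heap entries: (-priority, sequence number, uuid)
        self._heap: List[Tuple[int, int, str]] = []
        self._sequence = itertools.count()

    @staticmethod
    def _settings_key(settings: Dict[str, Any]) -> str:
        # Sort keys so that identical settings always hash to the same value.
        canonical = json.dumps(settings, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def enqueue(
        self, settings: Dict[str, Any], priority: int = 0, force: bool = False
    ) -> str:
        key = self._settings_key(settings)
        now = self._clock()
        previous = self._recent.get(key)
        # Same settings within the interval: return the earlier UUID,
        # unless the caller asked to skip de-duplication.
        if not force and previous is not None:
            if now - previous[1] < self._dedup_interval:
                return previous[0]
        new_uuid = str(uuid_lib.uuid4())
        self._recent[key] = (new_uuid, now)
        # Negate the priority so the highest priority pops first; the
        # sequence number keeps submission order among equal priorities.
        heapq.heappush(self._heap, (-priority, next(self._sequence), new_uuid))
        return new_uuid

    def next_capture(self) -> Optional[str]:
        """Return the UUID of the next capture to run, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

In this sketch, submitting the same settings twice within `dedup_interval` yields the same UUID, `force=True` always creates a new entry, and `next_capture` returns higher-priority submissions first.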