Releases: ail-project/lacus
Pass The Salt 2023
New features
- Support for HTTP credentials, viewport, geolocation, timezone, locale and color scheme in the capture settings
- Global proxy settings
- Daily statistics (visible in the monitoring)
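As an illustration of the new capture settings, here is a minimal sketch of what a submission could carry. The key names mirror the keyword arguments of Playwright's browser.new_context(); whether Lacus expects exactly these names is an assumption, and all values are placeholders.

```python
# Hypothetical capture settings. The key names below mirror the keyword
# arguments of Playwright's browser.new_context(); the names Lacus itself
# expects are an assumption here.
capture_settings = {
    "url": "https://example.com",
    "http_credentials": {"username": "user", "password": "secret"},
    "viewport": {"width": 1280, "height": 720},
    "geolocation": {"latitude": 49.6117, "longitude": 6.1319},
    "timezone_id": "Europe/Luxembourg",
    "locale": "en-GB",
    "color_scheme": "dark",  # Playwright accepts "light", "dark" or "no-preference"
}

# Everything except the URL maps onto a browser context option.
context_options = {k: v for k, v in capture_settings.items() if k != "url"}
```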
Changes
- Store the capture results in a compressed pickle instead of JSON, which significantly reduces RAM usage
- Improve documentation; add an option in the update script to only install the system without starting it
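The move from JSON to a compressed pickle can be pictured with a minimal sketch. The release notes do not say which compression is used, so zlib is an assumption here, and dump_capture / load_capture are hypothetical helper names rather than Lacus functions.

```python
import json
import pickle
import zlib
from base64 import b64encode


def dump_capture(result: dict) -> bytes:
    """Serialize a capture result as a zlib-compressed pickle (assumed format)."""
    return zlib.compress(pickle.dumps(result))


def load_capture(blob: bytes) -> dict:
    """Inverse of dump_capture. Only unpickle data you produced yourself."""
    return pickle.loads(zlib.decompress(blob))


# Hypothetical capture result: binary fields such as a screenshot stay as bytes.
result = {"html": "<html>" + "x" * 10_000 + "</html>", "png": bytes(5_000)}

blob = dump_capture(result)
# JSON cannot hold bytes directly, so the screenshot needs base64 encoding first.
as_json = json.dumps({"html": result["html"], "png": b64encode(result["png"]).decode()})

print(len(blob), len(as_json))
```

A pickle keeps binary fields as raw bytes, which avoids the base64 overhead JSON would need, and compression shrinks repetitive content such as HTML further.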
Bugfixes
- Very annoying crash when the headers were improperly set, causing the captures to fail
- Many bugfixes and improvements in PlaywrightCapture and LacusCore
Spring cleanup
This release is the outcome of a whole lot of changes in the last few months across PlaywrightCapture and LacusCore.
New features
- [Lacus] Use the new async mechanism from LacusCore, which improves long-running processes
- [Lacus] Allow stopping the capture process cleanly: ongoing captures are not killed until they are done or a timeout is reached
Changes
- [PlaywrightCapture] Major improvement in exception handling and reporting
- [PlaywrightCapture] Major improvements in the logging mechanism, mostly to avoid logging irrelevant things or flagging unimportant things as warnings
- [PlaywrightCapture] Properly bubble up to LacusCore a suggestion to retry the capture
- [PlaywrightCapture] Improve handling of timeout for captures taking too long, log that fact
- [PlaywrightCapture] Use more recent version of Playwright
- [PlaywrightCapture] Sanitize HTTP headers
- [PlaywrightCapture] Much better handling of exceptions raised by the browsers, avoid waiting until we reach the timeout if the capture already failed
- [LacusCore] Major improvements when using async
- [LacusCore] Major improvement in exception handling and reporting
- [LacusCore] Major improvements in the logging mechanism, mostly to avoid logging irrelevant things or flagging unimportant things as warnings
- [LacusCore] Properly handle hints from PlaywrightCapture regarding retries
- [LacusCore] Improve handling of timeouts, and raise more exceptions if something gets stuck
- [Lacus] Improve handling of dead processes, cleanup
- [Lacus] Improve tracking of ongoing captures
- [Lacus] Improve logging
Bugfixes
- [PlaywrightCapture] Do not keep HTTP headers (especially broken ones) in the session for subsequent captures
- [LacusCore] Properly handle hints from PlaywrightCapture regarding retries
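The notes do not spell out what header sanitization involves in PlaywrightCapture. As a minimal sketch under that caveat, one plausible approach is to drop entries that cannot form a valid header and to build a fresh dictionary for every capture; sanitize_headers is a hypothetical name.

```python
def sanitize_headers(headers: dict) -> dict:
    """Return a cleaned copy of user-supplied HTTP headers (illustrative rules)."""
    clean = {}
    for name, value in headers.items():
        # Header names and values must be strings.
        if not isinstance(name, str) or not isinstance(value, str):
            continue
        name, value = name.strip(), value.strip()
        # Empty names or values carry no information.
        if not name or not value:
            continue
        # Embedded line breaks would allow header injection.
        if any(c in name + value for c in "\r\n"):
            continue
        clean[name] = value
    return clean


print(sanitize_headers({"User-Agent": " test ", "": "x", "X-Bad": "a\r\nb"}))
```

Because the function returns a new dictionary, a broken header from one capture cannot leak into the next one.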
Late February release
There were very few changes in Lacus itself. Most of the changes happened in LacusCore and PlaywrightCapture.
HoHoHoliday season release
Breaking change
Poetry v1.3.0 or more recent is now required, please upgrade to the latest version.
New Features
- Always enable reCAPTCHA bypass (doesn't always work, but we do what we can)
- Script to stop capture_manager cleanly (waits for the ongoing captures to finish before stopping)
Bugfixes
- Better exception handling in Lacus, LacusCore and PlaywrightCapture
- [LacusCore] Block captures of files on disk unless authorized
- [PlaywrightCapture] Improve timeout handling
- [PlaywrightCapture] Improve reCAPTCHA solving
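Blocking captures of files on disk can be sketched as a check on the URL scheme. This is an illustration only: is_capture_allowed and the allow_local_files flag are hypothetical names, and the actual LacusCore check may differ.

```python
from urllib.parse import urlsplit


def is_capture_allowed(url: str, allow_local_files: bool = False) -> bool:
    """Reject file:// URLs unless local files were explicitly authorized."""
    scheme = urlsplit(url).scheme.lower()
    if scheme == "file":
        return allow_local_files
    return True


print(is_capture_allowed("file:///etc/passwd"))
print(is_capture_allowed("https://example.com"))
```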
Changes
- Update Flask, flask-restx, and all other dependencies.
- [PlaywrightCapture] Update Playwright, adapt capture script accordingly
- [LacusCore] Optionally submit a capture with a pre-defined UUID
- Major logging improvements in Lacus, LacusCore and PlaywrightCapture
Late October cleanup
New features
- Allow to cleanly kill the capture process independently from the other: pkill -15 capture_manager
- Add runtime for the captures (logs and response)
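Signal 15 is SIGTERM, so pkill -15 asks the process to stop rather than killing it outright. A minimal sketch of how a process can honour that request is shown here; the CaptureManager class and its placeholder captures are hypothetical and do not reflect the real capture_manager code.

```python
import signal
import time


class CaptureManager:
    """Hypothetical stand-in for a long-running capture process."""

    def __init__(self) -> None:
        self.stopping = False
        self.ongoing = ["capture-1", "capture-2"]  # placeholder capture names
        self.finished = []

    def install_handler(self) -> None:
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGTERM, self._request_stop)

    def _request_stop(self, signum, frame) -> None:
        # Only record the request; the main loop decides when to exit.
        self.stopping = True

    def run(self) -> None:
        # Keep running until a stop was requested AND nothing is in flight.
        while self.ongoing or not self.stopping:
            if self.ongoing:
                self.finished.append(self.ongoing.pop(0))
            time.sleep(0.01)
```

A caller would create the manager, call install_handler() and then run(). Once SIGTERM arrives, the loop drains the ongoing captures and returns instead of dropping them.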
Initial release - CTI Summit 2022
Most of this project was used in production in Lookyloo and we decided to export it as a standalone webservice so it can be used by other systems more easily.
In short, you submit a URL or a web-enabled document (optionally with some parameters), Lacus captures the URL (or renders it if it is a document), and you get the capture back to use in your own system. If all went well, the response contains, among other things, a HAR file, a screenshot of the page, the rendered HTML content, and the complete cookie jar of the browser.
The capture is done using Playwright via PlaywrightCapture.
Features
- Uses up-to-date browsers (chromium, firefox, and webkit) as provided by Playwright
- Webpage instrumentation provided by PlaywrightCapture, including scrolling on the page after it is completely loaded, trying to bypass reCAPTCHA, granting browser permissions...
- If the URL points to a downloadable file, it will be downloaded and included in the response.
- Supports SOCKS5 proxies passed along the capture, and .onion URLs are automatically captured via Tor (as long as the Tor proxy is reachable). It is also possible to force a non-onion capture to be done via Tor.
- Configurable capture depth based on URLs that can be found on the rendered page (same hostname or not). This feature should be used with caution as it might trigger a lot of subsequent captures and may take a very long time.
- De-duplication of the captures with LacusCore: if the same capture settings are submitted multiple times within a specific interval (configurable), the user receives the UUID of the previous submission. The de-duplication can also be bypassed.
- On submission, it is possible to give a priority to a specific capture to be triggered before the others.
- PyLacus: Python module to submit captures, check their status and fetch the response
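The de-duplication described above can be sketched with a small in-memory queue. This is not the LacusCore implementation: DedupQueue, the force flag and the now parameter are hypothetical names chosen for illustration.

```python
import hashlib
import json
import time
import uuid


class DedupQueue:
    """Hypothetical in-memory queue; not the LacusCore implementation."""

    def __init__(self, interval: float = 600.0) -> None:
        self.interval = interval  # de-duplication window, in seconds
        self._seen = {}  # settings digest -> (uuid, submission time)

    def enqueue(self, settings: dict, force: bool = False, now=None) -> str:
        now = time.time() if now is None else now
        # Sorting the keys makes the digest independent of key order.
        payload = json.dumps(settings, sort_keys=True).encode()
        digest = hashlib.sha256(payload).hexdigest()
        previous = self._seen.get(digest)
        if not force and previous and now - previous[1] < self.interval:
            # Same settings within the window: hand back the earlier UUID.
            return previous[0]
        new_uuid = str(uuid.uuid4())
        self._seen[digest] = (new_uuid, now)
        return new_uuid
```

Submitting identical settings twice within the interval returns the same UUID, while force=True or a submission after the interval yields a new one.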