Releases · internetarchive/brozzler

v1.9.0

This release adds several new features along with performance and reliability enhancements. It also updates several dependencies, including yt-dlp to 2025.12.08.

New behavior controls

Several new features make behaviors more flexible, so brozzler can navigate and gather outlinks from more complex sites than in previous versions.

A new repeatUntilSelector control tells a behavior when to stop looping, providing a more robust way to end a loop early. (#431 - @mistydemeo)
Custom behavior JavaScript can now call the pre-defined outlink gathering function. (#429 - @mistydemeo)
Outlinks can now be gathered while behaviors are running, which makes it possible to collect outlinks from sites whose content changes at runtime, such as sites using JavaScript pagination. (#433/#434 - @mistydemeo)
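
The way these controls interact can be sketched in plain Python. This is a simulation under stated assumptions, not brozzler's implementation: real behaviors run as JavaScript inside the browser, and `FakePage`, `run_behavior`, `max_iterations`, and the `#end-of-results` selector are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakePage:
    """Hypothetical paginated page: each click REPLACES the visible links."""
    batches: list[list[str]]
    current: int = 0

    def links(self) -> set[str]:
        # Only the links on the currently displayed "page" are visible.
        return set(self.batches[self.current])

    def has_selector(self, selector: str) -> bool:
        # Pretend an end marker appears once the last batch is displayed.
        return selector == "#end-of-results" and self.current >= len(self.batches) - 1

    def click_next(self) -> None:
        self.current = min(self.current + 1, len(self.batches) - 1)


def run_behavior(page: FakePage, repeat_until_selector: str,
                 max_iterations: int = 50) -> set[str]:
    """Loop until the selector matches, gathering outlinks on every pass."""
    outlinks: set[str] = set()
    for _ in range(max_iterations):
        outlinks |= page.links()  # gather while the behavior is running
        if page.has_selector(repeat_until_selector):
            break  # stop early instead of exhausting the iteration limit
        page.click_next()
    return outlinks
```

In this model, gathering links only after the loop finishes would return just the last batch, which is why gathering during the loop matters for JavaScript pagination.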

Performance enhancements

Two minor performance enhancements limit the default size of page screenshots (#420 - @vbanos) and reduce crawl startup time by caching the Chrome version (#424 - @vbanos).
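
As a rough illustration of the caching idea, the sketch below memoizes the output of a `--version` call so the subprocess is launched only once per executable path. How brozzler actually caches the Chrome version is not stated in these notes; `executable_version` is a hypothetical helper, and the Python interpreter stands in for a Chrome binary so the example runs anywhere.

```python
import functools
import subprocess
import sys


@functools.lru_cache(maxsize=None)
def executable_version(executable: str) -> str:
    """Run `<executable> --version` once per path and cache the output."""
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Stand-in binary (assumption): the running Python interpreter.
    print(executable_version(sys.executable))
    print(executable_version(sys.executable))  # served from the cache
    print(executable_version.cache_info())
```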

Reliability enhancements

Brozzler now detects Chrome error pages and will retry the affected page. (#438 - @adam-miller)
try-login will now perform form validation before submitting and trigger input events. (#432/#440 - @netoarmando)
Improved error handling in retry loops. (#441 - @adam-miller)
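
A minimal sketch of the retry idea follows. It assumes an error page can be recognized by Chrome's internal `chrome-error://` URL scheme; whether brozzler detects error pages this way is an assumption, and `browse_with_retry`, `max_attempts`, and `delay` are hypothetical names.

```python
import time
from typing import Callable

# Chrome serves its built-in error pages from this internal scheme.
CHROME_ERROR_PREFIX = "chrome-error://"


def is_chrome_error_page(final_url: str) -> bool:
    return final_url.startswith(CHROME_ERROR_PREFIX)


def browse_with_retry(browse: Callable[[str], str], url: str,
                      max_attempts: int = 3, delay: float = 0.0) -> str:
    """Call `browse(url)` until it lands somewhere other than an error page.

    `browse` is a hypothetical callable returning the final document URL.
    """
    for attempt in range(1, max_attempts + 1):
        final_url = browse(url)
        if not is_chrome_error_page(final_url):
            return final_url
        if attempt < max_attempts:
            time.sleep(delay)  # brief pause before the next attempt
    raise RuntimeError(
        f"{url} still showed a Chrome error page after {max_attempts} attempts"
    )
```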

Contributors

netoarmando, adam-miller, and 2 other contributors

v1.8.1

Released 14 Nov 00:50 by mistydemeo (tag v1.8.1, commit c7f1794)

This bugfix release fixes an issue that caused the max_sites_to_claim filter to be applied too early, so claimable sites were sometimes skipped. (#422)
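
The effect of applying a limit too early can be shown with plain lists. This is only an illustration of the ordering problem, not brozzler's actual database query; the `claimable` field and both function names are hypothetical.

```python
def claim_limit_first(sites: list[dict], max_sites_to_claim: int) -> list[dict]:
    """Problematic order: truncate first, then keep only claimable sites."""
    return [s for s in sites[:max_sites_to_claim] if s["claimable"]]


def claim_filter_first(sites: list[dict], max_sites_to_claim: int) -> list[dict]:
    """Intended order: keep claimable sites first, then apply the limit."""
    claimable = [s for s in sites if s["claimable"]]
    return claimable[:max_sites_to_claim]


sites = [
    {"id": 1, "claimable": False},
    {"id": 2, "claimable": False},
    {"id": 3, "claimable": True},
    {"id": 4, "claimable": True},
]

# Limiting first only looks at sites 1 and 2, so nothing is claimed.
early = claim_limit_first(sites, 2)
# Filtering first finds sites 3 and 4.
late = claim_filter_first(sites, 2)
```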

This release also drops support for Python 3.9; the new minimum version is Python 3.10.

v1.8.0

Released 07 Oct 20:43 by mistydemeo (tag v1.8.0, commit 4448458)

This release contains a number of new capture features. It also removes support for Python 3.8.

The new --disable-video-capture commandline flag allows excluding video captures in the brozzler-new-site commandline tool. (#379)
When video capture has been disabled, brozzler will no longer browse YouTube UMP packets. (#378)
Audio content types are now also skipped when video capture is disabled. (#380)
It's now possible to control whether Chrome is launched in headless mode using the --no-headless commandline flag and the headless keyword parameter in the Chrome.start method. The default hasn't changed. (#373)
Improved performance when visiting page anchors. (#394)
Improved compatibility when fetching robots.txt and page headers by enabling legacy renegotiation and disabling errors from unexpected SSL EOFs. (#411)
Fixed a bug that could break captures in certain circumstances when a page interstitial is encountered. (#375)
The header request timeout has been increased from 30 to 60 seconds. (#367)
Updated the versions of several dependencies.
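
For the robots.txt and header compatibility change in the list above, one way to get similar behavior with Python's standard `ssl` module is sketched below. Whether brozzler configures its HTTP client this way is an assumption, and `lenient_ssl_context` is a hypothetical helper name.

```python
import ssl


def lenient_ssl_context() -> ssl.SSLContext:
    """Build a context that tolerates some legacy server behavior."""
    ctx = ssl.create_default_context()
    # Allow connecting to servers that lack secure renegotiation support.
    # 0x4 is OpenSSL's SSL_OP_LEGACY_SERVER_CONNECT; the named constant
    # only exists in newer Python versions, hence the fallback.
    ctx.options |= getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)
    # Ignore servers that close the connection without a TLS shutdown.
    # The constant is missing on builds against older OpenSSL, so fall
    # back to 0 (a no-op) there.
    ctx.options |= getattr(ssl, "OP_IGNORE_UNEXPECTED_EOF", 0)
    return ctx
```

A context like this could be passed to a client, for example `urllib.request.urlopen(url, context=lenient_ssl_context(), timeout=60)`, where the 60 second timeout mirrors the increased header request timeout.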

Thanks to @TheTechRobo for the features in pull requests #373 and #394!

Contributors

TheTechRobo

v1.7.0

Released 16 Jun 21:46 by mistydemeo (tag v1.7.0, commit 0227da6)

Specifies additional extras when installing yt-dlp. The default group ensures that yt-dlp keeps its full feature set in future releases, and curl-cffi improves compatibility. (#366)
Adds new controls for capture: an option to capture only PDFs, and an option to exclude video from capture. (#288)

v1.6.13

Released 11 Jun 23:22 by mistydemeo (tag v1.6.13, commit aadd9cd)

Updates Chrome config to disable the HttpsFirstBalancedModeAutoEnable feature in order to disable auto HTTP-to-HTTPS upgrades. (#355)
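
As an illustration, Chrome features are disabled through a single comma-separated `--disable-features` command-line switch. The helper below is a hypothetical sketch of building such an argument list; it is not brozzler's launch code, and the `chromium` executable name is an assumption.

```python
def chrome_args(executable: str, disabled_features: list[str]) -> list[str]:
    """Build a Chrome command line with the given features disabled."""
    args = [executable]
    if disabled_features:
        # All features go into one comma-separated switch value.
        args.append("--disable-features=" + ",".join(disabled_features))
    return args


# Hypothetical executable name; the feature name comes from the release note.
cmd = chrome_args("chromium", ["HttpsFirstBalancedModeAutoEnable"])
```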

v1.6.12

Released 11 Jun 23:19 by mistydemeo (tag v1.6.12, commit 370638a)

Creates a new claim_sites() query. (#353)

v1.6.11

Released 11 Jun 23:18 by mistydemeo (tag v1.6.11, commit 8b1d80f)

Updates yt-dlp config (#353)
