v0.7.0
New features
- jwarc now includes a simple filter language for selecting matching WARC records.
    jwarc filter 'warc-type != "request"'
    jwarc filter ':status == 200 && http:content-type =~ "image/.*"'

    long errors = reader.records().filter(WarcFilter.compile(":status >= 400")).count();
- Native binary builds of the jwarc CLI tool are now available for Linux and macOS. They are built with GraalVM and do not require Java to be installed. (The cross-platform .jar is still the recommended version.)
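
To make the three filter expressions in the first item more concrete, here is a rough plain-Java sketch of predicates with the same intent. It does not use jwarc at all: the `Rec` class, the `SAMPLE` data, and the `count` helper are hypothetical stand-ins, and it assumes that `=~` must match the whole field value, which these notes do not state.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

class FilterSketch {
    // Hypothetical stand-in for a WARC record, reduced to the fields the examples read.
    static final class Rec {
        final String warcType;
        final int status;          // 0 is used here as a placeholder for "no HTTP status"
        final String contentType;  // null when the record has no Content-Type

        Rec(String warcType, int status, String contentType) {
            this.warcType = warcType;
            this.status = status;
            this.contentType = contentType;
        }
    }

    // Intent of: warc-type != "request"
    static final Predicate<Rec> NOT_REQUEST = r -> !"request".equals(r.warcType);

    // Intent of: :status == 200 && http:content-type =~ "image/.*"
    // Assumption: the regex has to match the entire value, hence matches() rather than find().
    static final Pattern IMAGE = Pattern.compile("image/.*");
    static final Predicate<Rec> OK_IMAGE = r ->
            r.status == 200 && r.contentType != null && IMAGE.matcher(r.contentType).matches();

    // Intent of: :status >= 400
    static final Predicate<Rec> ERROR = r -> r.status >= 400;

    // Made-up sample data for illustration only.
    static final List<Rec> SAMPLE = List.of(
            new Rec("request", 0, null),
            new Rec("response", 200, "image/png"),
            new Rec("response", 200, "text/html"),
            new Rec("response", 404, "text/html"),
            new Rec("response", 500, "image/gif"));

    // Mirrors the shape of records().filter(...).count() from the Java example above.
    static long count(List<Rec> records, Predicate<Rec> filter) {
        return records.stream().filter(filter).count();
    }

    public static void main(String[] args) {
        System.out.println(count(SAMPLE, NOT_REQUEST)); // prints 4
        System.out.println(count(SAMPLE, OK_IMAGE));    // prints 1
        System.out.println(count(SAMPLE, ERROR));       // prints 2
    }
}
```

With jwarc itself, the predicate would come from `WarcFilter.compile(...)` as in the one-line example in the first item, so none of the hand-written lambdas are needed.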
Changed
- Calling record.http() no longer invalidates record.body(), although care must still be taken.
- Removed the HttpParser.Handler interface.