Releases: iipc/jwarc
v0.28.0: Release 0.28.0
New features:
- Added fetch options to WarcWriter.fetch and fetch tool: maxTime, maxLength, readTimeout, userAgent
- Added fetch tool option --output-file
Bugs fixed:
- Fixed missing response.http().body().size() value when response is truncated and WarcReader.calculateBlockDigest() is enabled
v0.27.1: Release 0.27.1
Bugs fixed:
- Lenient HTTP parser now accepts folded header lines that use LF instead of CRLF
- Fixed bug where bogus ARC MIME field could be prepended to the length field
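The folded-header fix can be illustrated with a small standalone sketch. This is not jwarc's parser; the class and method names are hypothetical, and it only shows the general idea of accepting either CRLF or a bare LF as the line terminator and joining continuation lines (those starting with a space or tab) onto the preceding header.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the jwarc API.
class HeaderUnfoldSketch {
    // Splits a raw header block into logical header lines. Lines may end in
    // CRLF or bare LF. A line starting with a space or tab is treated as a
    // continuation of the previous header and joined with a single space.
    static List<String> unfold(String rawHeaders) {
        List<String> lines = new ArrayList<>();
        for (String line : rawHeaders.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            boolean continuation = line.charAt(0) == ' ' || line.charAt(0) == '\t';
            if (continuation && !lines.isEmpty()) {
                int last = lines.size() - 1;
                lines.set(last, lines.get(last) + " " + line.trim());
            } else {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Folded Content-Type header using LF-only line endings.
        String raw = "Content-Type: text/html;\n charset=utf-8\nHost: example.org\n\n";
        System.out.println(unfold(raw));
    }
}
```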
v0.27.0: Release 0.27.0
New features:
- Added a HttpRequest.Builder(method, uri) constructor that populates the Host header
Bugs fixed:
- WarcWriter.fetch(uri) was omitting the query string
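As a rough illustration of what the two items above involve, the following sketch derives a request target (path plus query string) and a Host header value from a java.net.URI. It uses only the JDK and does not reflect how jwarc implements this internally; the class and method names are hypothetical.

```java
import java.net.URI;

// Hypothetical illustration only; not part of the jwarc API.
class RequestTargetSketch {
    // Origin-form request target: the raw path followed by "?query" when a
    // query string is present. Dropping the query here would reproduce the
    // kind of bug described above.
    static String requestTarget(URI uri) {
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        String query = uri.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    // Host header value: the host, plus the port when one is given explicitly.
    static String hostHeader(URI uri) {
        return uri.getPort() == -1 ? uri.getHost() : uri.getHost() + ":" + uri.getPort();
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://example.org:8080/search?q=warc&page=2");
        System.out.println("GET " + requestTarget(uri) + " HTTP/1.1");
        System.out.println("Host: " + hostHeader(uri));
    }
}
```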
Changes:
- ARC parser now accepts garbage in the MIME field
- HTTP parser in lenient mode now accepts messages without a minor version number (e.g. "HTTP/2") #70
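A minimal sketch of lenient version parsing, assuming a missing minor version is treated as 0. That default, along with the class and method names, is an assumption made for illustration and is not taken from jwarc's source.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the jwarc API.
class HttpVersionSketch {
    // The minor version group is optional, so both "HTTP/1.1" and "HTTP/2" match.
    private static final Pattern VERSION = Pattern.compile("HTTP/(\\d+)(?:\\.(\\d+))?");

    // Returns {major, minor}. Assumption: a missing minor version defaults to 0.
    static int[] parse(String token) {
        Matcher m = VERSION.matcher(token);
        if (!m.matches()) {
            throw new IllegalArgumentException("invalid HTTP version: " + token);
        }
        int major = Integer.parseInt(m.group(1));
        int minor = m.group(2) == null ? 0 : Integer.parseInt(m.group(2));
        return new int[] {major, minor};
    }

    public static void main(String[] args) {
        int[] v = parse("HTTP/2");
        System.out.println(v[0] + "." + v[1]);
    }
}
```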
v0.26.0: Release 0.26.0
New features:
- CDX tool gained a --warc-full-path option to emit absolute paths in the filename field #76 (Thomas Egense)
- Added a CdxWriter class that provides a programmatic interface to the CDX indexing tool
v0.25.0: Release 0.25.0
New features:
- CDX tool: New option -r or --revisits-included to include revisit records #75 (Thomas Egense)
v0.24.1: Release 0.24.1
Changes:
- Removed optional dependency on jackson-core. This was only used when processing JSON request bodies with the --post-append option of the CDX tool. jwarc now includes a small JSON tokenizer instead.
v0.24.0: Release 0.24.0
New features:
- CDX tool gained a --digest-unchanged option to output the raw value of the WARC-Payload-Digest field #74 (Thomas Egense)
- MediaType and WarcDigest gained a .raw() method for accessing the original unparsed string
v0.23.1: Release 0.23.1
Bugs fixed:
- CdxRequestEncoder: Match pywb's 4096 character truncation correctly