build(deps): bump chardet from 7.4.0.post1 to 7.4.3#430

Open
dependabot[bot] wants to merge 1 commit into main from dependabot/pip/chardet-7.4.3

Conversation

dependabot[bot] commented on behalf of github on Apr 20, 2026

Bumps chardet from 7.4.0.post1 to 7.4.3.

Release notes

Sourced from chardet's releases.

7.4.3

Patch release: fixes a crash when input contains null bytes inside a <meta charset> declaration.

Bug Fixes

  • Fixed ValueError: embedded null character crash when input contained a <meta charset> declaration with a null byte in the encoding name (e.g. b'<meta charset="\x00utf-8">'). codecs.lookup() raises ValueError on embedded nulls, and lookup_encoding() was only catching LookupError. Also added defensive ValueError catches in _validate_bytes() and _to_utf8() for completeness. (#369, thanks @DRMacIver for the report)
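
The failure mode described above can be reproduced with the standard library alone. The sketch below is not chardet's code: `safe_lookup` is a hypothetical helper, used only to illustrate catching both exception types around `codecs.lookup()`.

```python
import codecs
from typing import Optional


def safe_lookup(label: str) -> Optional[str]:
    """Return the normalized codec name for label, or None if it cannot be resolved.

    Hypothetical helper for illustration; not part of chardet's API.
    """
    try:
        return codecs.lookup(label).name
    except (LookupError, ValueError):
        # LookupError: the encoding name is unknown.
        # ValueError: the name contains an embedded null character.
        return None


print(safe_lookup("utf-8"))        # a known codec resolves normally
print(safe_lookup("no-such-enc"))  # unknown name is reported as None
print(safe_lookup("\x00utf-8"))    # embedded null is reported as None instead of crashing
```

Catching only `LookupError` here would let the third call propagate an exception, which matches the crash the release note describes.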

Full Changelog: chardet/chardet@7.4.2...7.4.3

7.4.2

Patch release: fixes a crash on short inputs and closes a bunch of WHATWG/IANA alias gaps.

Bug Fixes

  • Fixed RuntimeError: pipeline must always return at least one result on ~2% of all possible two-byte inputs (e.g. b"\xf9\x92"). Multi-byte encodings like CP932 and Johab could score above the structural confidence threshold on very short inputs, but then statistical scoring would return nothing, leaving an empty result list instead of falling through to the fallback. (#367, #368, thanks @jasonwbarnett)
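
The invariant behind this fix (a detection pipeline never returns an empty list) can be sketched as follows. Everything here is a hypothetical stand-in rather than chardet's implementation: the `detect` signature, the stage functions, and the `windows-1252` fallback value are assumptions made for the example.

```python
from typing import Callable, List, Tuple

Result = Tuple[str, float]  # (encoding name, confidence)
Stage = Callable[[bytes], List[Result]]


def detect(data: bytes, stages: List[Stage], fallback: str = "windows-1252") -> List[Result]:
    """Run stages in order and never return an empty list."""
    for stage in stages:
        results = stage(data)
        if results:  # an empty list means "no opinion", so keep going
            return results
    # No stage produced a candidate: fall through to the fallback encoding.
    return [(fallback, 0.0)]


def statistical_stage(data: bytes) -> List[Result]:
    # Stand-in for a scorer that finds nothing on very short input.
    return []


print(detect(b"\xf9\x92", [statistical_stage]))  # [('windows-1252', 0.0)]
```

Treating an empty stage result as "continue" rather than as a final answer is what keeps short inputs from reaching the caller with no result at all.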

Improvements

  • Added ~90 encoding aliases from the WHATWG Encoding Standard and IANA Character Sets registry so that <meta charset> labels like x-cp1252, x-sjis, dos-874, csUTF8, and the cswindows* family all resolve correctly through the markup detection stage. Every alias was added in response to a failing spec-compliance test, not speculatively. (#366)
  • Added a spec-compliance test suite covering Python decode round-trips for all 86 registry encodings, WHATWG label resolution, IANA preferred MIME names, and Unicode/RFC conformance (BOM sniffing, UTF-8 boundary cases, UTF-16 surrogate pairs). This is the test suite that would have caught the BOM bug fixed in 7.4.1 before release. (#366)
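
As a rough illustration of what alias resolution involves, the sketch below maps a few of the labels named above to Python codec names. The table is an illustrative subset assembled for this example, and `resolve_label` is a hypothetical helper; chardet's actual alias table and canonical names may differ.

```python
import codecs

# Illustrative subset only; the targets are Python codec names chosen for this sketch.
ALIASES = {
    "x-cp1252": "cp1252",
    "x-sjis": "shift_jis",
    "dos-874": "cp874",
    "csutf8": "utf-8",
}


def resolve_label(label: str) -> str:
    """Normalize a charset label and map it to a Python codec name."""
    key = label.strip().lower()  # labels are matched case-insensitively
    name = ALIASES.get(key, key)
    codecs.lookup(name)  # raises LookupError if Python has no such codec
    return name


print(resolve_label(" X-SJIS "))  # shift_jis
print(resolve_label("csUTF8"))    # utf-8
```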

Full Changelog: chardet/chardet@7.4.1...7.4.2

7.4.1

Bug Fixes

  • BOM-prefixed UTF-16/32 input now returns utf-16/utf-32 instead of utf-16-le/utf-16-be/utf-32-le/utf-32-be. The endian-specific codecs don't strip the BOM on decode, so callers were getting a stray U+FEFF at the start of their text. BOM-less detection is unchanged. (#364, #365)
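
The difference between the generic and endian-specific codecs can be checked with the standard library. This is a small demonstration of Python's codec behavior, not chardet code:

```python
import codecs

# UTF-16 little-endian text with a leading byte order mark.
data = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")

with_generic = data.decode("utf-16")    # BOM is consumed to pick the byte order
with_endian = data.decode("utf-16-le")  # BOM is decoded as an ordinary character

print(repr(with_generic))  # 'hi'
print(repr(with_endian))   # '\ufeffhi'
```

A caller that decodes with the name reported by the detector therefore gets clean text only when BOM-prefixed input is reported as utf-16 rather than utf-16-le.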

Full Changelog: chardet/chardet@7.4.0...7.4.1

Changelog

Sourced from chardet's changelog.

7.4.3 (2026-04-13)

Bug Fixes:

  • Fixed ValueError: embedded null character crash when input contained a <meta charset> declaration with a null byte in the encoding name (e.g. b'<meta charset="\x00utf-8">'). codecs.lookup() raises ValueError on embedded nulls, and lookup_encoding() was only catching LookupError. Also added defensive ValueError catches in _validate_bytes() and _to_utf8() for completeness. ([Dan Blanchard](https://github.com/dan-blanchard) via Claude, [#369](https://github.com/chardet/chardet/issues/369))

7.4.2 (2026-04-12)

Bug Fixes:

  • Fixed RuntimeError: pipeline must always return at least one result on ~2% of all possible two-byte inputs (e.g. b"\xf9\x92"). Multi-byte encodings like CP932 and Johab could score above the structural confidence threshold on very short inputs, but then statistical scoring would return nothing, leaving the pipeline with an empty result list instead of falling through to the no_match_encoding fallback. ([Jason Barnett](https://github.com/jasonwbarnett) via Claude, [#367](https://github.com/chardet/chardet/issues/367), [#368](https://github.com/chardet/chardet/pull/368))

Improvements:

  • Added ~90 encoding aliases from the WHATWG Encoding Standard and IANA Character Sets registry so that <meta charset> labels like x-cp1252, x-sjis, dos-874, csUTF8, and the cswindows* family all resolve correctly through the markup detection stage. Every alias was driven by a failing spec-compliance test. ([Dan Blanchard](https://github.com/dan-blanchard) via Claude, [#366](https://github.com/chardet/chardet/pull/366))
  • Added a spec-compliance test suite covering Python decode round-trips for all 86 registry encodings, WHATWG web-platform label resolution, IANA preferred MIME names, and Unicode/RFC conformance (BOM sniffing, UTF-8 boundary cases, UTF-16 surrogate pairs). This is the test suite that would have caught the BOM bug fixed in 7.4.1 before release. ([Dan Blanchard](https://github.com/dan-blanchard) via Claude, [#366](https://github.com/chardet/chardet/pull/366))

7.4.1 (2026-04-07)

... (truncated)

Commits
  • 8f404a5 docs: set 7.4.3 release date to 2026-04-13
  • 7a6667f docs: fix changelog attribution for #369
  • a1fc986 docs: changelog for 7.4.3
  • 0af01d6 Fix ValueError crash on null bytes in charset declarations (#369)
  • 08e4ebc ci: parallelize riscv64 builds across 5 RISE runners
  • 2f6e1e9 ci: use python3 -m pip on riscv64 runner
  • 204623d ci: invoke cibuildwheel manually on riscv64 runner
  • 78c1d20 ci: use native runners for aarch64/riscv64 instead of QEMU
  • 3cc0960 docs: changelog for 7.4.2
  • 9079efc Fix RuntimeError on ~2% of two-byte inputs (#368)
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [chardet](https://github.com/chardet/chardet) from 7.4.0.post1 to 7.4.3.
- [Release notes](https://github.com/chardet/chardet/releases)
- [Changelog](https://github.com/chardet/chardet/blob/main/docs/changelog.rst)
- [Commits](chardet/chardet@7.4.0.post1...7.4.3)

---
updated-dependencies:
- dependency-name: chardet
  dependency-version: 7.4.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the dependencies and python labels on Apr 20, 2026
Labels

  • dependencies: Pull requests that update a dependency file
  • python: Pull requests that update Python code
