Skip to content

Conversation

@andreer
Copy link
Contributor

@andreer andreer commented Dec 18, 2025

Changes

Add readahead buffer to C decoder to improve performance by avoiding many small reads.

I've not added test or updated docs yet, will do later if this makes sense.

Checklist

If this is a user-facing code change, like a bugfix or a new feature, please ensure that
you've fulfilled the following conditions (where applicable):

  • You've added tests (in tests/) which would fail without your patch
  • You've updated the documentation (in docs/), in case of behavior changes or new
    features
  • You've added a new changelog entry (in docs/versionhistory.rst).

If this is a trivial change, like a typo fix or a code reformatting, then you can ignore
these instructions.

Updating the changelog

If there are no entries after the last release, use **UNRELEASED** as the version.
If, say, your patch fixes issue #123, the entry should look like this:

- Fix big bad boo-boo in the encoder
  (`#123 <https://github.com/agronholm/cbor2/issues/123>`_; PR by @yourgithubaccount)

If there's no issue linked, just link to your pull request instead by updating the
changelog after you've created the PR.

@coveralls
Copy link

coveralls commented Dec 18, 2025

Coverage Status

coverage: 94.58%. remained the same
when pulling 24f0714 on andreer:master
into 0bcf400 on agronholm:master.

@agronholm
Copy link
Owner

I think a changelog entry should be added for this.

Copy link
Owner

@agronholm agronholm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one change request.

@andreer
Copy link
Contributor Author

andreer commented Dec 21, 2025

On further inspection I'm also introducing another bug here in the case where decode_from_bytes is called on tag data during decoding, as described in the documentation:

cbor2/docs/customizing.rst

Lines 99 to 112 in 403c2ce

def tag_hook(decoder, tag, shareable_index=None):
# Return all other tags as-is
if tag.tag != 3000:
return tag
# Create a raw instance before initializing its state to make it possible for cyclic
# references to work
instance = MyType.__new__(MyType)
decoder.set_shareable(shareable_index, instance)
# Separately decode the state of the new object and then apply it
state = decoder.decode_from_bytes(tag.value)
instance.__dict__.update(state)
return instance

Currently I believe this will overwrite the readahead buffer, losing bytes. Will need another look.

@agronholm
Copy link
Owner

Is it good for another round now?

@andreer
Copy link
Contributor Author

andreer commented Dec 26, 2025

Ugh, a little bit of shotgun coding there. Since that failed twice I'd like to review it again more thoroughly tomorrow before adding a changelog entry.

- configurable readahead buffer (default 4KB) to reduce Python read() calls
- add read_size parameter to CBORDecoder.__init__
- per-instance buffers for thread-safety
@agronholm
Copy link
Owner

Please don't force push changes. It makes it hard for me to track changes you've made.

@andreer
Copy link
Contributor Author

andreer commented Dec 27, 2025

Ok, now I've convinced myself I understand what the issue was and that it's properly fixed. I ended up refactoring a bit, there was a lot of back and forth in the commits so I thought it made most sense to squash it. The diff should be smaller and hopefully easier to review now.

@agronholm
Copy link
Owner

Now I'll have to re-read everything, as I have no clue what the problem was before or what you did to fix it.

@andreer
Copy link
Contributor Author

andreer commented Dec 27, 2025

Ah, sorry about that. The un-squashed commit history is here: https://github.com/andreer/cbor2/commits/unsquashed/

The issue I referred to was a use-after-free which was fixed in andreer@559d387

@agronholm
Copy link
Owner

Thanks. I'll give this a new review today (Sunday).

Copy link
Owner

@agronholm agronholm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find any problems. Looks good! Thanks.

@agronholm agronholm merged commit fb4ee16 into agronholm:master Dec 29, 2025
14 checks passed
agronholm added a commit that referenced this pull request Dec 30, 2025
@andreer
Copy link
Contributor Author

andreer commented Dec 31, 2025

Thank you, and happy new year!

@Pastea
Copy link

Pastea commented Jan 7, 2026

Hi, this PR was linked to the CVE-2025-68131 but I think that this is incorrect, could you please confirm if instead the correct fixing commit is this one f1d701cd2c411ee40bb1fe383afe7f365f35abf0? Thanks

@agronholm
Copy link
Owner

Hi, this PR was linked to the CVE-2025-68131 but I think that this is incorrect, could you please confirm if instead the correct fixing commit is this one f1d701cd2c411ee40bb1fe383afe7f365f35abf0? Thanks

Indeed. I don't know why this PR was linked to that CVE, or how to control what PR is linked if at all.

@Pastea
Copy link

Pastea commented Jan 7, 2026

I think that it takes data from the references inside the github advisory, just created a PR to adjust it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants