Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-128965: pickle load_build function checks if state is None, not False #128966

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

Legoclones
Copy link
Contributor

@Legoclones Legoclones commented Jan 17, 2025

Inside of the load_build() function for pickle's BUILD opcode, the C accelerator at one point checks if state is Py_None, while the Python version only checks if state.

if (state != Py_None) {

if state:

This means if state is something like an empty dictionary or tuple, the code block under the if statement WILL be run in _pickle.c, but NOT in pickle.py.

As an example, the bytestream b']]b.' has the following disassembly:

    0: ]    EMPTY_LIST
    1: ]    EMPTY_LIST
    2: b    BUILD
    3: .    STOP
highest protocol among opcodes = 1

This will do nothing in pickle.py but error out in _pickle.c with the message state is not a dictionary. The easy solution is to change if state to if state != None, and it shouldn't break any existing functionality. I've attached a pull request.

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have tests that check this code path? Namely check that non-dict objects raise a TypeError and that falsey dicts do not raise exceptions? In addition, we need a NEWS entry for the bugfix, e.g:

Fix :meth:`!pickle._Pickler.load_build` for non-dictionary states.

(My NEWS entry is definitely poorly worded but I don't have a better suggestion for now. Maybe you can come up with a better formulation).

Legoclones and others added 3 commits January 21, 2025 16:59
Co-authored-by: Bénédikt Tran <[email protected]>
Added 6 tests where the `state` is falsey but not None, and the `inst` value is invalid - these should throw an AttributeError because the `__dict__` attribute doesn't exist for `inst`. In `pickle.py`, if `state` is falsey but not None code is never run to check that `inst` is a valid object (but it does happen in the C accelerator).
@Legoclones
Copy link
Contributor Author

@picnixz I went ahead and added some tests and a NEWS entry. The problem here isn't that load_build is not detecting non-dictionary states. The problem is that if the inst argument of load_build is not a valid object (aka, doesn't have the __dict__ attribute), then a falsey state attribute won't cause it error out in Python, but the C accelerator does. The C accelerator checks if state != None, but Python only checks if it's falsey. By changing the check in Python to state is not None, then when a falsey state argument is provided, inst is actually checked to see if it's a valid object and will error now because it's not.

@Legoclones
Copy link
Contributor Author

Okay, I'm not sure how to specify in the test that I want the Python version of the loads() function/Unpickler, not the C version. The loads() function of PyUnpicklerTests appears to use the Python version since it references pickle._Unpickler, but several of the tests were erroring out because they were using the C version (which throws an UnpicklingError instead of an AttributeError). Also, the C version was working fine, but the Python version is what's being changed here, so I want to make sure the tests are running on the Python version. What am I missing? I tried _loads but that doesn't exist in PyUnpicklerTests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants