Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-128715: Expose ctypes.CField, with info attributes #128950

Open
wants to merge 35 commits into
base: main
Choose a base branch
from

Conversation

encukou
Copy link
Member

@encukou encukou commented Jan 17, 2025

Expose the type of struct/union field descriptors (_ctypes._CField) as ctypes.CField. (This is mainly to make it easier to refer to it -- for typing and documentation. Creating CFields manually is not supported.)

Add attributes for easier introspection:

  • name & type, as defined in _fields_
  • byte_size & byte_offset, specify the chunk of memory to read/write
  • bit_size & bit_offset, which specify the shift & mask for bitfields
  • is_bitfield & is_anonymous booleans

The existing offset remains, as an alias for byte_offset.
The existing size is unchanged. Usually it is the same as byte_size, but for bitfields, it contains a bit-packed combination of bit_size & bit_offset.

Update the repr() of CField.

Use the same values internally as well. (Except bit_size, which might overflow Py_ssize_t for very large types. Instead, in C use bitfield_size, which is 0 for non-bitfields and thus serves as is_bitfield flag. Different name used clarity.)
Old names are removed from the C implementation to ensure a clean transition.
For simplicity, I keep byte_size in CFieldObject for now, even though it's redundant: it's the size of the underlying type.

Add a generic test that ensures the values are consistent, and that we don't have overlapping fields in structs. Use this check throughout test_ctypes, wherever a potentially interesting Structure or Union is created. (Tests for how simple structs behave after they're created don't need the checks, but I erred on the side of adding checks.)

Lift the restriction on maximum field size that was temporarily added in GH-126938. The max size is now Py_ssize_t.

This PR does not yet touch cfield.c getters & setters: the bit-packed “size” argument is computed and passed to those.


📚 Documentation preview 📚: https://cpython-previews--128950.org.readthedocs.build/

@encukou encukou marked this pull request as draft January 31, 2025 16:47
@encukou
Copy link
Member Author

encukou commented Feb 7, 2025

Thank you for the review!
Here's more...

I spent two Fridays learning the details about big-endian bitfield layout. The result is in the new commits here.

The C standards use the term "addressable storage unit" for the memory that bitfields live in. I now use storage unit in the docs as well.
The bit_offset attribute is now defined as the bit shift amount (on a native integer) needed to get the bitfield value from the storage unit. (That's what ctypes used in size before; I hope it'll now be clear to anyone who comes after me.)

@encukou encukou added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 7, 2025
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @encukou for commit 7ce3cb9 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F30617%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 7, 2025
@encukou
Copy link
Member Author

encukou commented Feb 7, 2025

I'll continue next week :)

- Add alignment requirement
- Mention that ob_size is unreliable if you don't control it
- Add some links for context
- basicsize should include the base type in generaly not just PyObject

This adds a “by-the-way” link to `PyObject_New`, which shouldn't be
used for GC types. In order to be comfortable linking to it, I also
add a link to `PyObject_GC_New` from its docs. And the same for
`*Var` variants, while I'm here.
@picnixz
Copy link
Member

picnixz commented Feb 8, 2025

Should I review what you wrote now or do you prefer me to wait?

@encukou
Copy link
Member Author

encukou commented Feb 21, 2025

Sorry, I missed the comment! I'd appreciate a review.

The failure on 32-bit Debian is due to an existing bug, but such an exotic one that it's not worth fixing now: #130410
(It shows that the tests work, though!)

@encukou encukou marked this pull request as ready for review February 25, 2025 12:39
@encukou
Copy link
Member Author

encukou commented Mar 10, 2025

@picnixz, should I wait for your review?
No pressure, feel free to say no :)

@picnixz
Copy link
Member

picnixz commented Mar 10, 2025

Oh I forgot... I'll have maybe a bit of time before taking my plane or I'll be back on Wednesday. If you want to wait until then, I'll review it otherwise just go ahead!

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments (hard to do a better review on a phone)



class TestAlignedStructures(unittest.TestCase):
class TestAlignedStructures(unittest.TestCase, StructCheckMixin):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely sure but shouldn't mixin classes appear before non-mixin ones?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the methods/attributes don't overlap, it doesn't matter.

@@ -56,51 +55,41 @@ Py_ssize_t LOW_BIT(Py_ssize_t offset);
@classmethod
_ctypes.CField.__new__ as PyCField_new

*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the name could be non-keyword?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do that when it's exposed as a public API.

Comment on lines +428 to +429
(Py_ssize_t)field->bitfield_size,
(Py_ssize_t)field->bit_offset);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it usually preferrable to have a cast and use zd or directly use PRI8u?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I don't know. Py_ssize_t seems more readable to me.

Here, it's an intentional (but somewhat obscure) choice -- these are sizes; they're just packed rather aggressively. Before there's a dedicated CBitField subclass, I don't want to waste memory on bitfield data.

Copy link
Member Author

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review!



class TestAlignedStructures(unittest.TestCase):
class TestAlignedStructures(unittest.TestCase, StructCheckMixin):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the methods/attributes don't overlap, it doesn't matter.

@@ -56,51 +55,41 @@ Py_ssize_t LOW_BIT(Py_ssize_t offset);
@classmethod
_ctypes.CField.__new__ as PyCField_new

*
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do that when it's exposed as a public API.

name, byte_size);
}
bitfield_size = PyLong_AsSsize_t(bit_size_obj);
if ((bitfield_size <= 0) || (bitfield_size > 255)) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't trust C's operator precedence :)

Comment on lines +428 to +429
(Py_ssize_t)field->bitfield_size,
(Py_ssize_t)field->bit_offset);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I don't know. Py_ssize_t seems more readable to me.

Here, it's an intentional (but somewhat obscure) choice -- these are sizes; they're just packed rather aggressively. Before there's a dedicated CBitField subclass, I don't want to waste memory on bitfield data.

@@ -110,17 +99,12 @@ PyCField_new_impl(PyTypeObject *type, PyObject *name, PyObject *proto,
"type of field %R must be a C type", self->name);
Copy link
Contributor

@sergey-miryanov sergey-miryanov Mar 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point self is NULL. I think should be just name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should convert name to an exact Unicode object first though but not bind it to self yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh, thanks!

%R on the input is fine for error messages -- the user is getting what they asked for.

if (!self) {
return NULL;
}
self->name = PyUnicode_FromObject(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@picnixz Eah, move this to the top and add a cleanup on error, if I get you right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants