Skip to content

Failing test for std::weak_ptr used with derived Python class and py::smart_holder #5624

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

nsoblath
Copy link

Description

PR is in support of issue #5623.

The added test shows that if you hold a reference to a derived Python object with a C++ std::weak_ptr, and the relevant classes use py::smart_holder, the weak_ptr will expire even if the Python object still exists.

virtual ~WpBase() = default;
};

struct PyWpBase : WpBase {
Copy link
Collaborator

@rwgk rwgk Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

struct PyWpBase : WpBase, py::trampoline_self_life_support {

Does that help?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I just fixed an oversight in my previous comment.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
struct PyWpBase : WpBase {
struct PyWpBase : WpBase, py::trampoline_self_life_support {

@rwgk
Copy link
Collaborator

rwgk commented Apr 19, 2025

I simplified the reproducer and changed the variable names to make them more intuitive (to me).

I'm guessing we're building an intermediate shared_ptr<VirtBase> that has a different control block than the shared_ptr held by the smart_holder.

I wrote that code four years ago. Quite likely this will take me several hours to re-learn what exactly is happening. Not sure when I'll get to it.

@nsoblath How did you arrive at that test? Did you have code that worked with std::shared_ptr as holder?

…s holder). Tweak test_class_sh_trampoline_weak_ptr.py to pass, with `# THIS NEEDS FIXING` comment.
@rwgk
Copy link
Collaborator

rwgk commented Apr 19, 2025

@nsoblath How did you arrive at that test? Did you have code that worked with std::shared_ptr as holder?

On that suspicion, I added test_class_sp_trampoline_weak_ptr.cpp,py, and it seems to work: it certainly works locally, but I'm waiting for GitHub Actions to finish.

Assuming the new test works reliably on all platforms, I'd guess that it's feasible to fix the smart_holder equivalent.

rwgk added 3 commits April 19, 2025 13:39
```
/__w/pybind11/pybind11/tests/test_class_sh_trampoline_weak_ptr.cpp:23:43: error: the parameter 'sp' is copied for each invocation but only used as a const reference; consider making it a const reference [performance-unnecessary-value-param,-warnings-as-errors]
   23 |     void set_wp(std::shared_ptr<VirtBase> sp) { wp = sp; }
      |                                           ^
      |                 const                    &
4443 warnings generated.
```

```
/__w/pybind11/pybind11/tests/test_class_sp_trampoline_weak_ptr.cpp:23:43: error: the parameter 'sp' is copied for each invocation but only used as a const reference; consider making it a const reference [performance-unnecessary-value-param,-warnings-as-errors]
   23 |     void set_wp(std::shared_ptr<VirtBase> sp) { wp = sp; }
      |                                           ^
      |                 const                    &
4430 warnings generated.
```
@rwgk
Copy link
Collaborator

rwgk commented Apr 19, 2025

Before I wrote:

I'm guessing we're building an intermediate shared_ptr<VirtBase> that has a different control block than the shared_ptr held by the smart_holder.

That's here (conclusive):

Note that the smart_holder instance is readily available in the same context:

hld.vptr is the shared_ptr instance with the control block that we need in order to fix #5623.

So in line 782 we need to return a shared_ptr that

  • has the same control block as hld.vptr
  • but also owns a Python reference count for loaded_v_h.inst

Any ideas, please let me know. (I haven't thought about it much yet; the penny didn't want to drop immediately.)

@virtuald for visibility

@nsoblath
Copy link
Author

@nsoblath How did you arrive at that test? Did you have code that worked with std::shared_ptr as holder?

I copied the test_class_sh_trampoline_shared_ptr_cpp_arg test, modified it to store a std::weak_ptr and stripped out everything that wasn't needed. It replicates the essential parts of the application I have that's not working. I didn't try using std::shared_ptr because the application I have also needs the benefits of py::smart_holder.

@rwgk
Copy link
Collaborator

rwgk commented Apr 20, 2025

I've now turned this over in my mind for a couple hours, and consulted ChatGPT, which I believe got very confused/confusing, but please see for yourself here.

Stepping out of the box:

  • The code I pointed out before is pulling a trick to avoid inheritance slicing ([BUG] Problem when creating derived Python objects from C++ (inheritance slicing) #1333), by keeping the derived Python object alive as long as the returned shared_ptr exists.

  • But you don't actually want a shared_ptr, you actually want a weak_ptr, which isn't meant to keep anything alive.

  • So in your particular case, the trick to avoid inheritance slicing backfires. It fixes one problem, but creates another one.

Solving this is a real brain teaser. I'm not sure I can help. I'm not sure if there is a viable solution. I have a couple ideas, but each of them might take hours of work only to find out it leads to a dead end.

Could you try to work around this limitation? For example, instead of owning the C++ weak_ptr, own a Python weakref instead (in WpOwner). Dereference the Python weakref to see if the Python object is still alive, then you can safely extract the C++ pointer if it is (in WpOwner::get_code).

rwgk added 3 commits April 20, 2025 10:54
…s does not inherit from this class, preserve established Inheritance Slicing behavior.

rwgk reached this point with the help of ChatGPT:

* https://chatgpt.com/share/68056498-7d94-8008-8ff0-232e2aba451c

The only production code change in this commit is:

```
diff --git a/include/pybind11/detail/type_caster_base.h b/include/pybind11/detail/type_caster_base.h
index d4f9a41e..f3d45301 100644
--- a/include/pybind11/detail/type_caster_base.h
+++ b/include/pybind11/detail/type_caster_base.h
@@ -776,6 +776,14 @@ struct load_helper : value_and_holder_helper {
                 if (released_ptr) {
                     return std::shared_ptr<T>(released_ptr, type_raw_ptr);
                 }
+                auto *self_life_support
+                    = dynamic_raw_ptr_cast_if_possible<trampoline_self_life_support>(type_raw_ptr);
+                if (self_life_support == nullptr) {
+                    std::shared_ptr<void> void_shd_ptr = hld.template as_shared_ptr<void>();
+                    std::shared_ptr<T> to_be_released(void_shd_ptr, type_raw_ptr);
+                    vptr_gd_ptr->released_ptr = to_be_released;
+                    return to_be_released;
+                }
                 std::shared_ptr<T> to_be_released(
                     type_raw_ptr, shared_ptr_trampoline_self_life_support(loaded_v_h.inst));
                 vptr_gd_ptr->released_ptr = to_be_released;
```
@rwgk
Copy link
Collaborator

rwgk commented Apr 20, 2025

I'm testing commit 4638e01 to see if that works on all platforms.

See https://chatgpt.com/share/68056498-7d94-8008-8ff0-232e2aba451c for how I got there / rationale.

I'm not sure if 4638e01 is a wise twist to add.

Fundamentally, it'd be better to support std::weak_ptr conversions more directly, but that's certainly a lot more work than I'm able to invest into pybind11.

```
    11>D:\a\pybind11\pybind11\tests\test_class_sp_trampoline_weak_ptr.cpp(44,50): error C2220: the following warning is treated as an error [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
    11>D:\a\pybind11\pybind11\tests\test_class_sp_trampoline_weak_ptr.cpp(44,50): warning C4458: declaration of 'sp' hides class member [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
             D:\a\pybind11\pybind11\tests\test_class_sp_trampoline_weak_ptr.cpp(54,31):
             see declaration of 'pybind11_tests::class_sp_trampoline_weak_ptr::SpOwner::sp'
```
@rwgk
Copy link
Collaborator

rwgk commented Apr 21, 2025

@nsoblath I feel I'm pretty clear now about the situation. I think it's best not to change the smart_holder code, but to go with the approach outlined before:

instead of owning the C++ weak_ptr, own a Python weakref instead (in WpOwner). Dereference the Python weakref to see if the Python object is still alive, then you can safely extract the C++ pointer if it is (in WpOwner::get_code).

I'll explain more when I get to it, probably next weekend.

(The only thing I want to change, probably, is to add a static_assert, to enforce that py::trampoline_self_life_support is used if the holder is smart_holder, and not used otherwise. This doesn't change anything at runtime, but is to avoid leaving users unclear.)

@nsoblath
Copy link
Author

@rwgk Thank you very much for your time and attention looking into this. I really appreciate it.

Though my understanding of the internals of pybind11 is quite incomplete, my thoughts about what might be going wrong were similar to what you suggested: that the control block for the shared_ptr<VirtBase> that's extracted is different from the one held by the smart_holder. Do I understand correctly that the change you added in 4638e01 fixes this problem? That's what it looks like since the test_class_sh_trampoline_weak_ptr tests now pass.

In your later message you said

I think it's best not to change the smart_holder code, . . .

Does that mean you're preferring to not use the change in 4638e01? Is there a downside to making sure that the control block of the returned shared_ptr matches the one held by the smart_holder in this particular situation?

Regarding the alternative solution you proposed:

instead of owning the C++ weak_ptr, own a Python weakref instead (in WpOwner). Dereference the Python weakref to see if the Python object is still alive, then you can safely extract the C++ pointer if it is (in WpOwner::get_code).

I believe I can implement something to this effect. It's not ideal because the library where I'm using this is sometimes used as a pure-C++ library, sometimes used with its Python bindings. WpOwner (which in reality holds a collection of weak_ptrs) can simultaneously hold pointers to pure C++ classes and Python classes. When the library is used without Python bindings, I don't want to require the user to also have Python and Pybind11 installed. However, I can probably make a struct that combines weak_ptr and weakref and handles the source differences between when the Python bindings are used and when they're not.

@rwgk
Copy link
Collaborator

rwgk commented Apr 21, 2025

Does that mean you're preferring to not use the change in 4638e01?

Yes.

Is there a downside to making sure that the control block of the returned shared_ptr matches the one held by the smart_holder in this particular situation?

Yes, it'll be very confusing to many (I think), while only being useful in a few niche cases.

It is not obvious at all that the shared_ptr that works for you is subject to inheritance slicing.

I feel it'll be better to keep the smart_holder feature simpler and the behavior consistent. It will be easier to reason about the bindings.

We could probably add an interface that makes it relatively easy for you to get the inheritance-slicing-shared_ptr from the Python object, which you could use from a lambda function in the pybind11 bindings.

What I didn't explain: I convinced myself that it is impossible to get a shared_ptr that does both:

  • Keeps the PyDrvd alive (in other words, isn't subject to inheritance slicing).
  • Has the same control block as the shared_ptr inside the holder.

Reason: This would introduce a hard reference cycle, because the Python object (PyDrvd) owns the holder.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants