Support S7 extension of S4 classes and vice-versa #659

Open
lawremi wants to merge 72 commits into main from issue-456-s4-extends-s7

@lawremi lawremi commented May 27, 2026

Collaborator

Fixes #456.

First iteration on enabling an S7 class to extend an S4 class and vice-versa.

The concrete, fail-fast goal is to enable SummarizedExperiment, a complex, central package in Bioconductor, to be rewritten in S7, based on S4 classes defined lower in the stack, like those from S4Vectors and IRanges, while not breaking compatibility with dependent packages, like SingleCellExperiment, via S4 shims extending the S7 classes. SingleCellExperiment is complex and deeply dependent on SummarizedExperiment, so if it continues working, it is a very good indicator.

What has been achieved:

  • Fixed a bug in the methods package related to upcasting S4 objects derived from old classes. The fix should be part of the next patch release; until then, it will cause failures in the checks that run against the release version.
  • Modified S4Vectors devel branch to use a class1() helper to be explicit when it wants the scalar name of the object's class and to ensure that it is robust to S3-style class vectors. Also improved the setValidity2() helper so that validity functions are enclosed in the S4Vectors namespace, enabling S4 class objects to remain identical across lazy loading. This is important since S4 class objects end up in the S7 class object, which is carried by S7 objects, and we want them to pass identity checks across packages.
  • Rewrote SummarizedExperiment (branch) to use S7 classes, S7 generics and S7 syntax for defining methods on upstream S4 generics. Also defined S4 classes with the original names that derive from their corresponding S7 classes as a compatibility layer. It passes its tests, which were ported to S7 while preserving the original logic.
  • Ensured SingleCellExperiment passes tests without any modification.
  • The path to Bioconductor adoption of S7 is now wide open.

For S7 classes extending S4 classes, the current approach is to instantiate S7 objects, meaning that the S4 bit is not set. The justification is that there are S4 object invariants that an S7 object cannot satisfy, and vice versa, the biggest being the return value of class(). Instead we have an S7 shim that does a reasonable job of mimicking an S4 object, including slot access, validity checking, and initialization, but not class() compatibility. See the evolving man page insert for more details.

To extend an S7 class using setClass(), it is necessary to first register an S4 shim using S4_register_contains(). The shim is an ordinary S4 class that derives from the old class representation of the S7 class, which is automatically created by S4_register() if needed. It defines the S7 properties, as well as the S7_class attribute, as slots. This is necessary so that the automatically defined upcast coercions, used for example in validObject(), preserve the attributes S7 requires. Any instances of an S7-derived S4 class are proper S4 objects, with the bit set and a scalar class vector.
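
For readers less familiar with the S4 side, here is a minimal sketch of the general mechanism the shim builds on: an S4 class that extends an S3 (old) class. It uses `data.frame`, which the methods package already registers as an old class, instead of an S7 class, so it illustrates the idea only by analogy. `StampedFrame` and its `note` slot are hypothetical names, and this is not the shim code from this PR.

```r
library(methods)

# Hypothetical S4 class extending the registered S3 class "data.frame".
setClass("StampedFrame", contains = "data.frame",
         slots = c(note = "character"))

sf <- new("StampedFrame", data.frame(a = 1:3), note = "demo")

isS4(sf)                     # the S4 bit is set on the instance
length(class(sf))            # class() is a single name, not an S3-style vector
is(sf, "data.frame")         # inheritance from the old class still holds
S3Part(sf, strictS3 = TRUE)  # recovers the plain S3 object
```

The same properties (S4 bit set, scalar class vector, inheritance from the old class) are what the description above claims for instances of an S7-derived S4 class.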

Besides adding and changing hundreds of lines of logic around S4 classes, significant changes to support this in S7 include:

  • Made S7 property storage compatible with S4 NULL slot sentinels. Until now, setting an S7 property to NULL removed the attribute, and S4 does not like it when its slots disappear from the attributes.
  • The inheritance checks in prop.c need to be robust to scalar class vectors, using @.S3Class instead.
  • Added S4 method registration for internal generics when S7 method signatures include classes with S4 ancestry. S4 methods on internal generics take precedence over S3 methods, so when S7 defines a method on an internal generic, it should define an S4 method if any of the classes in the signature are S4-derived classes, since otherwise it might be overridden by an S4 method defined for a parent class. S7 must also support multi-dispatch methods on internal generics where S4 supports it. The dimnames<-() generic is an example.
  • @<-.S7_object() needs to support slots defined on S4 children that are not S7 properties, because S4 classes that extend S7_object will dispatch to the method, even when the S4 bit is set.
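
The NULL mismatch in the first bullet can be seen with base R and the methods package alone. This is a small sketch, not S7 code: `Holder` is a hypothetical class, and the comparison only shows how a plain attribute and an S4 slot react differently to a NULL assignment.

```r
library(methods)

# Plain attributes: assigning NULL deletes the attribute.
x <- structure(list(), prop = 1)
attr(x, "prop") <- NULL
plain_has_attr <- "prop" %in% names(attributes(x))

# S4 slots: assigning NULL keeps the attribute, stored as an internal sentinel.
setClass("Holder", slots = c(value = "ANY"))  # hypothetical class
h <- new("Holder", value = 1)
h@value <- NULL
slot_is_null  <- is.null(h@value)                     # reads back as NULL
slot_has_attr <- "value" %in% names(attributes(h))    # but the slot still exists
```

An S7 property stored as a plain attribute would follow the first pattern, which is why an S4 child class would see its slot vanish unless property storage follows the second.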

Resolving the two incompatible class() return values, scalar vs. non-scalar, is a larger issue. My proposal is that we export methods:::.class1() as methods::class1() and encourage all S4 code to use class1() to make the expectation of a scalar return value explicit. Practically, we could update the core of Bioconductor to adhere to that policy and thus enable new Bioconductor packages to extend the infrastructure with S7.
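
To make the scalar vs. non-scalar problem concrete, here is a rough sketch. The `class1()` defined below is a hypothetical stand-in for the proposed export and assumes it returns the first element of the class vector; `Ranges`, `describe_naive()` and `describe_class1()` are likewise made-up names, and the S7 object is imitated with a plain S3-style class vector.

```r
library(methods)

# Hypothetical stand-in for the proposed methods::class1(); assumes it
# returns the first element of the class vector.
class1 <- function(x) class(x)[[1L]]

setClass("Ranges", slots = c(start = "numeric"))  # hypothetical S4 class
s4_obj <- new("Ranges", start = 1)

# S3-style object with a multi-element class vector, as S7 objects carry.
s7_like <- structure(list(), class = c("Child", "Parent", "S7_object"))

# Code that assumes class() is scalar.
describe_naive <- function(x) {
  switch(class(x), Ranges = "ranges", Child = "child", "other")
}

# Same logic, with the scalar expectation made explicit.
describe_class1 <- function(x) {
  switch(class1(x), Ranges = "ranges", Child = "child", "other")
}

describe_naive(s4_obj)  # fine: class() has length 1
naive_result <- tryCatch(describe_naive(s7_like), error = function(e) "error")
describe_class1(s7_like)  # works for both kinds of object
```

`switch()` requires a length-one selector, so the naive version fails on the multi-element class vector while the `class1()` version handles both objects.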

@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch 6 times, most recently from 87b33e2 to 5e5d36d Compare June 3, 2026 00:26
@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch from 5e5d36d to 3650492 Compare June 7, 2026 20:14
@lawremi lawremi changed the title Support S7 extension of S4 classes Support S7 extension of S4 classes and vice-versa Jun 7, 2026
lawremi added 22 commits June 7, 2026 14:40
…cing to an S3 class, and use methods::extends() to capture the entire class vector, not just the first class
…@() behavior, which needs to change in base R.
…S4_register() can now communicate any S7 class structure through the S4class= argument.
…t initialize() except uses the S7 property setting path.
…ot really slots, they're properties; drop the initialize() method for the same reason, and instead use setValidity() to ensure S4 code properly checks for validity. This also catches the odd case where new() is used on an old class, like with new(class(x)), creating an object that should probably not exist.
… the methods package, will crash and burn on class vectors of length > 1
lawremi added 25 commits June 7, 2026 14:40
…m S7), we need to use R_do_slot_assign so that a NULL value does not delete the S4 slot.
…S4 object is just that, even if it inherits from an S7 class and thus carries an S7_class slot.
…ility with S4; arguably a generally better approach since it avoids adding and deleting attributes based on whether the value is NULL or not.
…ults. We evaluate these if they are language objects, so S7 classes extending or being extended by S4 classes need to ensure that evaluation can happen at build time. Also mark it VIRTUAL because it should never be constructed directly.
… an S4 instance of that class (old classes are virtual normally but not in this case since we pass an S4Class prototype)
…ic is internal and an element of the signature has an S4 ancestor. This is needed because internal generics will favor S4 methods on S4 objects, so there is potential for inheriting overrides.
…ary S4 classes instead of old classes, because an old class implies an S3 instance of the object can exist, ie, there can be an S3Part().
… it to get stripped during S4 upcasting, breaking the S7 object
…in principle, there should only be one ::S4Slots shim for each S4 derivative of an S7 class
@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch from 3650492 to eb1092b Compare June 7, 2026 21:40
@lawremi lawremi requested review from hadley and t-kalinowski June 7, 2026 21:42

@hadley hadley left a comment

Member

I assume you know infinitely more about S4 than me 😄 so I focussed on docs and style.

Also need an update to NEWS.md?

@@ -0,0 +1,63 @@
`new_class()` can use an S4 class as its parent. The S4 class can be supplied

I think this would probably be better off in vignettes/compatibility.Rmd. We're not super consistent on where we put these docs, but I think longer form stuff is just easier to read in vignettes, and then it's next to the S3 equivalent.

methods::is(Child(x = 1, y = "a"), "Parent")
```

Things that work reasonably:

Suggested change
- Things that work reasonably:
+ Things that work reasonably well:

old-class object, not as an S4-bit object, so `isS4(child)` is `FALSE`.
This avoids advertising full S4 object invariants that the object does not
satisfy.
* The S3 class vector is not scalar. S4 code that needs one primary class name

How common is this use of scalar class()?

`S7_object` through the old-class graph while lacking the `S7_class` slot or
attribute that makes it an S7 object. Code that updates an existing object
should prefer `methods::initialize(x, ...)`.
* S4 methods that call `methods::slot<-()` can bypass S7 property setters and

This is consistent with the design of S4, right? That is, you are generally responsible for validation after directly modifying a slot.
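
A quick sketch of that S4 behaviour, using only the methods package. `Positive` is a hypothetical class, and nothing here involves S7 property setters; it only shows that direct slot assignment checks the slot's type but does not run the validity function.

```r
library(methods)

# Hypothetical class whose validity function rejects negative values.
setClass("Positive", slots = c(value = "numeric"),
         validity = function(object) {
           if (any(object@value < 0)) "value must be non-negative" else TRUE
         })

p <- new("Positive", value = 1)

# Direct slot assignment is type-checked, but the validity function is not run.
p@value <- -5

# The caller has to ask for validation explicitly.
check <- tryCatch(validObject(p), error = function(e) conditionMessage(e))
```

Here `check` holds the validity error message, which only surfaces once `validObject()` is called.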

Comment on lines +10 to +13
`S4_register_contains()` is for the opposite direction: use it when an S4 class
needs to extend an S7 class with `methods::setClass(contains = )`. It registers
the S7 class if needed, then creates and returns an S4 shim class whose name
ends in `::S4Slots`. The shim:

Suggested change
- `S4_register_contains()` is for the opposite direction: use it when an S4 class
- needs to extend an S7 class with `methods::setClass(contains = )`. It registers
- the S7 class if needed, then creates and returns an S4 shim class whose name
- ends in `::S4Slots`. The shim:
+ `S4_register_contains()` is for inheritance: use it when an S4 class
+ should be allowed to extend an S7 class with `methods::setClass(contains = )`. It registers
+ the S7 class if needed, then creates and returns an S4 shim class whose name
+ ends in `::S4Slots`. The shim:

?

Comment thread R/method-register.R
  # Accept a bare list of length 1 too, for symmetry with multi-dispatch
  # generics where a list is required (#555).
- if (is.list(signature) && !is.object(signature) && length(signature) == 1) {
+ if (is_multi_arg_signature(signature) && length(signature) == 1) {

I think the length check should go in the helper too?

Comment thread R/method-register.R
}
}

is_multi_arg_signature <- function(signature) {

Nice

Comment thread R/S4.R
#' S4Foo_S4 <- S4_register_contains(S4Foo)
#' methods::setClass("S4Child", contains = S4Foo_S4)
S4_register <- function(class, env = parent.frame()) {
if (is_class(class)) {

I think this would be a bit clearer if we moved all the early returns first?

parentS4 <- methods::setClass("parentS4", slots = c(x = "numeric"))
expect_snapshot(error = TRUE, {
new_class("test", parent = parentS4)
new_class("test", parent = parentS4, package = NULL)

I think this should now move out of the snapshot test to get something more strict.

})

it("can convert_up() an S4-derived S7 object to an S4 object ", {
on.exit(S4_remove_classes(c("ParentS4", "ChildS7")))

Might be worth considering extending local_S4_class() to handle more cases so we could skip the explicit cleanup steps (which are easy to forget)?

@t-kalinowski t-kalinowski left a comment

Member

@lawremi I took a cursory look and didn’t see any obvious issues. Could you please tag me again once you’ve had a chance to resolve Hadley’s comments? I’ll take a more thorough look then. That way I can avoid duplicating effort or leaving comments that might become stale.

Development

Successfully merging this pull request may close these issues.

Mixing S4 and S7 inheritance hierarchies

3 participants