Support S7 extension of S4 classes and vice-versa #659
87b33e2 to 5e5d36d
5e5d36d to 3650492
…cing to an S3 class, and use methods::extends() to capture the entire class vector, not just the first class
…@() behavior, which needs to change in base R.
…S4_register() can now communicate any S7 class structure through the S4class= argument.
…t initialize() except uses the S7 property setting path.
… parent object construction in constructor.
…ot really slots, they're properties; drop the initialize() method for the same reason, and instead use setValidity() to ensure S4 code properly checks for validity. This also catches the odd case where new() is used on an old class, like with new(class(x)), creating an object that should probably not exist.
… the methods package, will crash and burn on class vectors of length > 1
…m S7), we need to use R_do_slot_assign so that a NULL value does not delete the S4 slot.
… stored in the .S3Class slot.
…S4 object is just that, even if it inherits from an S7 class and thus carries an S7_class slot.
…ility with S4; arguably a generally better approach since it avoids adding and deleting attributes based on whether the value is NULL or not.
…lidate() as well as validObject()
…ults. We evaluate these if they are language objects, so S7 classes extending or being extended by S4 classes need to ensure that evaluation can happen at build time. Also mark it VIRTUAL because it should never be constructed directly.
… an S4 instance of that class (old classes are virtual normally but not in this case since we pass an S4Class prototype)
…presentation along the old class chain
…ic is internal and an element of the signature has an S4 ancestor. This is needed because internal generics will favor S4 methods on S4 objects, so there is potential for inheriting overrides.
…ary S4 classes instead of old classes, because an old class implies an S3 instance of the object can exist, i.e., there can be an S3Part().
… it to get stripped during S4 upcasting, breaking the S7 object
…in principle, there should only be one ::S4Slots shim for each S4 derivative of an S7 class
3650492 to eb1092b
hadley
left a comment
I assume you know infinitely more about S4 than me 😄 so I focussed on docs and style.
Also need an update to NEWS.md?
| @@ -0,0 +1,63 @@ | |||
| `new_class()` can use an S4 class as its parent. The S4 class can be supplied | |||
I think this would probably be better off in vignettes/compatibility.Rmd. We're not super consistent on where we put these docs, but I think longer form stuff is just easier to read in vignettes, and then it's next to the S3 equivalent.
| methods::is(Child(x = 1, y = "a"), "Parent") | ||
| ``` | ||
| Things that work reasonably: |
| Things that work reasonably: | |
| Things that work reasonably well: |
| old-class object, not as an S4-bit object, so `isS4(child)` is `FALSE`. | ||
| This avoids advertising full S4 object invariants that the object does not | ||
| satisfy. | ||
| * The S3 class vector is not scalar. S4 code that needs one primary class name |
How common is this use of scalar class()?
| `S7_object` through the old-class graph while lacking the `S7_class` slot or | ||
| attribute that makes it an S7 object. Code that updates an existing object | ||
| should prefer `methods::initialize(x, ...)`. | ||
| * S4 methods that call `methods::slot<-()` can bypass S7 property setters and |
This is consistent with the design of S4 right, i.e. where you are generally responsible for validation after directly modifying a slot?
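For reference, the plain S4 behavior this question refers to can be sketched as follows. The `Range` class is hypothetical and exists only for this illustration; it is not part of the PR. Direct slot assignment checks the class of the new value against the slot's declared class, but the validity method only runs when validation is requested explicitly.

```r
library(methods)

# Hypothetical class, defined only for this illustration.
setClass(
  "Range",
  slots = c(start = "numeric", end = "numeric"),
  validity = function(object) {
    if (length(object@start) != 1L || length(object@end) != 1L) {
      return("start and end must be length 1")
    }
    if (object@end < object@start) "end must be >= start" else TRUE
  }
)

# new() with slot values runs the validity method.
r <- new("Range", start = 1, end = 5)

# Direct slot assignment checks only that the value is numeric;
# the validity method is not run here.
r@end <- 0

# The invalid state is detected only when validation is requested.
res <- tryCatch(validObject(r), error = function(e) conditionMessage(e))
is_invalid <- is.character(res)
```

After the assignment, `r` is in an invalid state until the caller runs `validObject(r)`, which is the responsibility the comment describes.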
| `S4_register_contains()` is for the opposite direction: use it when an S4 class | ||
| needs to extend an S7 class with `methods::setClass(contains = )`. It registers | ||
| the S7 class if needed, then creates and returns an S4 shim class whose name | ||
| ends in `::S4Slots`. The shim: |
| `S4_register_contains()` is for the opposite direction: use it when an S4 class | |
| needs to extend an S7 class with `methods::setClass(contains = )`. It registers | |
| the S7 class if needed, then creates and returns an S4 shim class whose name | |
| ends in `::S4Slots`. The shim: | |
| `S4_register_contains()` is for inheritance: use it when an S4 class | |
| should be allowed to extend an S7 class with `methods::setClass(contains = )`. It registers | |
| the S7 class if needed, then creates and returns an S4 shim class whose name | |
| ends in `::S4Slots`. The shim: |
?
| # Accept a bare list of length 1 too, for symmetry with multi-dispatch | ||
| # generics where a list is required (#555). | ||
| if (is.list(signature) && !is.object(signature) && length(signature) == 1) { | ||
| if (is_multi_arg_signature(signature) && length(signature) == 1) { |
I think the length check should go in the helper too?
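One possible shape for this suggestion is sketched below. The helper name `is_bare_list_signature()` and its `n` argument are assumptions made for illustration; the body of the PR's `is_multi_arg_signature()` is not shown here, so this is not its actual definition.

```r
# Hypothetical helper: name, arguments, and body are assumptions.
# A "bare list" is a list without a class attribute. When `n` is
# supplied, the list must also have exactly `n` elements.
is_bare_list_signature <- function(signature, n = NULL) {
  is_bare_list <- is.list(signature) && !is.object(signature)
  if (is.null(n)) {
    is_bare_list
  } else {
    is_bare_list && length(signature) == n
  }
}

# The call site then carries a single condition:
# if (is_bare_list_signature(signature, n = 1)) { ... }
```

Folding the length check into the helper keeps the call site to one predicate, at the cost of an extra argument on the helper.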
| } | ||
| } | ||
| is_multi_arg_signature <- function(signature) { |
| #' S4Foo_S4 <- S4_register_contains(S4Foo) | ||
| #' methods::setClass("S4Child", contains = S4Foo_S4) | ||
| S4_register <- function(class, env = parent.frame()) { | ||
| if (is_class(class)) { |
I think this would be a bit clearer if we moved all the early returns first?
| parentS4 <- methods::setClass("parentS4", slots = c(x = "numeric")) | ||
| expect_snapshot(error = TRUE, { | ||
| new_class("test", parent = parentS4) | ||
| new_class("test", parent = parentS4, package = NULL) |
I think this should now move out of the snapshot test to get something more strict.
| }) | ||
| it("can convert_up() an S4-derived S7 object to an S4 object ", { | ||
| on.exit(S4_remove_classes(c("ParentS4", "ChildS7"))) |
Might be worth considering extending local_S4_class() to handle more cases so we could skip the explicit cleanup steps (which are easy to forget)?
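A minimal sketch of the idea, using only base R and the methods package. The helper `with_S4_cleanup()` is hypothetical and is not S7's `local_S4_class()`, whose signature is not shown here. It takes the `with_*()` form, so the cleanup is tied to the helper's own exit; a `local_*()` variant would instead register the same cleanup in the caller's frame.

```r
# Hypothetical helper: name and signature are assumptions.
# Evaluates `code`, then removes the named S4 classes even if `code` fails.
with_S4_cleanup <- function(classes, code, where = topenv(parent.frame())) {
  force(classes)
  on.exit(
    for (cls in classes) {
      # Only remove classes that were actually defined.
      if (methods::isClass(cls, where = where)) {
        methods::removeClass(cls, where = where)
      }
    },
    add = TRUE
  )
  force(code)
}

# Usage: the class exists inside the block and is gone afterwards.
result <- with_S4_cleanup("TmpParentS4", {
  methods::setClass("TmpParentS4", slots = c(x = "numeric"))
  methods::isClass("TmpParentS4")
})
```

Because the cleanup is registered before `code` is evaluated, a failing expectation inside the block does not leave the class behind.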
t-kalinowski
left a comment
@lawremi I took a cursory look and didn’t see any obvious issues. Could you please tag me again once you’ve had a chance to resolve Hadley’s comments? I’ll take a more thorough look then. That way I can avoid duplicating effort or leaving comments that might become stale.
Fixes #456.
First iteration on enabling an S7 class to extend an S4 class and vice-versa.
The concrete, fail-fast goal is to enable SummarizedExperiment, a complex, central package in Bioconductor, to be rewritten in S7, based on S4 classes defined lower in the stack, like those from S4Vectors and IRanges, while not breaking compatibility with dependent packages, like SingleCellExperiment, via S4 shims extending the S7 classes. SingleCellExperiment is complex and deeply dependent on SummarizedExperiment, so if it continues working, it is a very good indicator.
What has been achieved:
…`class1()` helper to be explicit when it wants the scalar name of the object's class and to ensure that it is robust to S3-style class vectors. Also improved the `setValidity2()` helper so that validity functions are enclosed in the S4Vectors namespace, enabling S4 class objects to remain identical across lazy loading. This is important since S4 class objects end up in the S7 class object, which is carried by S7 objects, and we want them to pass identity checks across packages.
For S7 classes extending S4 classes, the current approach is to instantiate S7 objects, meaning that the S4 bit is not set. The justification is that there are S4 object invariants that an S7 object cannot satisfy, and vice versa. The big one being the return value of `class()`. Instead we have an S7 shim that does a reasonable job of mimicking an S4 object, including slot access, validity checking, and initialization, but not `class()` compatibility. See the evolving man page insert for more details.
To extend an S7 class using `setClass()`, it is necessary to first register an S4 shim using `S4_register_contains()`. The shim is an ordinary S4 class that derives from the old class representation of the S7 class, which is automatically created by `S4_register()` if needed. It defines the S7 properties, as well as the `S7_class` attribute, as slots. This is necessary so that the automatically defined upcast coercions, used for example in `validObject()`, preserve the attributes S7 requires. Any instances of an S7-derived S4 class are proper S4 objects, with the bit set and a scalar class vector.
Besides adding and changing hundreds of lines of logic around S4 classes, significant changes to support this in S7 include:
* `NULL` slot sentinels. Until now, setting an S7 property to `NULL` removed the attribute, and S4 does not like it when its slots disappear from the attributes.
* …`prop.c` need to be robust to scalar class vectors, using `@.S3Class` instead.
* …`dimnames<-()` generic is an example.
* `@<-.S7_object()` needs to support slots defined on S4 children that are not S7 properties, because S4 classes that extend `S7_object` will dispatch to the method, even when the S4 bit is set.
Resolving the two incompatible `class()` return values, scalar vs. non-scalar, is a larger issue. My proposal is that we export `methods:::.class1()` as `methods::class1()` and encourage all S4 code to use `class1()` to make the expectation of a scalar return value explicit. Practically, we could update the core of Bioconductor to adhere to that policy and thus enable new Bioconductor packages to extend the infrastructure with S7.