Implement vec_unstructure()
#2130
Open
Part of #2129
Foundational method that takes a vector that meets vctrs's newly written up native storage requirements, and strips away all extraneous attributes not natively handled by vctrs methods.
Not being used in place of `vec_data()` quite yet, but that is the goal. We will then soft-deprecate `vec_data()` and start to move away from it in favor of this here in vctrs and in dplyr/tidyr.

It will also be used in `vec_proxy()` on the output of a user's proxy method. This ensures that:

It also seems likely that there is room for `vctrs::vec_unstructure()` and `rlang::unstructure()`

`vctrs::vec_unstructure()` rules:

- `names`
- `names`
- `dim` and `dimnames[[1]]` (note, only row names)
- `names`, `row.names`, and a `class` of `"data.frame"`

`rlang::unstructure()` rules:

- `names`
- `names`
- `names`
- `dim` and `dimnames` (note, all `dimnames`)

Notable differences between the two:

- Only the row names of `dimnames` are kept in `vec_unstructure()`, but all of `dimnames` are kept in `unstructure()`, because base R operations propagate all of `dimnames`
- Data frames are kept as data frames in `vec_unstructure()` but are treated like lists in `unstructure()`
- `NULL` is allowed in `unstructure()` but not `vec_unstructure()`
- `environment` and all other types are allowed in `unstructure()` but not `vec_unstructure()`. Rationale for allowing them in `unstructure()` is that in `structure()` you can pass in an environment and add attributes to it, so there should be a way to remove them as well. But no attributes on an environment are ever "critical", so you just clear them.

For practical usage of `rlang::unstructure()`:

- `+`, where you'd want to strip off the rray class but retain all of `dimnames` before delegating to base R's own `+` method, where `dimnames` are propagated
- `dplyr:::dplyr_new_list()` and `tidyr:::tidyr_new_list()`, where I often pass in a data frame and expect this to unstructure that into a named list with no extra attributes

It is quite fast, so we might be able to get away without `vec_proxy_unsafe()`, not sure yet.

Notably using R's ALTREP wrapper types here to avoid a copy of large objects (since only attributes are being manipulated).
But proxy methods were already quite fast, so maybe not.
I imagine that in something like `vec_c()` we would use `vec_proxy()` on the `out` object we create (because we want to `vec_restore()` at the end), but we'd use `vec_proxy_unsafe()` on all of the elements before copying them over (because we don't care about their extraneous attributes, we just want the C compatible form that we can copy from).
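
To make the two rule sets listed earlier easier to compare, here is a rough base R sketch of how they might differ on an array and on a data frame. Everything in it is an assumption for illustration: `sketch_unstructure()`, `sketch_vec_unstructure()`, the `my_array` and `my_df` classes, and the `extra` attribute are hypothetical names, and the choice to store "only row names" as a `dimnames` list whose other elements are `NULL` is a guess. It is not the vctrs or rlang implementation, and it ignores the ALTREP wrapper optimization entirely.

```r
# Hypothetical base R stand-ins for the two rule sets; NOT the vctrs/rlang code.

# Replace the full attribute set of `x` with the non-NULL entries of `kept`.
set_kept_attributes <- function(x, kept) {
  attributes(x) <- kept[!vapply(kept, is.null, logical(1))]
  x
}

# rlang-style sketch: data frames are treated like lists (only `names` kept),
# arrays keep `dim` and all of `dimnames`.
sketch_unstructure <- function(x) {
  if (is.data.frame(x) || is.null(dim(x))) {
    kept <- list(names = names(x))
  } else {
    kept <- list(dim = dim(x), dimnames = dimnames(x))
  }
  set_kept_attributes(x, kept)
}

# vctrs-style sketch: data frames stay bare data frames, arrays keep `dim`
# and only the first element of `dimnames` (the row names).
sketch_vec_unstructure <- function(x) {
  if (is.data.frame(x)) {
    kept <- list(
      names = names(x),
      row.names = attr(x, "row.names"),
      class = "data.frame"
    )
  } else if (is.null(dim(x))) {
    kept <- list(names = names(x))
  } else {
    dn <- dimnames(x)
    # Assumption: non-row dimnames are blanked out rather than dropped.
    if (!is.null(dn)) dn[-1] <- list(NULL)
    kept <- list(dim = dim(x), dimnames = dn)
  }
  set_kept_attributes(x, kept)
}

# A classed matrix, standing in for something like an rray object.
m <- structure(
  matrix(1:4, nrow = 2, dimnames = list(c("r1", "r2"), c("c1", "c2"))),
  class = "my_array"
)
dimnames(sketch_unstructure(m))      # row and column names both survive
dimnames(sketch_vec_unstructure(m))  # only the row names survive

# A subclassed data frame carrying an extra attribute.
df <- structure(
  data.frame(x = 1:2, y = c("a", "b")),
  class = c("my_df", "data.frame"),
  extra = "meta"
)
class(sketch_unstructure(df))        # a bare named list
class(sketch_vec_unstructure(df))    # a bare data frame
```

In both sketches the subclass and the `extra` attribute are dropped; the only difference is how much of the data frame and `dimnames` structure is considered worth keeping.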