
Different behavior of column remove for separate_*() and separate() #1587

Open
@Edgar-Zamora


Issue

While updating code to move from separate() to separate_wider_*(), I noticed that the final position of the column being separated differs between the two. This is not a breaking change, but the behavior is unexpected and might cause issues if columns are selected by position.

Proposed Change

I think the easiest non-breaking change would be to add a note to the documentation mentioning that the position of the column being separated is not preserved when cols_remove = FALSE.

Example

  • In separate(), the column remains in the same position when remove = FALSE.
  • In separate_wider_*(), using cols_remove = FALSE places the original column after the newly created columns.
library(tidyverse)  # attaches tidyr, which provides separate() and separate_wider_*()

df <- tibble(x = c('a_b', 'c_d', 'e_f', 'g_h', 'i_j'),
             x2 = c('k_l', 'm_n', 'o_p', 'q_r', 'x_y'))

# using separate
df |> 
  separate(x, into = c('xx1', 'xx2'), sep = '_', remove = FALSE)
#> # A tibble: 5 × 4
#>   x     xx1   xx2   x2   
#>   <chr> <chr> <chr> <chr>
#> 1 a_b   a     b     k_l  
#> 2 c_d   c     d     m_n  
#> 3 e_f   e     f     o_p  
#> 4 g_h   g     h     q_r  
#> 5 i_j   i     j     x_y


df |> 
  separate(x2, into = c('xx1', 'xx2'), sep = '_', remove = FALSE)
#> # A tibble: 5 × 4
#>   x     x2    xx1   xx2  
#>   <chr> <chr> <chr> <chr>
#> 1 a_b   k_l   k     l    
#> 2 c_d   m_n   m     n    
#> 3 e_f   o_p   o     p    
#> 4 g_h   q_r   q     r    
#> 5 i_j   x_y   x     y


# delim
df |> 
  separate_wider_delim(x2, names = c('xx1', 'xx2'), delim = '_', cols_remove = FALSE)
#> # A tibble: 5 × 4
#>   x     xx1   xx2   x2   
#>   <chr> <chr> <chr> <chr>
#> 1 a_b   k     l     k_l  
#> 2 c_d   m     n     m_n  
#> 3 e_f   o     p     o_p  
#> 4 g_h   q     r     q_r  
#> 5 i_j   x     y     x_y

# position
df |> 
  separate_wider_position(x2, widths = c(xx1 = 1, 1, xx2 = 1), cols_remove = FALSE)
#> # A tibble: 5 × 4
#>   x     xx1   xx2   x2   
#>   <chr> <chr> <chr> <chr>
#> 1 a_b   k     l     k_l  
#> 2 c_d   m     n     m_n  
#> 3 e_f   o     p     o_p  
#> 4 g_h   q     r     q_r  
#> 5 i_j   x     y     x_y

# regex
df |> 
  separate_wider_regex(x2, patterns = c(xx1 = ".", "_", xx2 = "."), cols_remove = FALSE)
#> # A tibble: 5 × 4
#>   x     xx1   xx2   x2   
#>   <chr> <chr> <chr> <chr>
#> 1 a_b   k     l     k_l  
#> 2 c_d   m     n     m_n  
#> 3 e_f   o     p     o_p  
#> 4 g_h   q     r     q_r  
#> 5 i_j   x     y     x_y

Created on 2025-01-23 with reprex v2.1.0
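
Possible workaround

For anyone who relies on column position in the meantime, one option might be to move the column back with dplyr::relocate() after separating. This is only a sketch: it assumes dplyr is installed, and the two-row df and the object names (wider, restored, legacy) are made up for illustration and are not part of the reprex above.

```r
library(tidyr)
library(dplyr)

# Small illustrative data frame (hypothetical, shorter than the reprex data)
df <- tibble::tibble(
  x  = c("a_b", "c_d"),
  x2 = c("k_l", "m_n")
)

# separate_wider_delim() places the kept source column after the new columns
wider <- df |>
  separate_wider_delim(x2, names = c("xx1", "xx2"), delim = "_", cols_remove = FALSE)

# Move the source column back in front of the first new column
restored <- wider |>
  relocate(x2, .before = xx1)

# Layout produced by separate() with remove = FALSE, for comparison
legacy <- df |>
  separate(x2, into = c("xx1", "xx2"), sep = "_", remove = FALSE)

names(wider)
names(restored)
identical(names(restored), names(legacy))
```

Selecting by name rather than by position avoids the problem entirely, so relocate() is mainly useful when downstream code cannot easily be changed.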
