Open
Labels
bug: an unexpected problem or unintended behavior
Description
When user_na = TRUE is set in read_sav(), the function preserves user-defined missing values as expected. However, combining the resulting data frames in R fails with Error in if (!any(lossy)) { : missing value where TRUE/FALSE needed. I do not know the cause, but it happens with the public use files of the OECD's PIAAC datasets. My guess is that the labels are too long. A similar issue is reported in #427, but I do not see a resolution there. See the code below.
library(haven)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
# Define file paths for the downloaded SPSS files
file1 <- "prgautp1.sav" # Replace with the path if you downloaded locally
file2 <- "prgbelp1.sav" # Replace with the path if you downloaded locally
# Download the SPSS files (if not already done)
download.file("https://webfs.oecd.org/piaac/puf-data/SPSS/prgautp1.sav", file1, mode = "wb")
download.file("https://webfs.oecd.org/piaac/puf-data/SPSS/prgbelp1.sav", file2, mode = "wb")
# Read the SPSS files with user_na = TRUE
df1 <- read_sav(file1, user_na = TRUE)
df2 <- read_sav(file2, user_na = TRUE)
# Check the structure of the data frames to understand the data types and NAs
#str(df1)
#str(df2)
# Attempt to combine using dplyr::bind_rows()
bind_rows(df1, df2)
#> Warning: `..1$D_Q18a_T` and `..2$D_Q18a_T` have conflicting value labels.
#> ℹ Labels for these values will be taken from `..1$D_Q18a_T`.
#> ✖ Values: 6
#> Error in if (!any(lossy)) {: missing value where TRUE/FALSE needed
# Attempt to combine using base rbind()
rbind(df1, df2)
#> Error in if (!any(lossy)) {: missing value where TRUE/FALSE needed
Created on 2024-09-19 with reprex v2.1.1
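To narrow down which columns might be involved, it may help to compare the SPSS metadata of the columns the two data frames share. The following is only a sketch and not a confirmed diagnosis of the error above: compare_spss_metadata() is a hypothetical helper name, and df_a and df_b are invented toy stand-ins for df1 and df2. It assumes that haven stores value labels and user-defined missing values in the "labels", "na_values" and "na_range" attributes of a labelled_spss vector.

```r
library(haven)

# Toy stand-ins for two imported files (invented values, not PIAAC data)
df_a <- data.frame(id = 1:3)
df_a$q <- labelled_spss(c(1, 2, 9),
                        labels = c(Yes = 1, No = 2, Refused = 9),
                        na_values = 9)
df_b <- data.frame(id = 4:6)
df_b$q <- labelled_spss(c(1, 6, 9),
                        labels = c(Yes = 1, Other = 6, Refused = 9),
                        na_values = c(8, 9))

# Hypothetical helper: for every shared column, report whether the class,
# value labels, or user-defined missing-value definitions differ.
compare_spss_metadata <- function(x, y) {
  shared <- intersect(names(x), names(y))
  attr_differs <- function(attr_name) {
    vapply(shared, function(col) {
      !identical(attr(x[[col]], attr_name, exact = TRUE),
                 attr(y[[col]], attr_name, exact = TRUE))
    }, logical(1), USE.NAMES = FALSE)
  }
  data.frame(
    column = shared,
    class_differs = vapply(shared, function(col) {
      !identical(class(x[[col]]), class(y[[col]]))
    }, logical(1), USE.NAMES = FALSE),
    labels_differ = attr_differs("labels"),
    na_values_differ = attr_differs("na_values"),
    na_range_differs = attr_differs("na_range"),
    stringsAsFactors = FALSE
  )
}

metadata_diff <- compare_spss_metadata(df_a, df_b)
# Keep only the columns where at least one piece of metadata differs
metadata_diff[rowSums(metadata_diff[-1]) > 0, ]
```

Running the same helper on df1 and df2 from the reprex would list candidate columns, such as D_Q18a_T from the warning, to try binding one at a time.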
Session info
sessionInfo()
#> R version 4.4.1 (2024-06-14 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 10 x64 (build 19045)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=English_United States.utf8
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> time zone: Europe/Berlin
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] dplyr_1.1.4 haven_2.5.4
#>
#> loaded via a namespace (and not attached):
#> [1] vctrs_0.6.5 cli_3.6.3 knitr_1.47 rlang_1.1.4
#> [5] xfun_0.45 forcats_1.0.0 generics_0.1.3 glue_1.7.0
#> [9] htmltools_0.5.8.1 hms_1.1.3 fansi_1.0.6 rmarkdown_2.27
#> [13] evaluate_0.24.0 tibble_3.2.1 tzdb_0.4.0 fastmap_1.2.0
#> [17] yaml_2.3.8 lifecycle_1.0.4 compiler_4.4.1 fs_1.6.4
#> [21] pkgconfig_2.0.3 rstudioapi_0.16.0 digest_0.6.36 R6_2.5.1
#> [25] readr_2.1.5 tidyselect_1.2.1 reprex_2.1.1 utf8_1.2.4
#> [29] pillar_1.9.0 magrittr_2.0.3 tools_4.4.1 withr_3.0.0