many-to-many tables not supported by dm_examine_constraints #1365

@smgogarten

I have a data model that includes a many-to-many table, which I was able to validate with earlier versions of the dm package. With dm 1.0.0, running dm_examine_constraints on the same model reports an unsatisfied constraint. The function appears to define a "unique key" on each column referenced by a foreign key, even though I have not set any primary keys in the model. The documentation says that the output of dm_examine_constraints includes a column kind with values "PK" or "FK", but the output actually contains additional rows with kind "UK" that were not set by the user.

Here is a reproducible example:

library(dm)

sample <- tibble(
    sample_id = c("sample1", "sample2", "sample3"),
    property = c("A", "B", "C")
)
sample_set <- tibble(
    sample_set_id = c("set1", "set1", "set2", "set2"),
    sample_id = c("sample1", "sample2", "sample2", "sample3")
)
dataset <- tibble(
    dataset_id = c("dataset1", "dataset2"),
    sample_set_id = c("set1", "set2")
)

model <- as_dm(list(sample = sample,
          sample_set = sample_set,
          dataset = dataset))
model <- dm_add_fk(model,
                   table = "sample_set",
                   columns = "sample_id",
                   ref_table = "sample",
                   ref_columns = "sample_id")
model <- dm_add_fk(model,
                   table = "dataset",
                   columns = "sample_set_id",
                   ref_table = "sample_set",
                   ref_columns = "sample_set_id")

(chk <- dm_examine_constraints(model))
! Unsatisfied constraints:
• Table `sample_set`: unique key `sample_set_id`: has duplicate values: set1 (2), set2 (2)

str(chk)
dm_examine_constraints [4 × 6] (S3: dm_examine_constraints/tbl_df/tbl/data.frame)
 $ table    : chr [1:4] "sample_set" "sample" "dataset" "sample_set"
 $ kind     : chr [1:4] "UK" "UK" "FK" "FK"
 $ columns  : keys [1:4] 
  ..$ : chr "sample_set_id"
  ..$ : chr "sample_id"
  ..$ : chr "sample_set_id"
  ..$ : chr "sample_id"
  ..@ ptype: chr(0) 
 $ ref_table: chr [1:4] NA NA "sample_set" "sample"
 $ is_key   : logi [1:4] FALSE TRUE TRUE TRUE
 $ problem  : chr [1:4] "has duplicate values: set1 (2), set2 (2)" "" "" ""
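
For reference, here is a minimal base-R sketch (no dm calls) of why sample_set_id is flagged while the table is still a valid many-to-many link table. The names set_counts and n_duplicate_pairs are only for illustration and are not part of dm.

```r
# Same link table as in the example above, rebuilt with base R only.
sample_set <- data.frame(
  sample_set_id = c("set1", "set1", "set2", "set2"),
  sample_id = c("sample1", "sample2", "sample2", "sample3")
)

# Count rows per sample_set_id: each set appears twice, which matches the
# "has duplicate values: set1 (2), set2 (2)" message reported above.
set_counts <- table(sample_set$sample_set_id)

# Count repeated (sample_set_id, sample_id) pairs: none, so the table is
# unique on the column pair, as expected for a many-to-many link table.
n_duplicate_pairs <- sum(duplicated(sample_set[c("sample_set_id", "sample_id")]))
```

So the column sample_set_id alone is not unique in sample_set, but the pair of columns is.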

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Big Sur 11.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dm_1.0.0

loaded via a namespace (and not attached):
 [1] igraph_1.2.11     magrittr_2.0.2    hms_1.1.1         progress_1.2.2   
 [5] tidyselect_1.1.1  R6_2.5.1          rlang_1.0.4       fastmap_1.1.0    
 [9] fansi_1.0.2       dplyr_1.0.9       tools_4.1.2       utf8_1.2.2       
[13] cli_3.2.0         DBI_1.1.2         ellipsis_0.3.2    assertthat_0.2.1 
[17] tibble_3.1.6      lifecycle_1.0.1   crayon_1.5.0      purrr_0.3.4      
[21] tidyr_1.2.0       vctrs_0.4.1       memoise_2.0.1     glue_1.6.1       
[25] cachem_1.0.6      compiler_4.1.2    pillar_1.7.0      prettyunits_1.1.1
[29] generics_0.1.2    backports_1.4.1   pkgconfig_2.0.3  
