Skip to content

Placeholder syntax in parametrized queries #1397

Open
@nbenn

Description

@nbenn

Currently there is no way to determine what types of placeholders a backend supports for parametrized queries.

SQLite for example is quite flexible, supporting all of "?", "$i", "$name", ":name" (where "i" represents a parameter position and "name" a parameter name), while Postgres (I believe) only supports "?" and "$i" and cannot handle named parameters.

In what way this information could be communicated and whether it is worthwhile to do so is currently unclear to me. I'm hoping that this could be a place to have such a discussion.

Maybe I can start out explaining a bit why this became relevant tom me and then we see where this goes. It might be that we can sort out my specific issues without introducing such infrastructure at all. Of course this might be interesting to others nonetheless.

For context: I'm writing a high level wrapper in R that aligns the DBI interface with functionality exposed by adbcdrivermanager. As example, consider the following query:

library(adbcdrivermanager)

db <- adbc_database_init(adbcsqlite::adbcsqlite(), uri = ":memory:")
con <- adbc_connection_init(db)

write_adbc(datasets::swiss, con, "swiss", temporary = TRUE)

stmt <- adbc_statement_init(con)
adbc_statement_set_sql_query(
  stmt,
  "SELECT * from swiss WHERE Agriculture < $agr AND Education > $edu"
)
adbc_statement_prepare(stmt)

We have a "parameter schema" with parameter names

names(adbc_statement_get_parameter_schema(stmt)$children)
#> [1] "$agr" "$edu"

At the same time, I want my users to be able to pass parameter values by referring to more natural names such as "agr" and "edu". Something like

execute_query <- function(...) {

  params <- data.frame(...)
  adbc_statement_bind_stream(stmt, params)

  res <- nanoarrow::nanoarrow_allocate_array_stream()
  adbc_statement_execute_query(stmt, res)

  as.data.frame(res)
}

nrow(execute_query(agr = 30, edu = 10))
#> [1] 9

As things stand, it seems that for named parameters, names are not considered and only positions are relevant. This makes the current implementation of parameter binding (at least for SQLite) quite lenient.

nrow(execute_query(agr = 30, typo = 10))
#> [1] 9
nrow(execute_query(edu = 10, agr = 30))
#> [1] 1
nrow(execute_query(agr = 30, edu = "typo"))
#> [1] 0

If I want to properly support named parameters, I have to sort out ordering on my end, as well as potentially raising errors for name/type mismatches. To do this, I need to match names in the parameter schema with my "natural" names. And this is where having information on what kinds of placeholder patterns are supported by the back-end becomes relevant to me.

Cc @krlmlr @paleolimbot

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions