Description
Describe the bug
When unparsing a logical plan containing a LeftAnti Join
or a LeftSemi Join
operator with the unparser::dialect::PostgreSqlDialect
, the resulting unparsed SQL contains LEFT ANTI JOIN
or LEFT SEMI JOIN
, respectively, which are not supported by Postgres. The same thing is true for other dialects such as unparser::dialect::MySqlDialect
and unparser::dialect::SqliteDialect
.
This causes issues when attempting to run federated queries in Datafusion over systems such as Postgres (namely, in the datafusion-table-providers repo).
I'm not sure if this is a bug or if the responsibility to generate valid SQL in a specific dialect should be delegated downstream. However, since there appears to be a previous PR with a similar issue (#10625), I leave this one here just in case.
To Reproduce
use datafusion::{
error::Result,
prelude::SessionContext,
sql::unparser::{self, Unparser},
};
#[tokio::main]
async fn main() -> Result<()> {
let ctx = SessionContext::new();
ctx.sql("create table t1 (c int)").await?.collect().await?;
ctx.sql("create table t2 (c int)").await?.collect().await?;
let df = ctx
.sql("select * from t1 where c not in (select c from t2)")
.await?;
let plan = df.into_optimized_plan()?; // optimizing the plan will introduce the LeftAnti Join
println!("{}", plan);
let sql = Unparser::new(&unparser::dialect::PostgreSqlDialect {}).plan_to_sql(&plan)?;
println!("{}", sql);
Ok(())
}
Which returns the (invalid in Postgres) query:
SELECT * FROM "t1" LEFT ANTI JOIN "t2" AS "__correlated_sq_1" ON "t1"."c" = "__correlated_sq_1"."c"
Expected behavior
Generate an equivalent valid query, such as:
SELECT * FROM t1 WHERE c NOT IN (SELECT c FROM t2)
Additional context
datafusion = "45.0.0"