Description
TLDR
For deciding which hints to preserve during our lineage / column schema detection, we need to know if a given sql query is just filtering a subset of rows with all columns from a given table or doing something more advanced (subset of columns, joins, aggregates..). For the simple filtering we can keep all hints, for the more advanced queries we need to remove primary_key and unique and other hints, since we don't really know how those columns should be used.
The main task
We need a small sqlglot based util that can analyze a given query and decide wether this is a simple query or a complex query.
Examples for a table with three columns col1, col2, col3
select * from table
-> simple, only one table is accessed and all columns are there
select col1, col2 from table
-> complex as not all columns are there
select t1.col1 as blah, t1.col2 as blubb, t1.col3 from table as t
-> simple, only one table, all columns are there
select * from t1 join t2 on t1.id =t2.t1_id
-> complex, it's a join
select * from t1 where t1.id > 5 offset 5 limit 10
-> simple, only one table, all columns are there
select *, static_val = "123" from table
-> debatable, but I would say simple, only one table is accessed and all columns are there.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status