-
Notifications
You must be signed in to change notification settings - Fork 3.3k
fix(ingestion/schema-resolver): add enabler for sql parsing for more than 3 parts and implement in dremio #15733
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…than 3 parts and implement in dremio
|
Linear: ING-1303 |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
| database: Optional[str] = None | ||
| db_schema: Optional[str] = None | ||
| table: str | ||
| parts: Optional[tuple] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
databse, db_schema and table are self-explained
parts field would require some docs about the purpose
💡 additionally, we transform some comments into actual code to emphasize the identity constraint
eg
@property
def identity(self) -> tuple:
return (self.database, self.db_schema, self.table) # parts excluded on purpose
|
|
||
| # Store parts as string tuple for connector-specific handling | ||
| parts_tuple = None | ||
| if hasattr(table, "parts") and table.parts: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the hasattr? in which cases sqlglot.exp.Table have/miss this parts field?
similar with the gettattr below, quite confusing
maybe AI agent being excessively/unnecessarily conservative?
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| # Dremio uses 'dremio' as the default database name in all SQL contexts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where this constraint comes from? us, sqlglot, dremio itself?
| table_name_parts = [DREMIO_DATABASE_NAME] + list(table.parts) | ||
| table_name = ".".join(filter(None, table_name_parts)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to have a method in _TableName providing this or part of this logic?
# _TableName
@property
def name_with_parts(self): # I'm sure you can find a better name
return ".".join(filter(None, list(self.parts)))
# here
table_name = DREMIO_DATABASE_NAME + "." + table.name_with_parts
And I also think part of this could also be pushed to the _TableName
table_name_parts = [
DREMIO_DATABASE_NAME,
table.database,
table.db_schema,
table.table,
]
else:
table_name_parts = [
table.database or DREMIO_DATABASE_NAME,
table.db_schema,
table.table,
]
| self.database == other.database | ||
| and self.db_schema == other.db_schema | ||
| and self.table == other.table | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bart also found this
BLOCKERS - Must Fix Before Merge:*
- qualified() method loses the parts field (line 74-78 in _models.py)
- When creating a new _TableName instance with defaults applied, the parts field isn't preserved
- This breaks multi-part table handling - the entire feature falls apart here
- Fix: Add parts=self.parts when returning the new instance
sgomezvillamor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments
Main concern is: given we have new fields in _TableName let's try to keep the logic around that encapsulated in the _TableName class.
| database: Optional[str] = None | ||
| db_schema: Optional[str] = None | ||
| table: str | ||
| parts: Optional[Tuple[str, ...]] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please, add docs around this field
sgomezvillamor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approving to unblock
No description provided.