-
-
Notifications
You must be signed in to change notification settings - Fork 1
Token renewal for PostgreSQL #70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
MiguelElGallo
commented
Sep 27, 2025
- Renew token automatically
- get token using AzureDefaultCredential()
- Improved Token caching
- Remove problematic _ensure_arrow_table method and its dependencies - Restore direct .arrow() calls without additional materialization wrappers - Remove collections.abc.Iterable and typing.Any imports - Revert all DuckDB 1.4.0 compatibility changes that caused issues This rollback restores the duckdb_backend.py file to commit 5910739, which was the last known working state before the problems started.
…scripts - Updated client.sh to enhance usability and provide clearer connection setup. - Removed client_fixed.py, demo.py, and run_client.py as they are no longer needed. - Improved error handling and configuration loading in client.sh. - Added usage examples and connection configuration display in client.sh. - Simplified command structure and removed unnecessary commands from the client.
- Implemented `execute_sql_file` method in `client.py` to execute SQL commands from a specified file. - Updated command-line interface in `client.sh` to include an option for executing SQL files. - Enhanced error handling and logging during SQL file execution. - Added comprehensive test scripts for schema detection and SQL operations. - Created multiple SQL test files to validate various functionalities including DDL, DML, and analytical queries. - Introduced a new demo script to showcase write and read operations in the DuckLake database.
…s; clean up SQL test files and verification scripts for improved clarity and maintainability.
- Implemented `run_insertsmall_simple.sh` for batch inserts with basic monitoring and error detection. - Introduced `run_quick_schema_test.sh` for rapid validation of schema detection. - Enhanced `client_debug.py` with memory monitoring, error handling, and connection management. - Created SQL test scripts for listing and merging Ducklake files (`01_lakefiles.sql` and `02_lakemergfiles.sql`).
…rsion Co-authored-by: MiguelElGallo <[email protected]>
…reate status report Co-authored-by: MiguelElGallo <[email protected]>
…483-8a89-e37b16bf98ed Prepare test suite for release: Fix all failing tests and improve coverage
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements automatic token renewal for PostgreSQL with Azure authentication and includes comprehensive improvements to the token management system.
Key changes:
- Automatic token renewal for PostgreSQL using DefaultAzureCredential()
- Enhanced token caching and refresh mechanisms
- Clean up of obsolete Azure CLI authentication tests
- Comprehensive test suite improvements and fixes
Reviewed Changes
Copilot reviewed 40 out of 42 changed files in this pull request and generated 5 comments.
Show a summary per file
File | Description |
---|---|
src/mpzsql/cli.py | Added AzureCredentialManager class for token renewal and caching |
src/mpzsql/backends/duckdb_backend.py | Fixed RecordBatchReader handling and schema detection |
tests/test_duckdb_backend_improvements.py | Added new tests for backend improvements |
tests/test_duckdb_backend_arrow_utilities.py | Removed obsolete utility tests |
tests/test_cli.py | Removed deprecated Azure CLI authentication tests |
pyproject.toml | Added psutil dependency |
TEST_STATUS_REPORT.md | Comprehensive test improvement documentation |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
def _clean_sql_query(self, query: str) -> str: | ||
"""Clean SQL query by removing comments and normalizing whitespace.""" | ||
lines = [] | ||
for line in query.split("\n"): | ||
# Remove single-line comments (-- comments) | ||
if "--" in line: | ||
line = line[: line.index("--")] | ||
# Strip whitespace and skip empty lines | ||
line = line.strip() | ||
if line: | ||
lines.append(line) |
Copilot
AI
Sep 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment removal logic may incorrectly remove '--' that appears within string literals. For example, the query SELECT 'hello--world'
would be incorrectly truncated to SELECT 'hello
. Consider using a proper SQL parser or implementing string literal detection to avoid this issue.
Copilot uses AI. Check for mistakes.
def __init__(self): | ||
self._credential = None | ||
self._cached_token = None | ||
self._token_expires_at = None | ||
self._postgresql_scope = "https://ossrdbms-aad.database.windows.net/.default" | ||
self._duckdb_connection = None | ||
self._server_config = None |
Copilot
AI
Sep 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AzureCredentialManager class doesn't implement thread safety. If accessed from multiple threads simultaneously, race conditions could occur when updating cached tokens, connection objects, or configuration. Consider adding locking mechanisms or documenting that this class should only be used from a single thread.
Copilot uses AI. Check for mistakes.
Raises: | ||
Exception: If token acquisition fails |
Copilot
AI
Sep 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method raises a generic 'Exception' which doesn't provide specific error information for callers. Consider creating specific exception types or at least documenting the specific exceptions that can be raised (e.g., CredentialUnavailableError, ClientAuthenticationError) to help callers handle different error scenarios appropriately.
Raises: | |
Exception: If token acquisition fails | |
Copilot uses AI. Check for mistakes.
"SELECT * FROM duckdb_prepared_statements()" | ||
).fetchall() | ||
|
||
except Exception as param_error: | ||
duckdb_log.error( | ||
f"Query-based schema detection failed: {param_error}" | ||
) | ||
# Try the PREPARE approach as fallback | ||
try: | ||
# Replace parameters with NULL for PREPARE | ||
schema_query = query | ||
our_stmt = None | ||
for row in result: | ||
if row[0] == stmt_name: | ||
our_stmt = row | ||
break |
Copilot
AI
Sep 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Querying all prepared statements with SELECT *
when only looking for a specific statement by name is inefficient. Consider filtering the query with a WHERE clause: SELECT * FROM duckdb_prepared_statements() WHERE name = ?
to reduce unnecessary data transfer and processing.
Copilot uses AI. Check for mistakes.
PORT {port}, | ||
DATABASE '{escape_sql_string(database or "postgres")}', | ||
USER '{escape_sql_string(user)}', | ||
PASSWORD '{password}' |
Copilot
AI
Sep 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The password (Azure token) is not escaped when creating the SQL secret, while other parameters are. This inconsistency could lead to SQL injection if the token contains special characters like single quotes. Either escape the password consistently or document why Azure tokens are safe from SQL injection.
PASSWORD '{password}' | |
PASSWORD '{escape_sql_string(password)}' |
Copilot uses AI. Check for mistakes.
Can you add more test to fix this: ❌ Patch coverage is ❌ Patch coverage is 49.00000% with 204 lines in your changes missing coverage. Please review. Files with missing lines Patch % Lines |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 40 out of 42 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (2)
src/mpzsql/cli.py:1
- Consider consolidating the Azure credential imports to avoid confusion. The async version is only used once, so you could import it directly at the point of use or use a more descriptive alias.
"""CLI interface for MPZSQL server using typer.
src/mpzsql/cli.py:1
- The password is inserted directly into the SQL string without proper escaping. While this is for Azure tokens which should be safe, consider using parameterized queries or a proper SQL builder to prevent potential injection issues.
"""CLI interface for MPZSQL server using typer.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
raise SystemExit( | ||
f"FATAL: Azure container verification failed: {e}" | ||
) from e |
Copilot
AI
Sep 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message could be more helpful by including specific troubleshooting steps or configuration details about the Azure container that failed.
Copilot uses AI. Check for mistakes.
try: | ||
result_table = self._ensure_arrow_table( | ||
self.connection.execute(query, params).arrow() | ||
) | ||
result = self.connection.execute(query, params).arrow() | ||
# Convert RecordBatchReader to Table if needed | ||
if isinstance(result, pa.RecordBatchReader): | ||
result_table = result.read_all() | ||
else: | ||
result_table = result |
Copilot
AI
Sep 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This RecordBatchReader to Table conversion pattern is repeated multiple times throughout the file. Consider extracting this into a helper method like _ensure_arrow_table()
to reduce code duplication and improve maintainability.
Copilot uses AI. Check for mistakes.
Can you add more tests to fix this: ❌ Patch coverage is 49.00000% with 204 lines in your changes missing coverage. Please review. Files with missing lines | Patch % | Lines -- | -- | -- src/mpzsql/cli.py | 46.66% | 144 Missing Files with missing lines Patch % Lines |