Skip to content

Conversation

sgomezvillamor
Copy link
Contributor

@sgomezvillamor sgomezvillamor commented Oct 10, 2025

Summary

  • Add optional is_aws_rds configuration flag to explicitly specify environment type
  • Replace unreliable permission-based RDS detection with server name analysis using SELECT @@servername
  • Implement comprehensive logging for detection decisions

Problem

The existing RDS detection mechanism was flawed because it relied on permission checks rather than actual environment detection:

Permission-based, not environment-based: The ability to query msdb.dbo.sysjobs depends on user permissions, not the environment type. A restricted user on on-premises SQL Server would also fail this check.

False positives: On-premises SQL Server with:

  • Non-sysadmin users
  • Restricted database permissions
  • Security policies blocking msdb access

Would all incorrectly appear as "RDS"

False negatives: An RDS user with appropriate permissions might actually succeed in querying system objects

Solution

  1. Optional explicit configuration: Added is_aws_rds boolean flag that takes precedence when provided
  2. Server name detection: Uses SELECT @@servername to check for AWS indicators: amazon, amzn, amaz, ec2, rds.amazonaws.com
  3. Robust error handling: Falls back to assuming non-RDS if detection fails
  4. Comprehensive logging: Info-level logging for final decisions, debug/warning for intermediate steps

🤖 Generated with Claude Code

@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label Oct 10, 2025
Copy link

codecov bot commented Oct 10, 2025

Codecov Report

❌ Patch coverage is 45.23810% with 23 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...n/src/datahub/ingestion/source/sql/mssql/source.py 45.23% 23 Missing ⚠️

❌ Your patch check has failed because the patch coverage (45.23%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

📢 Thoughts on this report? Let us know!

@datahub-cyborg datahub-cyborg bot added the needs-review Label for PRs that need review from a maintainer. label Oct 10, 2025
- Fix failing tests to match server name-based detection
- Add comprehensive test coverage for explicit configuration
- Add parametrized tests for various AWS server name patterns
- Test error handling and edge cases

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ingestion PR or Issue related to the ingestion of metadata needs-review Label for PRs that need review from a maintainer.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant