fix: Load cached DB metadata as DatasourceName and add catalog to schema_list cache key #31948

Merged: 1 commit, Jan 21, 2025


Vitor-Avila
Contributor

SUMMARY

The get_all_table_names_in_schema and get_all_view_names_in_schema methods may return data from the cache (when force isn't set to True). Metadata is serialized before it is cached, so a cache hit currently raises an error: the calling logic tries to access the cached values directly as DatasourceName named tuples, but they are no longer named tuples after deserialization.

This PR changes the logic so that the results of these methods are loaded as DatasourceName instances, which lets cached results be used properly.
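
A minimal sketch of the failure mode and the fix, not the actual Superset code. The DatasourceName class below is a stand-in with assumed field names, and the JSON round trip is an assumption used to imitate whatever serialization the cache backend applies; to_cache and from_cache are hypothetical helpers.

```python
import json
from typing import NamedTuple, Optional


class DatasourceName(NamedTuple):
    # Stand-in for Superset's DatasourceName; field names are assumed.
    table: str
    schema: str
    catalog: Optional[str] = None


def to_cache(names: set[DatasourceName]) -> str:
    # Assumption: the cache stores a JSON-like payload, so the set becomes
    # a list and each named tuple becomes a plain list.
    return json.dumps(list(names))


def from_cache(payload: str) -> set[DatasourceName]:
    # Rebuild the named tuples so callers can use .table and .schema again.
    return {DatasourceName(*row) for row in json.loads(payload)}


original = {DatasourceName("orders", "public"), DatasourceName("users", "public")}
payload = to_cache(original)

raw = json.loads(payload)       # what an unconverted cache hit would hand back
restored = from_cache(payload)  # what callers expect to receive
```

Each element of raw is a plain list, so attribute access such as raw[0].table would raise AttributeError, while restored compares equal to the original set.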

I also added the catalog name to the schema_list cache key.
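
To illustrate why the catalog belongs in the key, here is a small sketch. The function schema_list_cache_key and the key layout are hypothetical, chosen only for illustration; the key format used in Superset may differ.

```python
from typing import Optional


def schema_list_cache_key(database_id: int, catalog: Optional[str]) -> str:
    # Hypothetical key layout: including the catalog keeps schema lists
    # from different catalogs of the same database in separate entries.
    return f"db:{database_id}:catalog:{catalog}:schema_list"


# A plain dict stands in for the cache backend.
cache: dict[str, list[str]] = {}
cache[schema_list_cache_key(1, "sales")] = ["public", "staging"]
cache[schema_list_cache_key(1, "marketing")] = ["campaigns"]
```

Without the catalog in the key, both writes above would target the same entry and the second schema list would overwrite the first.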

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before
An error is displayed when trying to load the cached table list (screenshot not included).

After
The list loads properly.

TESTING INSTRUCTIONS

  1. Create a DB connection.
  2. Set a cache timeout for table metadata.
  3. Load tables from a schema.
  4. Refresh the browser so the table list is read from the cache.
  5. The list should load without an error.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@dosubot dosubot bot added the data:databases (Related to database configurations and connections) and infra:caching (Infra setup and configuration related to caching) labels Jan 21, 2025

@korbit-ai korbit-ai bot left a comment


I've completed my review and didn't find any issues.

Files scanned:

  • superset/commands/database/tables.py
  • superset/models/core.py


@betodealmeida
Member

betodealmeida commented Jan 21, 2025

Ah, I see... the method returns set[DatasourceName], but that gets serialized to a list of lists, so when we have a cache hit we have the wrong type.

Back in 2022 I added a warning to the memoized_func cache decorator:

Note: this decorator should be used only with functions that return primitives, otherwise the deserialization might not work correctly.

But in 2024 I forgot about it, and modified get_all_table_names_in_schema (and its views counterpart) to return a set[DatasourceName]... 🤦🏼

I'll look into modifying memoized_func so that it raises an error when non-primitive types are used, unless the user specifies a flag.
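
One way such a guard could look, as a rough sketch only. This is not Superset's memoized_func: the decorator name memoized, the allow_non_primitive flag, and the dict-backed cache are all hypothetical, and the check runs on the return value at call time rather than on type annotations.

```python
import functools
from typing import Any, Callable

PRIMITIVES = (str, int, float, bool, type(None))


def is_primitive(value: Any) -> bool:
    """Return True if value is built only from JSON-friendly primitives."""
    if isinstance(value, PRIMITIVES):
        return True
    # Exact type checks, so subclasses such as named tuples are rejected.
    if type(value) is list:
        return all(is_primitive(v) for v in value)
    if type(value) is dict:
        return all(type(k) is str and is_primitive(v) for k, v in value.items())
    return False


def memoized(cache: dict, allow_non_primitive: bool = False) -> Callable:
    # Hypothetical decorator: caches results in a dict and refuses values
    # that may not survive serialization, unless the flag is set.
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any) -> Any:
            key = (func.__name__, args)
            if key in cache:
                return cache[key]
            result = func(*args)
            if not allow_non_primitive and not is_primitive(result):
                raise TypeError(
                    f"{func.__name__} returned non-primitive "
                    f"{type(result).__name__}; pass allow_non_primitive=True "
                    "to cache it anyway"
                )
            cache[key] = result
            return result

        return wrapper

    return decorator


store: dict = {}


@memoized(store)
def schema_names(db_id: int) -> list:
    return ["public", "staging"]


@memoized(store)
def table_names(db_id: int) -> set:
    # A set of tuples is not primitive, so the guard rejects it.
    return {("orders", "public")}
```

Calling schema_names(1) caches and returns the list, while table_names(1) raises TypeError because a set of tuples would not come back from a serializing cache with the same type.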

Member

@betodealmeida betodealmeida left a comment


Nice catch!

@Vitor-Avila Vitor-Avila merged commit 43a97f8 into master Jan 21, 2025
43 of 44 checks passed
@Vitor-Avila Vitor-Avila deleted the fix/load-cached-table-metadata branch January 21, 2025 19:36
hainenber pushed a commit to hainenber/superset that referenced this pull request Jan 22, 2025
@michael-s-molina
Member

@Vitor-Avila It would be interesting to open a follow-up to add tests for this change. @sadpandajoe mentioned that any contributor could remove the datasource reference without knowing the impact.

@Vitor-Avila
Contributor Author

@michael-s-molina @sadpandajoe that's a good point. I've added new tests, updated some existing ones, and added a code comment in this follow-up PR: #32054

tmsjordan pushed a commit to tmsdevelopment/superset that referenced this pull request Feb 1, 2025
Labels

  • data:databases: Related to database configurations and connections
  • infra:caching: Infra setup and configuration related to caching
  • size/S
3 participants