Remove Azure resources from gnomad/resources/config.py, gnomad/resources/resource_utils.py, tests/resources/test_resource_utils.py, and docs/resource_sources.rst #792

Merged
ch-kr merged 6 commits into main from kc/remove_azure_resources
Jul 22, 2025

Conversation

ch-kr commented Jul 22, 2025

This PR removes support for Azure Open Datasets-based gnomAD public resources. Access to gnomAD data in Azure Open Datasets will be deprecated in August.

Breaking changes in gnomad/resources/config.py:

  • Removed AZURE_OPEN_DATASETS constant
  • Removed hdinsight from default_resource_sources_by_provider within the get_default_public_resource_source function
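
For readers unfamiliar with the module, the effect of these two removals can be sketched roughly as follows. This is a simplified stand-in, not the actual contents of gnomad/resources/config.py: the enum lists only the member named in this conversation, and `default_source_for_provider` is a hypothetical helper that takes the provider as an argument instead of detecting it. The fallback to Google Cloud Public Datasets for "unknown" and None follows the test parameters quoted later in this thread.

```python
from enum import Enum
from typing import Optional


class GnomadPublicResourceSource(Enum):
    # Only the member named in this PR is shown; the real enum may define others.
    GOOGLE_CLOUD_PUBLIC_DATASETS = "Google Cloud Public Datasets"
    # AZURE_OPEN_DATASETS is no longer defined after this PR.


def default_source_for_provider(
    cloud_spark_provider: Optional[str],
) -> GnomadPublicResourceSource:
    """Hypothetical helper: map a detected Spark provider to a default source."""
    default_resource_sources_by_provider = {
        "dataproc": GnomadPublicResourceSource.GOOGLE_CLOUD_PUBLIC_DATASETS,
        # The "hdinsight" entry has been removed.
    }
    # Anything not in the mapping (including None) falls back to Google.
    return default_resource_sources_by_provider.get(
        cloud_spark_provider,
        GnomadPublicResourceSource.GOOGLE_CLOUD_PUBLIC_DATASETS,
    )


for provider in ("dataproc", "unknown", None):
    print(provider, "->", default_source_for_provider(provider).name)
```

Code that still references `GnomadPublicResourceSource.AZURE_OPEN_DATASETS` would raise an `AttributeError` after upgrading, which is why the removal is listed as a breaking change.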

ch-kr requested a review from a team as a code owner July 22, 2025 13:50
ch-kr requested review from Copilot and removed request for a team July 22, 2025 13:50

Copilot AI left a comment

Pull Request Overview

This PR removes support for Azure Open Datasets-based gnomAD public resources as access to gnomAD data in Azure Open Datasets will be deprecated in August. The changes involve removing Azure-specific constants, configuration mappings, path generation logic, and associated tests.

  • Removed AZURE_OPEN_DATASETS constant and hdinsight provider mapping from configuration
  • Eliminated Azure-specific path generation logic in resource utilities
  • Updated tests to remove Azure-related test cases and parameters

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| gnomad/resources/config.py | Removes Azure Open Datasets enum value and hdinsight provider mapping |
| gnomad/resources/resource_utils.py | Removes Azure path generation logic for wasbs:// URLs |
| tests/resources/test_resource_utils.py | Removes Azure-related test cases and updates test parameters |
| docs/resource_sources.rst | Removes documentation reference to Azure HDInsight cluster usage |

Comments suppressed due to low confidence (1)

tests/resources/test_resource_utils.py:194

  • This test was previously verifying that environment variables override cloud spark provider detection when hdinsight was detected. Now that hdinsight support is removed, the test should verify this behavior with a different provider or the test case should be updated to reflect the new expected behavior more clearly.
                return_value="dataproc",
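
A minimal sketch of what such an override test could look like with "dataproc" as the detected provider. Everything here is a self-contained stand-in rather than the project's real test: the environment variable name, the AWS enum member, and the local `guess_cloud_spark_provider` stub are assumptions made for illustration.

```python
import os
from enum import Enum
from unittest.mock import Mock, patch

ENV_VAR = "GNOMAD_DEFAULT_PUBLIC_RESOURCE_SOURCE"  # assumed variable name


class GnomadPublicResourceSource(Enum):
    GOOGLE_CLOUD_PUBLIC_DATASETS = "Google Cloud Public Datasets"
    # Assumed second member, needed so the override is distinguishable.
    REGISTRY_OF_OPEN_DATA_ON_AWS = "Registry of Open Data on AWS"


def guess_cloud_spark_provider():
    """Stand-in for provider detection; replaced by a mock in the test."""
    return None


def get_default_public_resource_source():
    """Simplified stand-in: the environment variable wins over detection."""
    from_env = os.environ.get(ENV_VAR)
    if from_env:
        return GnomadPublicResourceSource(from_env)
    by_provider = {
        "dataproc": GnomadPublicResourceSource.GOOGLE_CLOUD_PUBLIC_DATASETS,
    }
    return by_provider.get(
        guess_cloud_spark_provider(),
        GnomadPublicResourceSource.GOOGLE_CLOUD_PUBLIC_DATASETS,
    )


def test_env_var_overrides_detected_provider():
    # In a real test module the detector would be patched by its dotted path;
    # patching this module's globals keeps the sketch self-contained.
    detector = Mock(return_value="dataproc")
    with patch.dict(os.environ, {ENV_VAR: "Registry of Open Data on AWS"}), \
            patch.dict(globals(), {"guess_cloud_spark_provider": detector}):
        result = get_default_public_resource_source()
    assert result is GnomadPublicResourceSource.REGISTRY_OF_OPEN_DATA_ON_AWS


test_env_var_overrides_detected_provider()
```

The point of the test is that "dataproc" alone would select Google Cloud Public Datasets, so a different result shows that the environment variable took precedence.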

ch-kr self-assigned this Jul 22, 2025
ch-kr requested a review from mike-w-wilson July 22, 2025 15:26
("dataproc", GnomadPublicResourceSource.GOOGLE_CLOUD_PUBLIC_DATASETS),
("hdinsight", GnomadPublicResourceSource.AZURE_OPEN_DATASETS),
("unknown", GnomadPublicResourceSource.GOOGLE_CLOUD_PUBLIC_DATASETS),
(None, GnomadPublicResourceSource.GOOGLE_CLOUD_PUBLIC_DATASETS),

Just curious why the None is being dropped too since it looks like we would default to google in this case?

Reply from the PR author (ch-kr):

oops, I thought this was something Cursor added -- will add back

ch-kr requested a review from mike-w-wilson July 22, 2025 17:05

mike-w-wilson left a comment

:shipit:

ch-kr merged commit 8d9be56 into main Jul 22, 2025
6 checks passed
ch-kr deleted the kc/remove_azure_resources branch July 22, 2025 17:08
mike-w-wilson self-assigned this Jul 22, 2025

3 participants