Contributor

Copilot AI commented Dec 7, 2025

Migration Plan: Azure Pipelines to GitHub Actions

  • Create .github/workflows directory structure
  • Create GitHub Actions workflow file that replicates Azure Pipelines functionality
    • Configure triggers for PRs on master and feature/* branches
    • Set up Python version matrix (3.10, 3.11, 3.12, 3.13)
    • Configure dependency installation (pip, poetry, spacy models)
    • Configure pytest execution with --runslow flag
    • Add caching for Poetry dependencies to improve performance
    • Add explicit permissions block for security best practices
    • Add disk space optimization to prevent storage failures
  • Validate the workflow configuration
  • Run final security scan
  • Fix incorrect SpacyRecognizer import in test file
  • Remove old azure-pipelines.yml file
  • Migration complete!
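
The plan above could translate into a workflow roughly like the following. This is a minimal sketch, not the file merged in this PR: the file name, job and step names, action versions, cache path, cache key, and the spaCy model name are all assumptions made for illustration. Only the triggers, the Python version matrix, the Poetry/spaCy installation, and the `--runslow` flag come from the checklist.

```yaml
# Hypothetical .github/workflows/ci.yml (file name is an assumption)
name: CI

on:
  pull_request:
    # For pull_request events, this filter matches the PR's base (target) branch
    branches:
      - master
      - "feature/*"

# Explicit least-privilege permissions for the workflow token
permissions:
  contents: read

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10", "3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install Poetry
        run: python -m pip install --upgrade pip poetry

      # Assumes Poetry's default cache location on Linux and a committed poetry.lock
      - name: Cache Poetry dependencies
        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ matrix.python-version }}-${{ hashFiles('**/poetry.lock') }}

      - name: Install dependencies
        run: |
          poetry install
          # Model name is an assumption; the PR only says "spacy models"
          poetry run python -m spacy download en_core_web_sm

      - name: Run tests
        run: poetry run pytest --runslow
```

The version strings are quoted because YAML would otherwise read `3.10` as the number 3.1.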

Storage Optimizations

To prevent "not enough storage on device" failures when installing large NER dependencies (Flair, Stanza, HuggingFace pipelines) across multiple Python versions, the workflow includes:

  • Pre-test cleanup: Removes unused pre-installed tools (.NET SDK, Android SDK, GHC, CodeQL) to free up 10-15GB of disk space
  • Post-test cleanup: Clears pip and poetry caches after each test run to prevent accumulation across matrix jobs
  • Disk monitoring: Tracks available storage with df -h for debugging

These optimizations ensure reliable CI execution despite the heavyweight dependencies required for comprehensive PII detection model testing.
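
The three measures above might look something like this as workflow steps. This is a sketch under stated assumptions: the step names and the directory paths for the pre-installed tools are not taken from the PR and reflect locations commonly found on GitHub-hosted Ubuntu runners, so they may not match the merged workflow or a given runner image.

```yaml
# Hypothetical workflow illustrating the cleanup steps; all names and paths are assumptions
name: CI (storage cleanup sketch)

on:
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Free disk space before tests
        run: |
          df -h                                     # disk usage before cleanup
          sudo rm -rf /usr/share/dotnet             # .NET SDK
          sudo rm -rf /usr/local/lib/android        # Android SDK
          sudo rm -rf /opt/ghc                      # GHC
          sudo rm -rf /opt/hostedtoolcache/CodeQL   # CodeQL
          df -h                                     # disk usage after cleanup

      # ... checkout, dependency installation, and pytest steps go here ...

      - name: Clear pip and Poetry caches after tests
        if: always()                                # run even when tests fail
        run: |
          pip cache purge || true                   # tolerate an empty or disabled cache
          rm -rf ~/.cache/pypoetry                  # Poetry's default cache dir on Linux
          df -h
```

One design point to check when combining this with dependency caching: a cache action saves its configured path at the end of the job, so deleting that same directory in an earlier step would leave nothing to save.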

Original prompt

I'd like to migrate the Azure Pipelines CI pipeline to a GitHub Action. It should run the same process, only on GitHub.



Copilot AI changed the title from "[WIP] Migrate CI pipeline from Azure Pipelines to GitHub Actions" to "Migrate CI pipeline from Azure Pipelines to GitHub Actions" on Dec 7, 2025
Copilot AI requested a review from omri374 December 7, 2025 20:30
Collaborator

omri374 commented Dec 7, 2025

@copilot fix the wrong import for spacy_recognizer and remove the old azure pipelines yaml

Contributor Author

Copilot AI commented Dec 7, 2025

Fixed in commit 85938eb:

  • Corrected SpacyRecognizer import from presidio_analyzer.predefined_recognizers.spacy_recognizer to presidio_analyzer.predefined_recognizers
  • Removed azure-pipelines.yml

@omri374 omri374 marked this pull request as ready for review December 8, 2025 05:08
Collaborator

omri374 commented Dec 8, 2025

@copilot the action sporadically fails due to not enough storage on device. Is there a way to reduce the storage?

Contributor Author

Copilot AI commented Dec 8, 2025

Fixed in commit 09ed59e. Added disk space optimizations:

  1. Pre-cleanup: Removes unused tools before tests (.NET, Android SDK, GHC, CodeQL) - frees ~10-15GB
  2. Post-cleanup: Clears pip and poetry caches after tests to free space for subsequent matrix jobs
  3. Disk monitoring: Added df -h commands to track available space

These changes should prevent the sporadic "not enough storage" failures, especially with the large NER dependencies (Flair, Stanza) across 4 Python versions.

Copilot AI requested a review from omri374 December 8, 2025 20:39
@negruber1 negruber1 self-requested a review December 9, 2025 17:01
@omri374 omri374 merged commit cfbabf3 into master Dec 10, 2025
7 checks passed
@omri374 omri374 deleted the copilot/migrate-azure-pipelines-to-gh branch December 10, 2025 10:53

3 participants