@chinyeungli
Contributor

This PR is based on #796.

Because #796 does not include the latest changes from main and would take significant effort to merge, this branch starts from the current main and reapplies the updates from #796, along with additional enhancements and refactoring.

Summary

  • Updated the mine_maven.py pipeline and maven_crawler.py
  • Added Maven Root/Base URLs
  • Updated the regular expressions used to collect links and artifact timestamps in maven_crawler.py
  • Updated the parameter needed for get_classifier_from_artifact_url()
  • Added optional fields for users to choose which repos to mine
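To illustrate the kind of regular expression the summary refers to, here is a minimal sketch of pulling links and timestamps out of a Maven-style HTML directory listing. It is not the code in maven_crawler.py: the pattern, the `parse_listing` helper, and the sample listing layout (an anchor followed by a `YYYY-MM-DD HH:MM` timestamp and a size or `-`) are all assumptions made for illustration.

```python
import re
from datetime import datetime

# Hypothetical pattern (not taken from maven_crawler.py): one listing row is
# an anchor, then a "YYYY-MM-DD HH:MM" timestamp, then a size or "-".
LISTING_ROW = re.compile(
    r'<a href="(?P<href>[^"]+)"[^>]*>[^<]*</a>\s+'
    r'(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2})\s+'
    r'(?P<size>\d+|-)'
)


def parse_listing(html):
    """Return (href, timestamp) pairs for every dated row in a listing page."""
    entries = []
    for match in LISTING_ROW.finditer(html):
        href = match.group("href")
        # Skip a parent-directory link if it happens to carry a timestamp.
        if href.startswith("../"):
            continue
        timestamp = datetime.strptime(match.group("timestamp"), "%Y-%m-%d %H:%M")
        entries.append((href, timestamp))
    return entries


# Assumed sample input; real index pages may be laid out differently.
SAMPLE = '''<a href="../">../</a>
<a href="1.0/" title="1.0/">1.0/</a>                                    2020-01-02 03:04         -
<a href="maven-metadata.xml" title="maven-metadata.xml">maven-metadata.xml</a>    2021-05-06 07:08       512
'''

for href, timestamp in parse_listing(SAMPLE):
    print(href, timestamp.isoformat())
```

Directory links end with `/`, so a crawler built on this sketch could recurse into those and treat the remaining entries as files.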
[Screenshot: 2025-12-22 164340]

* Use the refactored code from the main branch
* Adopt the changes made earlier
* Add "optional_step" to the pipeline to let the user choose which repos to mine

Signed-off-by: Chin Yeung Li <[email protected]>
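
For readers unfamiliar with the optional-step idea mentioned in the commit message, the sketch below shows one way a pipeline could tag steps with a group and run only the groups a user selects. It is a self-contained stand-in, not the project's actual decorator or pipeline class: the `optional_step` helper defined here, the step functions, and the group names are all hypothetical.

```python
def optional_step(group):
    """Hypothetical stand-in: tag a step function with an optional group name."""
    def decorator(func):
        func.group = group
        return func
    return decorator


def mine_default_repo():
    # Untagged step: always runs in this sketch.
    return "default"


@optional_step("repo_a")
def mine_repo_a():
    # Runs only when the user selects the "repo_a" group.
    return "repo_a"


@optional_step("repo_b")
def mine_repo_b():
    # Runs only when the user selects the "repo_b" group.
    return "repo_b"


def select_steps(steps, selected_groups=()):
    """Keep untagged steps plus tagged steps whose group was selected."""
    return [
        step
        for step in steps
        if getattr(step, "group", None) is None or step.group in selected_groups
    ]


ALL_STEPS = [mine_default_repo, mine_repo_a, mine_repo_b]

for step in select_steps(ALL_STEPS, selected_groups={"repo_b"}):
    print(step.__name__)
```

Tagging steps rather than branching inside one large step keeps each repo's mining logic isolated, which is one plausible reason to expose the choice as optional fields.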