fix outdated homepage urls #1365
|
If this helps, here is a script that can do QA on the PR: if you pipe the contents of the patch file into it (e.g. |
|
Interesting stats: after this PR, Tigerbrew still contains:
|
|
Thank you for this! I'll try to review this soon. |
1d8124a to 209cbc2
|
I found a few of the new URLs that I don't love, doing some spot checks this weekend. Let me merge one more commit today before you spend any time looking at this. Thank you! |
|
Sounds good. Thank you again for all the work on this! |
03fc6ac to d0a5424
|
Okay, thanks for your patience. A handful of SourceForge homepages were outdated but not returning 404s or redirects, and a couple of homepages were redirecting to spam sites on URLs that had expired. I fixed these, and also standardized all SourceForge URLs to https://project-name.sourceforge.net/ wherever possible. Along the way I found a few more packages where the source project may have gone missing, and added those to my list to investigate later. I think this is ready to be looked at now. Let me know if there's anything I can do to make it easier to review, of course. |
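For reviewers who want to spot-check the SourceForge standardization mentioned above, here is a minimal sketch of the rewrite rule. It is not the script used for this PR. The function names and the list of reserved subdomains are assumptions made for illustration, and the sketch does not check whether a project's web space actually exists, which is what the "wherever possible" caveat depends on.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: subdomains that are SourceForge services, not project web spaces.
RESERVED_SUBDOMAINS = {"www", "downloads", "prdownloads"}


def sourceforge_project_name(url: str) -> Optional[str]:
    """Return the project name if the URL looks like a SourceForge homepage."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Form 1: http(s)://project-name.sourceforge.net/
    suffix = ".sourceforge.net"
    if host.endswith(suffix):
        name = host[: -len(suffix)]
        if name and "." not in name and name not in RESERVED_SUBDOMAINS:
            return name
        return None

    # Form 2: http(s)://sourceforge.net/projects/project-name/
    if host in ("sourceforge.net", "www.sourceforge.net"):
        match = re.match(r"^/projects/([^/]+)", parsed.path)
        if match:
            return match.group(1).lower()

    return None


def standardize_sourceforge_url(url: str) -> str:
    """Rewrite to https://project-name.sourceforge.net/, else return unchanged."""
    name = sourceforge_project_name(url)
    return f"https://{name}.sourceforge.net/" if name else url
```

A real pass would still need to request the rewritten URL and keep the original when the web space is missing or empty.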
|
One idea I had: I could split this into two PRs. Most of these updates are simply If that would help, say the word, and I'll go figure out some git stuff. |
|
@mistydemeo there are now 3 pull requests which touch more than a thousand files. Shall we split it letter by letter between us to review? |
|
If you want, I'd be happy to divide these into single-letter PRs. |
Single PR for a change is fine, was just concerned about having to review one large diff which touches a thousand plus files. |
7afb87e to e55709b
|
That makes total sense. Here - I split this PR up into single-letter commits, and I'll go do the same to #1374. Thanks for the suggestion! |
|
Thanks for the change. Will help to review these over the next couple of days. |
|
A, B, C, D, E, S done. |
|
What do you think is the best practice for SourceForge homepage URLs? SourceForge looks to have at least four types of URLs for a project:
Based on whether a project uses its web space, the first two URLs will do
If there isn't a webspace, I'm thinking of an approach where we follow
Taking this slightly further, if the homepage for a formula is already in the form of https://sourceforge.net/projects/qstat/, but a url like https://qstat.sourceforge.net/ exists and returns a
Some stats about what's in the codebase today.
If there's a preference for one approach or another, I could amend this PR to be more standardized than I've been.
|
Some of your replacements are valid, hence I have left those alone. |
|
I'm going through and doing letter-by-letter manual review of these, aiming to proactively fix the kinds of issues you've spotted in earlier letters. I'm about six letters ahead of you now :). Apologies that this makes the patch messy. If you're using the patch to review this and it's annoying to look in two places, let me know and I can try to do some git shenanigans to clean that up. |
...and, done! This should be much easier to review now - I've confirmed every homepage here is an actual homepage for the software in question, or the most current archive.org snapshot of a page if I wasn't able to find an extant one. This leaves around 350 homepage entries in formulas that are invalid in some way, but where there aren't redirects in place to follow. Using the Wayback Machine API I have a process where I can pretty quickly find the latest valid Wayback Machine snapshot for each of these, so I'll go ahead and do that next. I'll commit these letter-by-letter as I fix them. |
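As an illustration of the Wayback Machine lookup mentioned above, here is a minimal sketch using the Internet Archive's public availability endpoint. It is not the process used for this PR. The helper names are hypothetical, and the sketch assumes that a query without a timestamp returns the most recent snapshot as "closest".

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(homepage: str) -> str:
    """Build the availability lookup URL for a homepage."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': homepage})}"


def snapshot_from_response(payload: dict) -> Optional[str]:
    """Pick the snapshot URL out of a decoded availability response.

    Returns None when there is no snapshot, or when the archived copy
    was itself not a 200 response.
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    if closest.get("status") != "200":
        return None
    return closest.get("url")


def latest_snapshot(homepage: str, timeout: float = 10.0) -> Optional[str]:
    """Query the endpoint over the network and return a snapshot URL, if any."""
    with urlopen(availability_query(homepage), timeout=timeout) as response:
        return snapshot_from_response(json.load(response))
```

Keeping the parsing separate from the network call makes it possible to check the logic against a saved response before running it over a few hundred formulas.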
a5afefd to 8aa9aed
|
Okay - with those updates in place, I've now confirmed that every tigerbrew homepage URL that didn't return an http |
|
Well done. |
a94afe0 to 16b634a
|
Reorganized! |
I've been doing some audits of Tigerbrew formulas, and it looks like a number of the URLs are out of date. I suspect that updating URLs will restore some packages that currently don't build to being functional again. I'm trying this in phases.
This first commit updates homepage urls, as these are informational rather than functional. The changes here are:
- http urls to https if an https url was available - this was done with automation.
- 301 (permanent) redirects - this was done with automation.
- 302 (temporary) redirects, 40X errors, errors and otherwise nonfunctional websites, and manually fixed the ones that could be fixed. I either followed links on the existing websites, or used other URLs in the formula (such as the head url) to find the existing project. I didn't update any 302 redirects that looked truly temporary - for example, a lot of sourceforge projects have what seems to be a canonical url that links to the project's specific page, and I left those canonical urls in place.

I realize this is a massive commit - 1,793 files - which makes it beyond unwieldy to manually review. I'm happy to share some of the scripts I've been using to look for homepage urls with issues, which can be used to check the work here. Or to break the commit up into smaller pieces, or whatever might help.
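The split between automated and manual changes in this description can be sketched roughly as follows. This is an illustrative reading of the description, not one of the scripts used here. The Action names and the triage function are hypothetical, and the status code is assumed to come from a request made without following redirects.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    KEEP = "keep"
    UPGRADE_TO_HTTPS = "upgrade to https"
    FOLLOW_REDIRECT = "follow redirect"
    MANUAL_REVIEW = "manual review"


def triage(url: str, status: Optional[int], https_works: bool = False) -> Action:
    """Decide what to do with a homepage URL given its HTTP status.

    status is None when the request failed outright (DNS, TLS, timeout).
    https_works says whether the https variant of an http URL responded.
    """
    if status is None:
        return Action.MANUAL_REVIEW
    if status == 301:
        # Permanent redirect: safe to follow automatically.
        return Action.FOLLOW_REDIRECT
    if status == 200:
        if url.startswith("http://") and https_works:
            return Action.UPGRADE_TO_HTTPS
        return Action.KEEP
    # 302s, 40X errors and anything else get looked at by hand.
    return Action.MANUAL_REVIEW
```

Routing every 302 to manual review matches the note above that some temporary redirects were deliberately left in place.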