Check for 404 redirect URLs #2190
✅ Deploy Preview for substrate-docs ready!
Do I understand correctly that we would only need to run this once? I was wondering why it needs to run as a GitHub Action if it can be run a few times locally to do the final cross-checks.
As I mentioned in Matrix, IMO it would be more correct to use .netlify/_redirects as the source of URLs and docs.substrate.com as the domain, if the goal is to check the redirects, as the description states. Currently it checks the list of hardcoded destinations, which are not in fact redirects. Later, if anyone fixes any of the 404s, there's a high risk of the two lists getting out of sync, or of a mistake, because two sources have to be maintained. If that's something you're aware of and OK with, we can for sure merge as is. cc @RemyLeBerre
We'd have to run it periodically. The goal is to be able to track dead redirects easily. I'd also like to avoid reliance on me, if possible. We still have to decide until when to keep track of it ourselves.
Afaik, there'll be no more changes to the redirects, or this repo overall. Regardless, it's a good call to ask. @kianenigma @RemyLeBerre do you think it's necessary to rely on the
The main concern is that you say redirects, but you're checking against the destination, so you're not actually checking redirects :) The URL you're checking is https://docs.polkadot.com/develop/parachains/customize-parachain and, for example, the very first redirect in the _redirects file is http://docs.substrate.io/build/application-development/
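To make the source-versus-destination distinction concrete, here is a minimal sketch of reading the source side of each rule from a Netlify-style _redirects file. It assumes every rule has the form "from to [status]" separated by whitespace and that comment lines start with '#'. The function name redirect_sources and the base-URL argument are hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash
# Sketch only: print the *source* URL of every rule in a _redirects file.
# Assumed rule format: "<from> <to> [status]", whitespace separated.
# redirect_sources is a hypothetical helper name, not something from the PR.

redirect_sources() {
  local file="$1" base="$2"
  local from to _rest
  # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
  while read -r from to _rest || [ -n "$from" ]; do
    # Skip blank lines.
    [ -z "$from" ] && continue
    case "$from" in
      \#*) continue ;;                                  # comment line
      http://*|https://*) printf '%s\n' "$from" ;;      # already a full URL
      *) printf '%s%s\n' "$base" "$from" ;;             # path: prefix the base URL
    esac
  done < "$file"
}

# Example usage (the base URL is an assumption; use whichever domain serves the redirects):
#   redirect_sources .netlify/_redirects "https://docs.substrate.io"
```

Each printed URL could then be passed to the same curl check the workflow already uses, so that the redirect itself is exercised rather than only its hardcoded destination.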
After a discussion with @mordamax, we decided to use the pipeline to check against the
response=$(curl -L -o /dev/null -s -w "%{http_code}" -m 10 "$destination")

if [ "$response" = "404" ]; then
  echo "::warning::$destination"
This isn't going to fail/report anywhere?
Maybe collect the 404s into an array and exit 0 if it's empty, 1 if it has elements?
Alternatively, send an email/Matrix message to a room?
Otherwise I'm curious how we will remember to check and analyze it periodically.
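For illustration, the "collect the 404s and exit non-zero" idea could look roughly like the sketch below. It reuses the curl flags from the snippet under review. The function names http_status and check_urls are hypothetical, and the 404s are collected in a newline-separated string rather than a bash array only to keep the sketch portable across POSIX shells.

```shell
#!/usr/bin/env bash
# Sketch only: report every 404 as a warning, then fail if any were found.
# http_status and check_urls are hypothetical names, not part of the PR.

# Print the final HTTP status code for a URL, following redirects.
http_status() {
  curl -L -o /dev/null -s -w "%{http_code}" -m 10 "$1"
}

# Check each URL passed as an argument.
# Returns 0 when no URL answered 404, and 1 otherwise.
check_urls() {
  local not_found="" count=0 url
  for url in "$@"; do
    if [ "$(http_status "$url")" = "404" ]; then
      echo "::warning::$url"
      # Collect the failing URL, one per line.
      not_found="${not_found}${url}
"
      count=$((count + 1))
    fi
  done
  if [ "$count" -gt 0 ]; then
    # Summary goes to stderr so stdout only carries the warning annotations.
    printf '%s URL(s) returned 404:\n%s' "$count" "$not_found" >&2
    return 1
  fi
  return 0
}

# Example usage: check_urls "$url_one" "$url_two" || exit 1
```

With a non-zero return the workflow run would show as failed, which is the trade-off raised in this thread: it is harder to overlook, but it marks the repository as broken for a problem that has to be fixed elsewhere.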
> This isn't going to fail/report anywhere?
> Otherwise I'm curious how we will remember to check and analyze it periodically.
@RemyLeBerre and I have a recurring weekly call to go over this together (it's one of the call topics). It's a temporary check, and I expect we'll archive the repository/project before we stop having the call.
> Maybe collect the 404s into an array and exit 0 if it's empty, 1 if it has elements?
> Alternatively, send an email/Matrix message to a room?
I'd rather we try to avoid adding any integrations, since it's going to be decommissioned pretty soon. I think the same goes for the error code/status. I'd prefer classifying it as a warning, since it's not really a redirect error, and will be fixed on the other side (polkadot.com). This is more a source of information, rather than a trigger to do something in this repo.
LGTM 🚀
Create a GitHub Actions Workflow that will check for 404 response codes for the list of redirect URLs.
This will help in tracking the missing URLs, primarily to prevent losing SEO ranking for the corresponding pages.
The script will run as a cron job on Mondays at 2pm UTC, on push, and as a manual action.