github_actions: monthly check for broken hyperlinks #7537
using https://github.com/lycheeverse/lychee Signed-off-by: sspaink <[email protected]>
Picking a random ./docs error in the issue generated for your fork:
[ERROR] file:///home/runner/work/opa/opa/docs/contrib-development#fork-clone-create-a-branch | Cannot find file
This is a valid link once the page has been generated. Maybe lychee (delicious, by the way) isn't fit for checking these kinds of links, and they should be excluded. Alternatively, we could check the generated page, but that would require us to build and run it first (maybe through Netlify).
Another random external link error:
[500] https://ceph.io/ | Network error: Internal Server Error
This site seems to be alive and well now. I don't see this kind of flakiness as a problem, as long as we don't run this task often enough for it to become annoying and we don't need to sift through hundreds of false positives for local links.
.github/workflows/link_checker.yaml
Outdated
repository_dispatch:
workflow_dispatch:
schedule:
  - cron: "0 13 * * 1" # Every Monday at 1PM UTC (9AM EST)
Maybe once a month is enough?
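For reference, a monthly trigger could look like the fragment below. This is only a sketch: the choice of the first day of the month and of 13:00 UTC is an assumption carried over from the weekly schedule above, not something decided in this thread.

```yaml
on:
  repository_dispatch:
  workflow_dispatch:
  schedule:
    # Assumed schedule: 13:00 UTC on the 1st day of every month.
    # Fields are minute, hour, day-of-month, month, day-of-week.
    - cron: "0 13 1 * *"
```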
This is going to be a build-time check when the new site drops, by the way. We also have broken-anchor checking as a non-blocking check, which I plan to address when that's merged. I'm not sure we need the async check, but no harm I guess.
@charlieegan3, do you mean that all the checks done here will be made on a per-PR basis with the new site, or only page-local linking?
Signed-off-by: sspaink <[email protected]>
✅ Deploy Preview for openpolicyagent ready!
Made some changes in the latest commit and tried it out in my fork so you can see the example issue it will create: sspaink#12
Ahh, actually Docusaurus doesn't do this (I thought it did). And yeah, makes sense not to block PRs.
Wow, that's a lot of broken links. Did some unscientific investigation through random clicking, and indeed, all the ones I tried had the reported issue.
👍
resolve: #3249
Check for broken hyperlinks in markdown files using: https://github.com/lycheeverse/lychee
How could I not pick the project with an adorable/creepy smiling lychee as the mascot?
Using this GitHub Action to run lychee:
https://github.com/lycheeverse/lychee-action
I tested this workflow out on my fork: https://github.com/sspaink/opa/blob/main/.github/workflows/link_checker.yaml
Manually triggered it: https://github.com/sspaink/opa/actions/runs/14742343402
Then it created this issue: sspaink#5
This workflow will run on Monday. If any errors are found, it will open a single issue like the one above with a summary. ADOPTERS.md has some links that return 403 but are valid, so I added 403 as an accepted status. Hopefully that won't cause any problems; we could also add exceptions for the URLs themselves.
There are also lots of issues in the docs folder that seem valid, but maybe I am missing something about how the docs work. With the upcoming docs revamp, I'm not sure if these errors should be ignored. @charlieegan3, advice on this would be appreciated 😄
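A minimal sketch of the kind of workflow described above, modeled on the example in the lychee-action README rather than copied from this PR's link_checker.yaml. The action versions, the issue title and labels, the `**/*.md` input glob, and the exact `--accept` list are all assumptions.

```yaml
name: Link Checker

on:
  repository_dispatch:
  workflow_dispatch:
  schedule:
    - cron: "0 13 * * 1" # Every Monday at 13:00 UTC

jobs:
  linkChecker:
    runs-on: ubuntu-latest
    permissions:
      issues: write # needed so the job can open the summary issue
    steps:
      - uses: actions/checkout@v4

      - name: Check links with lychee
        id: lychee
        uses: lycheeverse/lychee-action@v2
        with:
          # Assumed arguments: scan markdown files and treat 403 as valid.
          # Note that --accept replaces lychee's default accepted statuses.
          args: --no-progress --accept 200,403 '**/*.md'
          fail: false # do not fail the job; report through an issue instead

      - name: Create issue from lychee report
        if: steps.lychee.outputs.exit_code != 0
        uses: peter-evans/create-issue-from-file@v5
        with:
          title: Link Checker Report
          content-filepath: ./lychee/out.md
          labels: report, automated issue
```

Setting `fail: false` keeps the scheduled run green and leaves the single generated issue as the only signal, which matches the "open one summary issue" behavior described above.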
You can also run the tool locally:
I originally experimented with https://github.com/UmbrellaDocs/linkspector, but that tool is designed to review incoming PRs. That could also be useful for preventing bad links from being submitted, but lychee seems faster and is already set up to run weekly, as the issue describes.