-
Notifications
You must be signed in to change notification settings - Fork 4k
[ci][docs] fix link checking action by switching from linkchecker to lychee and update some links #7027
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
jameslamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the hard work! This is awesome, I'm glad to see so many broken links fixed and outdated documentation removed.
I totally support switching to lychee. I've been using it in pydistcheck for a little while and really liking it.
Would you consider (can be a follow-up) using https://github.com/lycheeverse/lychee-action instead?
Benefits of this:
- could remove more code from
test-docs.sh - it's very fast (no need for conda, for example)
Here's an example of a setup like that: jameslamb/pydistcheck#312
I'm approving this PR because I'd be happy to see it merged as-is. Everything I've suggested could be follow-up PRs, if you agree with the suggestinos.
.ci/test-docs.sh
Outdated
| # to see all gained links add "--dump" flag | ||
| lychee \ | ||
| "--config=./docs/.lychee.toml" \ | ||
| "--exclude-path=(^|/)docs/.*\.rst" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should also add this to lychee.toml: https://lychee.cli.rs/recipes/excluding-paths/
The more that gets put into that file, the easier it'd be to run this interactively during development.
|
@jameslamb Thanks a lot for reviewing this! Nice to hear that you are already familiar with Would it be OK for you if I push changes for GitHub token and paths exclusion in this PR, and move this to GitHub Action in a follow-up PR? |
Yes absolutely! |
|
@jameslamb I've pushed changes I promised you in #7027 (comment). Could you please check them when have time? Please notice I have to add stackoverflow site to ignore list due to the following errors: Now results look like the following: |
jameslamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great... all of my suggestions have been addressed and I'm happy to see this only takes around 2 minutes to run (build link) compared to nearly an hour for the linkcheck-based job (build link)!!!
Yeah, thanks to GitHub token lychee mechanism we don't need to manually slow down link checking to avoid 429 Rate limit errors. |
All our recent link check jobs are failing: https://github.com/microsoft/LightGBM/actions/workflows/linkchecker.yml. Last successful job was 5 months ago.
The most errors are about 429 Rate Limit GitHub urls, but others are quite random ones. It looks like our current
linkcheckerisn't reliable enough.Initially I thought to use
linkinatorbut then foundlychee. This tool looks quite simple to use but powerful.Latest
linkcheckerresults:New
lycheeresults:I left some inline comments below to help understand some non obvious moments.