Add GitHub action that deploys GitHub Pages with redirects to package repositories #120760

Merged
ericphanson merged 21 commits into JuliaRegistries:master from jkrumbiegel:master
Dec 10, 2024

@jkrumbiegel
Contributor

@jkrumbiegel jkrumbiegel commented Dec 5, 2024

As discussed here https://discourse.julialang.org/t/automatic-package-links/106275 and here https://discourse.julialang.org/t/automatic-package-link-pages-have-poor-layout/123087, we currently seem to be lacking a good way to link to Julia packages when we only know the package name. There's the JuliaHub UI but there's a lot of controversy around linking to JuliaHub by default. The whole docs generation thing was a bit of a mess.

So my proposal is to just add a GitHub Pages site to the General registry repo that contains one HTML page per package. That page contains a redirect header that redirects immediately to the repo that the registry stores for that package. I've tested this workflow on my fork https://github.com/jkrumbiegel/General/ and the deployed site allows linking to package repos with links like

(Edit: changed the format to be more clear about redirection)

https://jkrumbiegel.com/General/packages/redirect_to_repo/Makie
https://jkrumbiegel.com/General/packages/redirect_to_repo/DifferentialEquations
https://jkrumbiegel.com/General/packages/redirect_to_repo/InPartS (this one will not auto-redirect due to the uncommon host)

and so on. Discourse could then change its linkifier logic to use such links. Currently the same format for this repo would be

https://juliaregistries.github.io/General/packages/redirect_to_repo/Makie
https://juliaregistries.github.io/General/packages/redirect_to_repo/DifferentialEquations
https://juliaregistries.github.io/General/packages/redirect_to_repo/InPartS

but if desired a nicer domain could be chosen so this becomes something like

https://juliaregistry.com/Makie
https://juliaregistry.com/DifferentialEquations
https://juliaregistry.com/InPartS

Currently the workflow only has manual trigger logic, but a cron trigger should be added at some reasonable interval. One deploy currently takes about 40 seconds.
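
For illustration, a per-package redirect page of the kind proposed above could be generated roughly as follows. This is a minimal sketch, not the script from this PR: it assumes the "redirect header" is an HTML meta refresh tag, and the function name `make_redirect_page`, the output layout `OUTDIR/NAME/index.html`, and the sample URL are all hypothetical.

```shell
#!/bin/bash

# Hypothetical helper (not the script from this PR): write a static page
# that redirects immediately to a package's repository URL.
# Assumption: the redirect is done with an HTML meta refresh tag.
make_redirect_page() {
    local name="$1" repo_url="$2" outdir="$3"
    mkdir -p "${outdir}/${name}"
    cat > "${outdir}/${name}/index.html" <<EOF
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="refresh" content="0; url=${repo_url}">
<title>Redirecting to ${name}</title>
</head>
<body>
<p>Redirecting to <a href="${repo_url}">${repo_url}</a></p>
</body>
</html>
EOF
}

# Example with a placeholder URL; a real run would take the URL from the registry.
site_dir="$(mktemp -d)"
make_redirect_page "Example" "https://github.com/some-user/Example.jl" "${site_dir}"
cat "${site_dir}/Example/index.html"
```

The visible link in the body is a fallback for browsers that do not follow the meta refresh.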

cc @MasonProtter

@DilumAluthge
Member

DilumAluthge commented Dec 8, 2024

Currently the same format for this repo would be

https://juliaregistries.github.io/General/Makie

Can we make the URL communicate more clearly that a redirect is happening? Maybe something like this instead?

https://juliaregistries.github.io/General/packages/redirect_to_repo/Makie

@DilumAluthge
Member

Julius and I discussed this. Briefly, we'll do an automatic redirect for known hosts such as https://github.com/* and https://gitlab.com/*. For other websites, instead of automatically redirecting, we'll display the URL to the user and ask whether or not they want to continue to it.
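
The known-host rule described here could look roughly like the following. This is only a sketch: the function names are hypothetical, the host list contains just the two hosts named above, and the implementation in the PR may differ. It assumes plain `scheme://host/path` URLs without a port or user part.

```shell
#!/bin/bash

# Hypothetical sketch: decide whether a repo URL may be redirected to
# automatically, based on its host.
url_host() {
    local rest="${1#*://}"        # strip the scheme, e.g. "https://"
    printf '%s\n' "${rest%%/*}"   # keep everything before the first "/"
}

is_known_host() {
    # Exact host match, so look-alike hosts are not treated as known.
    case "$(url_host "$1")" in
        github.com|gitlab.com) return 0 ;;
        *) return 1 ;;
    esac
}

redirect_mode() {
    if is_known_host "$1"; then
        echo "auto"      # redirect immediately
    else
        echo "confirm"   # show the URL and let the user decide
    fi
}

redirect_mode "https://github.com/some-user/Example.jl"       # prints "auto"
redirect_mode "https://git.example.org/some-user/Example.jl"  # prints "confirm"
```

Matching the whole host rather than a prefix means a self-hosted instance such as gitlab.gwdg.de is not mistaken for gitlab.com.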

@jkrumbiegel
Contributor Author

jkrumbiegel commented Dec 8, 2024

@DilumAluthge I've implemented all your review comments. I've added github.com, gitlab.com and codeberg.org as known hosts. The full list of hosts was actually very short:

Dict{String, Int64} with 6 entries:
  "codeberg.org"   => 4
  "git.sr.ht"      => 2
  "github.com"     => 11580
  "gitlab.com"     => 61
  "gitlab.gwdg.de" => 4
  "gitlab.emse.fr" => 1

Even though codeberg.org and gitlab.com have few entries, they seemed more "common" to me than the other ones, so I included them.
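
A tally like the one above can be reproduced with a few lines of shell. This is a sketch under assumptions, not the code used to produce the table: `count_hosts` is a hypothetical helper that reads one repository URL per line, and the sample URLs are placeholders.

```shell
#!/bin/bash

# Hypothetical helper: read repo URLs (one per line) on stdin and print
# "<count> <host>" lines, most frequent host first.
count_hosts() {
    # Splitting "https://host/path" on "/" puts the host in field 3.
    awk -F/ '{ counts[$3]++ } END { for (h in counts) print counts[h], h }' |
        sort -k1,1nr -k2,2
}

printf '%s\n' \
    "https://github.com/user-a/A.jl.git" \
    "https://github.com/user-b/B.jl.git" \
    "https://codeberg.org/user-c/C.jl.git" |
    count_hosts
```

Against a registry checkout, the input could come from the `repo = "..."` lines of the Package.toml files, for example `find . -name Package.toml -exec grep -h '^repo = ' {} + | cut -d'"' -f2 | count_hosts`, assuming each Package.toml carries such a line.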

You can have a look at the updated format at these links on my fork

https://jkrumbiegel.com/General/packages/redirect_to_repo/Makie
https://jkrumbiegel.com/General/packages/redirect_to_repo/DifferentialEquations
https://jkrumbiegel.com/General/packages/redirect_to_repo/InPartS (this one will not auto-redirect due to the uncommon host)

@giordano
Member

giordano commented Dec 8, 2024

Why does this need to be in this repository, since it only needs to read it and runs on a schedule? Can't this live in a different repo?

@jkrumbiegel
Contributor Author

It could live anywhere, that's true. I thought that being associated with the official registry would fit if it's used by default in Discourse. But we can move it somewhere else if we'd rather leave this repo untouched.

@giordano
Member

giordano commented Dec 8, 2024

The thing is that these files would end up being downloaded into everyone's registries, for no reason. There are of course many other GitHub Actions workflows in this repo, but I believe they all need to operate on this repo, not just read it.

A new home for this workflow could be a repo called something like "GeneralRedirects" which would also address the comment above about including "redirects" somewhere in the URL.

@DilumAluthge
Member

That's a good point. There's no need for us to include these HTML files in the registry that everyone downloads.

@DilumAluthge
Member

I've created a separate repo: https://github.com/JuliaRegistries/GeneralRedirects

@ericphanson
Member

ericphanson commented Dec 8, 2024

One disadvantage of a separate repo is:

In a public repository, scheduled workflows are automatically disabled when no repository activity has occurred in 60 days. For information on re-enabling a disabled workflow, see "Disabling and enabling a workflow."

So someone will have to go and reenable that workflow every 2 months. I think if it were here, the activity in General would keep it from being disabled.

@DilumAluthge
Member

Okay, so, crazy idea: the workflow lives in this repo (and thus never needs to be re-enabled), but we push the docs to the separate repo (GeneralRedirects) using e.g. an SSH deploy key.

So we never need to re-enable the workflow, but the actual HTML files never get saved in this repo, because we push them to the GeneralRedirects repo.

@ericphanson
Member

Could the workflow live here but push to https://github.com/JuliaRegistries/GeneralRedirects instead of this repo's GitHub Pages?

@ericphanson
Member

jinx 😄. That seems like a good approach to me.

@jkrumbiegel
Contributor Author

@DilumAluthge will do

giordano previously approved these changes Dec 10, 2024
ericphanson previously approved these changes Dec 10, 2024
Member

@ericphanson ericphanson left a comment

Lgtm; I had some small comments about title/colors but those could always be refined in follow-ups and community feedback, so I’m also ok with merging as-is

@jkrumbiegel jkrumbiegel dismissed stale reviews from ericphanson and giordano via cde5390 December 10, 2024 09:51
@jkrumbiegel
Contributor Author

Ok I think I've addressed all review comments. The thing now has a Julia-purple border instead of the random blue :)

[Image: screenshot of the redirect page with the Julia-purple border]

@ericphanson
Member

Great, let's go ahead and merge this and we can make tweaks if needed in follow-ups. We should keep an eye on how often it runs; I think basically every PR will trigger this so we might want to switch it back to a cron to avoid wasting compute. But I think we can see how it goes first.

@ericphanson ericphanson merged commit c97667a into JuliaRegistries:master Dec 10, 2024
8 checks passed
@jkrumbiegel
Contributor Author

I checked the last couple merges on master and none of those had Package.toml changes. It's mostly versions and compat. So it won't trigger every time at least. And it's pretty quick.

@jkrumbiegel
Contributor Author

Oh and some admin needs to actually enable the github pages build via actions from the repo settings, otherwise the workflow will fail.

@ericphanson
Member

ah, yeah: https://github.com/JuliaRegistries/General/actions/runs/12254152092

@DilumAluthge can you enable github pages when you get a chance?

and @jkrumbiegel can you make a PR to move the redirect script to the .ci folder? #120760 (comment)

@fredrikekre
Member

fredrikekre commented Dec 10, 2024

I enabled it but the rerun failed because now there are two artifacts? One for the original run and one for the rerun...

Edit: Started a new run which seems to have passed.

@fredrikekre
Member

Perhaps it would be neat to have some intermediate pages so that, for example, https://juliaregistries.github.io doesn't 404?

@MasonProtter
Contributor

It'd also be good if there was a fallback page for e.g. https://juliaregistries.github.io/General/packages/redirect_to_repo/xyz that tells users something like

xyz.jl is not a registered package in the Julia General registry. Would you like to search GitHub, GitLab, Google, or DuckDuckGo?

@tecosaur
Contributor

Just a quick question, are dedicated redirect pages needed instead of directly serving 302 redirect responses?

@jkrumbiegel
Contributor Author

I don't think you can do that without setting up your own server. This was explicitly a low tech solution where that's not necessary :)

@tecosaur
Contributor

Righteo, I held out a small hope that there would be some way to define a static routing/redirect file with GH pages 🙂

@giordano
Member

giordano commented Dec 10, 2024

I think basically every PR will trigger this so we might want to switch it back to a cron to avoid wasting compute.

Over the last 10k commits to this repository, only about 7% (711, to be more precise) touched a Package.toml file, which doesn't look like a lot to me. That is more frequent than running a scheduled job every 6 hours (the last 10k commits go back to 2024-07-07, which was 156 days ago; 4 jobs a day would be 624 in total), but more accurate.
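
The comparison can be checked with a little shell arithmetic. The sketch below uses only the figures quoted above (711 matching commits out of 10,000 over 156 days, against 4 scheduled runs per day) and assumes that every matching commit would trigger exactly one run.

```shell
#!/bin/bash

commits=10000      # commits inspected
touching=711       # commits that touched a Package.toml file
days=156           # time span covered by those commits
runs_per_day=4     # a scheduled job every 6 hours

# Integer arithmetic only, so the percentage is scaled by 100 to keep two decimals.
pct_x100=$(( touching * 10000 / commits ))
printf 'share of commits: %d.%02d%%\n' $(( pct_x100 / 100 )) $(( pct_x100 % 100 ))

scheduled=$(( days * runs_per_day ))
echo "scheduled runs over the same period: ${scheduled}"
echo "extra runs when triggering on changes: $(( touching - scheduled ))"
```

This prints a share of 7.11%, 624 scheduled runs, and 87 extra runs for the change-triggered variant.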

Script used to count the number of commits
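#!/bin/bash

# Path to a local git clone of the General registry.
REGISTRY_DIR="${HOME}/.julia/registries/General"
# Hashes of the last 10,000 commits.
COMMITS=$(git -C "${REGISTRY_DIR}" log --oneline --format=%H -10000)

i=0
for commit in ${COMMITS}; do
    # List the files changed by this commit (ignoring deletions) and look for a Package.toml.
    if [[ $(git -C "${REGISTRY_DIR}" diff-tree --diff-filter=d --name-only -r "${commit}^" "${commit}") == *Package.toml* ]]; then
        ((i+=1))
    fi
done
echo "${i} commits touched a Package.toml file"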

This took about a minute to run on my laptop, hold tight.

Edit:

Parallelised Julia script to do the same:

using Base.Threads: nthreads
using OhMyThreads: tmapreduce

function main()
    # Path to a local git clone of the General registry.
    registry_dir = joinpath(homedir(), ".julia", "registries", "General")
    # Hashes of the last 10,000 commits.
    commits = readlines(`git -C $(registry_dir) log --oneline --format=%H -10000`)

    # Count, in parallel, the commits whose changed files include a Package.toml.
    ncommits = tmapreduce(+, commits; ntasks=nthreads()) do commit
        modified_files = readchomp(`git -C $(registry_dir) diff-tree --diff-filter=d --name-only -r $(commit)^ $(commit)`)
        Int(contains(modified_files, "Package.toml"))
    end
    println("$(ncommits) commits touched a Package.toml file")
end

main()

@DilumAluthge
Member

This is nice. Thank you @jkrumbiegel!

7 participants