Propose modernising hackage-server project #67
This commit introduces a proposal for the modernising hackage-server project by Tweag. The project includes a plan to improve hackage-server scalability and resource use by migrating the data store to a relational database, as well as a zero-downtime migration plan.
### Migration Sequence
For the migration we have 5 distinct phases:
|
This is wonderful. I'm glad someone is taking a stab at this. A few thoughts: I'd like the proposal to go a bit further. Since incremental changes are hard on the current I strongly support the choice to go with Servant. Generating HTML pages is a bit annoying out-of-the-box, but the ability to create a Finally, one crucial detail to get right here, that I think needs to be addressed, is the solution around long-term data migrations once |
|
This is direly needed. But I found the section about flora a bit handwavy... why exactly can this not be used to build a modern hackage-server? Have you reached out to @Kleidukos? It's possible these projects have largely different scope, but it's also possible this may cause more fragmentation that could have been avoided. I also find it unclear who is going to maintain this project after the proposal is done and implemented. |
|
Flora has none of the APIs or backend necessary to be hackage. It is only a database and frontend. Most of the "juice" in hackage is the backend structures, not just the interface it provides to an existing database. |
|
On the whole I think this proposal is reasonable and addresses a real problem. The proposed architecture, servant and postgres, is a standard and nice one that makes sense. That said, here are some comments.
This is not true. Many, though not all pages are generated using the hstringtemplate library, and the usage could be further pursued.
My main question is I don't understand the migration plan. The existing system will take all API requests, no? So how will the new system have the data to serve? Or is the idea all requests go to the new system and then it also "forwards" them to the old one? Additionally, will we need to run both servers at once on the existing hackage box? If so, will that cause even further resource costs on an already resource-starved box? Or is the plan to have a second box as well? (Which is fine, except then the filestore will need to be shared across boxes?).
An additional issue regarding just the proposal text (not the plan) is that we do not need horizontal scaling -- mirrors suffice for the most part, and we can build UI mirroring beyond that. What we need is to reduce the in-memory footprint. The motivation for switching datastores (a much needed thing, and thank you so much for looking at it!) is not scalability in the requests-per-second sense. I believe that a well-written hackage server could comfortably be served for quite some time on a much less beefy box than we now have, if it did not use acid-state. The motivation is just that the quantity of resident memory required by the current architecture is too high per each incremental package upload.
Finally, while on the whole I think a clean API-for-API rewrite would be ideal, I do wonder if there's another "middle balance" for now, which is to not swap the whole of the backend at once from acid-state to postgres, but to just swap the most expensive part, which I believe is the packagedb. It seems from skimming the migration document that most of the lines of code that require touching (20% or so in total) are not related to the packagedb, but rather to the user store, etc -- which are much less costly, I believe, to keep in acid-state for the time being.
A partial migration does not make hackage more horizontally scalable, but as I said above, it is not horizontal scalability that is our obstacle -- it is the single-box-cost of keeping too much data in memory. |
|
All that said, if a full rewrite can be done by two engineers in three months as this proposal states, then I think that we should absolutely go for it despite my reservations -- the cost-benefit analysis and my concerns are based on my experience of the very slow development of hackage in the past, and my fear of the scale of a large rewrite. So I would encourage the proposal submitters to really be sure they understand the scope of hackage well enough to give such an estimate (though the inventory of APIs and features indicates they have already thought about this.) If that is a genuine estimate of good engineers with sound timeline judgement, then that very much incentivizes going this path. |
Maybe we should ask directly: is Tweag planning to use AI assistance and if so in what shape or form? I don't see an LLM contribution policy in the hackage-server project, but this is probably useful to clear up anyway. |
|
Thanks for the replies. I'll try to address them:
Any reply here would be a bit philosophical. There can be no reply that will convince everyone, as there is no agreement in the community on what is right or wrong. What we can guarantee is that Tweag will use the best (and safe) practices as of 2026 (and will not use overly experimental approaches). At the very least, we will split the storage and query layers, so that it will be possible to change the implementation without affecting other layers of the server implementation, and we will take care of documentation. We believe that the proposed incremental approach to migration will ensure that the codebase stays modifiable without major rewrites, so there will be no need for another proposal like this one.
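To illustrate what splitting the storage and query layers could look like, here is a minimal Haskell sketch. It is not taken from the proposal or from hackage-server: `PackageStore`, `newInMemoryStore`, `latestVersion`, and the simplified `PackageName` and `Version` types are all hypothetical names chosen for this example. The idea is only that query code is written against a record of functions, so a different backend (for example one backed by a relational database) could be swapped in without touching the callers.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified types (not hackage-server's real ones).
newtype PackageName = PackageName String deriving (Eq, Ord, Show)
newtype Version = Version [Int] deriving (Eq, Ord, Show)

-- Storage layer: a record of functions that hides the backend.
data PackageStore = PackageStore
  { insertVersion  :: PackageName -> Version -> IO ()
  , lookupVersions :: PackageName -> IO [Version]
  }

-- One possible backend: everything kept in memory behind an IORef.
-- A database-backed store would provide the same two functions.
newInMemoryStore :: IO PackageStore
newInMemoryStore = do
  ref <- newIORef Map.empty
  pure PackageStore
    { insertVersion  = \name v -> modifyIORef' ref (Map.insertWith (++) name [v])
    , lookupVersions = \name -> Map.findWithDefault [] name <$> readIORef ref
    }

-- Query layer: depends only on the PackageStore interface.
latestVersion :: PackageStore -> PackageName -> IO (Maybe Version)
latestVersion store name = do
  versions <- lookupVersions store name
  pure (if null versions then Nothing else Just (maximum versions))

main :: IO ()
main = do
  store <- newInMemoryStore
  insertVersion store (PackageName "example-pkg") (Version [1, 0, 0])
  insertVersion store (PackageName "example-pkg") (Version [1, 2, 0])
  latestVersion store (PackageName "example-pkg") >>= print
```

Because `latestVersion` never sees the `IORef`, replacing `newInMemoryStore` with another constructor of `PackageStore` would leave it unchanged, which is the property the layer split is meant to provide.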
These years we prefer to use @hasufell, with regards to flora.pm: yes, we are definitely in contact with @Kleidukos. At this point (May 2026), flora.pm has some features that are not compatible with
We expect that the proper long term strategy is that Haskell Foundation should own
Thanks! We will remove the false statement. In the course of the implementation we will check what the best way forward is: whether to pursue it further, or whether there is a safer or more efficient approach.
We will need to update the document to be more explicit, but long story short, we expect to have a second box for the duration of the migration. The only complex part is sharing access to the data storage during the first step of the migration, but this problem has well-known solutions.
This was part of the migration plan: we first move the package db, and move the usersdb as a separate step. But now that you mention it, I am starting to think that this step would be a great milestone in our work. When we wrote the proposal we had not anticipated that, and saw a benefit to the community only once all the work was done. We look forward to doing the complete rewrite, and the current approach to working with data still sets some limitations. But I think it is worth explicitly mentioning the milestone.
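The packagedb-first ordering can be pictured as a small table of which component uses which backend at each stage. The Haskell sketch below is only an illustration under assumptions: the three `Phase` values are hypothetical and do not reproduce the proposal's actual phases, and `Component`, `Backend`, `backendFor`, and `stillInAcidState` are names invented for this example.

```haskell
-- Hypothetical components and backends, named for this example only.
data Component = PackageDb | UserDb deriving (Eq, Show, Enum, Bounded)
data Backend = AcidState | Postgres deriving (Eq, Show)

-- Hypothetical stages; constructor order gives the migration order.
data Phase = BeforeMigration | PackageDbMilestone | FullMigration
  deriving (Eq, Ord, Show)

-- Which backend serves a component at a given stage.
-- The package db moves first; the user db moves only at the end.
backendFor :: Phase -> Component -> Backend
backendFor phase component = case component of
  PackageDb | phase >= PackageDbMilestone -> Postgres
  UserDb    | phase >= FullMigration      -> Postgres
  _                                       -> AcidState

-- Components that are still held in acid-state at a given stage.
stillInAcidState :: Phase -> [Component]
stillInAcidState phase =
  [c | c <- [minBound .. maxBound], backendFor phase c == AcidState]

main :: IO ()
main = mapM_ report [BeforeMigration, PackageDbMilestone, FullMigration]
  where
    report phase =
      putStrLn (show phase ++ ": " ++ show (stillInAcidState phase))
```

At the intermediate stage only `UserDb` remains in acid-state, which corresponds to the milestone discussed above: the most memory-expensive part has moved while the cheaper stores are untouched.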
With regards to the timing: our initial, very safe assumption after the initial work was 6 months with 3 developers, but this would be too costly a request. With the experience of similar-looking packages and a concrete plan, 3 months with 2 developers is an optimistic but still possible assumption for the iterative migration, even without any AI tools being involved. (Though it's possible that, if we have unknown unknowns, we will be able to deliver only the packagedb-related milestone in that time.) @hasufell and we do not plan to use an agentic approach for any code rewrite, where the rewrite itself is done solely or largely using AI tools. Following actions from us:
I'll add another comment once we complete those actions. |
Rendered document: 0000-modernising-hackage-server.md
Related discussion on Discourse: https://discourse.haskell.org/t/feedback-request-modernising-hackage-server-community-project-proposal/14142