
Propose modernising hackage-server project #67

Open
qnikst wants to merge 1 commit into haskellfoundation:main from tweag:tweag/proposal-modernising-hackage-server


@qnikst qnikst commented May 21, 2026

This commit introduces Tweag's proposal for the modernising hackage-server project. The project includes a plan to improve hackage-server scalability and resource use by migrating the data store to a relational database, as well as a zero-downtime migration plan.

Rendered document: 0000-modernising-hackage-server.md
Related discussion on Discourse: https://discourse.haskell.org/t/feedback-request-modernising-hackage-server-community-project-proposal/14142


### Migration Sequence

For the migration we 5 distinct phases:

noticed a slight typo!

@LaurentRDC
Contributor

This is wonderful. I'm glad someone is taking a stab at this.

A few thoughts:

I'd like the proposal to go a bit further. Since incremental changes are hard on the current hackage-server, what are some of the ways in which hackage-server-v2 will be forward-looking? How do we ensure that, in 10 years, there isn't a similar proposal for hackage-server-v3 because the architecture for hackage-server-v2 is lacking, or is hard to incrementally change?
Today's problem is horizontal scalability, and the proposal addresses that. What could be tomorrow's problem, and how do we ensure that the new design allows it to be solved? For example, the proposal mentions the use of IO () callbacks as being problematic due to an unclear control flow. What's the alternative being proposed here?

I strongly support the choice to go with Servant. Generating HTML pages is a bit annoying out-of-the-box, but the ability to create a hackage-server-api package is unparalleled.

Finally, one crucial detail to get right here, that I think needs to be addressed, is the solution around long-term data migrations once hackage-server-v2 is the source-of-truth. I'm not familiar with acid-state in practice, but I assume that it involves writing migrations in Haskell. SQL migrations can be painful if not managed appropriately.

@hasufell
Contributor

This is direly needed.

But I found the section about flora a bit handwavy... why exactly can this not be used to build a modern hackage-server? Have you reached out to @Kleidukos? It's possible these projects have largely different scope, but it's also possible this may cause more fragmentation that could have been avoided.

I also find it unclear who is going to maintain this project after the proposal is done and implemented.

@gbaz
Collaborator

gbaz commented May 22, 2026

Flora has none of the APIs or backend necessary to be hackage. It is only a database and frontend. Most of the "juice" in hackage is the backend structures, not just the interface it provides to an existing database.

@gbaz
Collaborator

gbaz commented May 22, 2026

On the whole I think this proposal is reasonable and addresses a real problem. The proposed architecture -- servant and postgres -- is a standard and nice one that makes sense. That said, here are some comments.

HTML is generated manually throughout, as opposed to being a structured, templated system. This means it is prohibitively expensive to do any sort of modernizing of the generated documents, despite them conceptually being simple projections of the data.

This is not true. Many, though not all, pages are generated using the hstringtemplate library, and the usage could be further pursued.

My main question is I don't understand the migration plan. The existing system will take all API requests, no? So how will the new system have the data to serve? Or is the idea all requests go to the new system and then it also "forwards" them to the old one? Additionally, will we need to run both servers at once on the existing hackage box? If so, will that cause even further resource costs on an already resource-starved box? Or is the plan to have a second box as well? (Which is fine, except then the filestore will need to be shared across boxes?).

An additional issue regarding just the proposal text (not the plan) is that we do not need horizontal scaling -- mirrors suffice for the most part, and we can build UI mirroring beyond that. What we need is to reduce the in-memory footprint. The motivation for switching datastores (a much needed thing, and thank you so much for looking at it!) is not scalability in the requests-per-second sense. I believe that a well-written hackage server could comfortably be served for quite some time on a much less beefy box than we now have, if it did not use acid-state. The motivation is just that the quantity of resident memory required by the current architecture is too high per each incremental package upload.

Finally, while on the whole I think a clean API-for-API rewrite would be ideal, I do wonder if there's another "middle balance" for now, which is to not swap the whole of the backend at once from acid-state to postgres, but to just swap the most expensive part, which I believe is the packagedb. It seems from skimming the migration document that most of the lines of code that require touching (20% or so in total) are not related to the packagedb, but rather to the user store, etc -- which are much less costly, I believe, to keep in acid-state for the time being.

A partial migration does not make hackage more horizontally scalable, but as I said above, it is not horizontal scalability that is our obstacle -- it is the single-box-cost of keeping too much data in memory.

@gbaz
Collaborator

gbaz commented May 22, 2026

All that said, if a full rewrite can be done by two engineers in three months as this proposal states, then I think that we should absolutely go for it despite my reservations -- the cost-benefit analysis and my concerns are based on my experience of the very slow development of hackage in the past, and my fear of the scale of a large rewrite. So I would encourage the proposal submitters to really be sure they understand the scope of hackage well enough to give such an estimate (though the inventory of APIs and features indicates they have already thought about this.) If that is a genuine estimate of good engineers with sound timeline judgement, then that very much incentivizes going this path.

@hasufell
Contributor

All that said, if a full rewrite can be done by two engineers in three months

Maybe we should ask directly: is Tweag planning to use AI assistance and if so in what shape or form?

I don't see an LLM contribution policy in the hackage-server project, but this is probably useful to clear up anyway.

@qnikst
Author

qnikst commented May 22, 2026

Thanks for replies. I'll try to address them:

@LaurentRDC:

I'd like the proposal to go a bit further. Since incremental changes are hard on the current hackage-server, what are some of the ways in which hackage-server-v2 will be forward-looking? How do we ensure that, in 10 years, there isn't a similar proposal for hackage-server-v3 because the architecture for hackage-server-v2 is lacking, or is hard to incrementally change?

Any reply here would be a bit philosophical. No reply will convince everyone, as there is no agreement in the community on what is right or wrong. What we can guarantee is that Tweag will use the best (and safe) practices as of 2026 (and not use overly experimental approaches). At the very least we will split the storage and query layers, so that it is possible to change the implementation without affecting other layers of the server, and we will take care of documentation. We believe that the proposed incremental approach to migration will ensure that the codebase can be modified without major rewrites, so there will be no need for a similar proposal.
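
To make the idea of a separate storage layer concrete, here is a minimal Haskell sketch. It is not code from the proposal: every name in it (`PackageStore`, `newInMemoryStore`, `latestVersion`) is hypothetical, and the in-memory backend only stands in for what an acid-state or PostgreSQL backend might provide. The point is that handler-level logic is written against the record of functions, so a backend could in principle be swapped without touching it.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical, simplified types; the real hackage-server types are richer.
type PackageName = String
type Version = [Int]

-- | The storage interface that the rest of the server would depend on.
data PackageStore = PackageStore
  { addVersion   :: PackageName -> Version -> IO ()
  , listVersions :: PackageName -> IO [Version]
  }

-- | An in-memory backend (an assumption for illustration only).
-- A database-backed implementation would fill in the same record.
newInMemoryStore :: IO PackageStore
newInMemoryStore = do
  ref <- newIORef ([] :: [(PackageName, Version)])
  pure PackageStore
    { addVersion = \name ver -> modifyIORef' ref ((name, ver) :)
    , listVersions = \name -> do
        entries <- readIORef ref
        pure [v | (n, v) <- entries, n == name]
    }

-- | Logic written only against the interface, not against a backend.
-- Versions are compared lexicographically as lists of numbers.
latestVersion :: PackageStore -> PackageName -> IO (Maybe Version)
latestVersion store name = do
  versions <- listVersions store name
  pure (if null versions then Nothing else Just (maximum versions))
```

Loaded into GHCi, `newInMemoryStore` yields a store that `latestVersion` can query; replacing it with another implementation of `PackageStore` would leave `latestVersion` unchanged.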

I'm not familiar with acid-state in practice, but I assume that it involves writing migrations in Haskell. SQL migrations can be painful if not managed appropriately

These days we prefer rel8 for working with the database (we already have a proof of concept for that) and sqitch for migrations. Both have been used in various Haskell projects, which supports the sustainability of the solution.

@hasufell, with regard to flora.pm: yes, we are definitely in contact with @Kleidukos. At this point (May 2026) flora.pm has some features that are not compatible with hackage (e.g. because of namespace support), and it does not cover background tasks. If we continued with flora.pm while keeping hackage-server as it is, that would require significant work in flora.pm itself, and also updates to the tooling, which would have to support the modern API. With all respect to flora.pm, which I believe is a very important project for the entire Haskell ecosystem, modernising hackage-server looks like the better strategy in terms of efficiency and required investment.

I also find it unclear who is going to maintain this project after the proposal is done and implemented.

We expect that the proper long-term strategy is for the Haskell Foundation to own hackage-server, as it's important that core infrastructure does not depend on a single entity. But Tweag will support code maintenance and address bugs as much as we can.

@gbaz

HTML is generated manually throughout....
This is not true. Many, though not all pages are generated using the hstringtemplate library, and the usage could be further pursued.

Thanks! We will remove the false statement. In the course of the implementation we will check what the best way forward is: whether to pursue it further, or whether there is a safer or more efficient approach.

My main question is I don't understand the migration plan.

We will need to update the document to be more explicit, but long story short: we expect to have a second box for the duration of the migration. The only complex part is sharing access to the data storage during the first step of the migration, but this problem has well-known solutions.

... I do wonder if there's another "middle balance" for now, which is to not swap the whole of the backend at once from acid-state to postgres, but to just swap the most expensive part, which I believe is the packagedb.

This was part of the migration plan: we first move the package db, and move the users db as a separate step. But now that you mention it, I am starting to think that this step would be a great milestone in our work. When we wrote the proposal we did not anticipate that, and saw a benefit to the community only once all the work was done. We look forward to doing the complete rewrite, and the current approach to working with data still sets some limitations. But I think it is worth mentioning the milestone explicitly.
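
As a rough illustration of what such an intermediate milestone means, the following Haskell sketch models which backend each store would use at each point. The names (`Store`, `Backend`, `Milestone`, `backendFor`) are hypothetical and the two-step ordering is only what the comment above describes (package db first, users db later); the actual plan in the proposal has more phases.

```haskell
-- Hypothetical model for illustration; not code from the proposal.

-- | The two data stores mentioned in the discussion.
data Store = PackageDb | UsersDb
  deriving (Eq, Show, Enum, Bounded)

data Backend = AcidState | Postgres
  deriving (Eq, Show)

-- | Migration milestones, in the order they would be reached.
data Milestone = Start | PackageDbMigrated | FullyMigrated
  deriving (Eq, Ord, Show, Enum, Bounded)

-- | Which backend serves a store once a milestone has been reached.
backendFor :: Milestone -> Store -> Backend
backendFor milestone store
  | milestone >= migratedAt store = Postgres
  | otherwise                     = AcidState
  where
    -- The milestone at which each store moves to the relational database.
    migratedAt PackageDb = PackageDbMigrated
    migratedAt UsersDb   = FullyMigrated

-- | Stores that are still held in acid-state at a given milestone.
remainingInAcidState :: Milestone -> [Store]
remainingInAcidState milestone =
  [s | s <- [minBound .. maxBound], backendFor milestone s == AcidState]
```

At the `PackageDbMigrated` milestone only `UsersDb` remains in acid-state, which corresponds to the partial migration @gbaz suggested.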

... estimates ...

With regard to the timing: the initial, very safe assumption after the initial work was 6 months with 3 developers, but this would be too costly a request. With our experience of similar-looking packages and a concrete plan, 3 months with 2 developers is an optimistic but still possible assumption for the iterative migration, even without any AI tools being involved. (Though it's possible that, if we hit unknown unknowns, we will be able to deliver only the packagedb-related milestone in that time.)

@hasufell, we do not plan to use an agentic approach for any code rewrite, where the rewrite itself is done solely or largely using AI tools.


Follow-up actions from us:

  • add information about migrations.
  • remove the statement about HTML generation.
  • add details about the migration steps and requirements.
  • add details on whether it is feasible to move only the packagedb-related parts to a relational database.

I'll add another comment once we complete those actions.


5 participants