Skip to content

Conversation

@lhofhansl
Copy link
Contributor

This is a prototype to de-serialize blocks loading from disk without holding the envlock.
This speeds up map loaded, especially with multiple emerge threads or on a busy server (should lead to better CPU utilization).
(Note that light spread and fluid scanner read from the map and cannot be performed without holding the envlock)

To do

This PR is a Work in Progress.

  • Testing
  • Concept review

How to test

Join any map. Measure load time (and/or lock at "ServerMap: deSer block" in the profiler)

@Desour
Copy link
Member

Desour commented Dec 20, 2025

Generally 👍 for doing mapblock deserialization not always in the main thread.


Unfortunately, deSerializeBlock() is not thread safe, mapblocks have a gamedef ptr and use it:

  • deSerializeBlock() uses it to allocate unknown node ids.
  • And as other threads, including the main thread may allocate new node ids, any access to the name id mapping is thread unsafe!
  • I'm not sure, but unknown items might cause similar issues.
  • The deSerialize_pre22() code also accesses the gamedef in different ways.

These things could probably be fixed either by only doing parts of the deserialization without envlock, or by using a read-write lock.

Edit: Or split the name id mapping into one for the static nodes, and one for the ones dynamically added (i.e. unknown nodes). Lookup would then be: lookup in static one, if not found (unlikely) lock dynamic and lookup in dynamic
(Could also lock all threads every now and then to transfer dynamic into static.)


Understanding getBlockOrStartGen() becomes more and more difficult. We call it more than once and there are so many code paths.

@lhofhansl
Copy link
Contributor Author

lhofhansl commented Dec 20, 2025

deSerializeBlock() is not thread safe

Hmm... Right. Can't think of an elegant way to fix it, though. Even a clone is not good enough, since other threads needs to know a about new node ids; and we can't pre-allocate ids either since we do not know what ids are missing before we deserialize. Only chance is to make it threadsafe and that will probably have a bad performance impact.

Update: Looks like only the IItemDefManager is referred to, and only in ItemStack::deSerialize. Buried pretty deep in the deserialization call stack:
MapBlock > NodeMetadataList > NodeMetadata > Inventory > InventoryList > ItemStack

and also: correctBlockNodeIds called from MapBlock::deSerialize

Meh. No easy fix.

I'll leave this open for more discussion.

Understanding getBlockOrStartGen() becomes more and more difficult.

Agreed. It's time to refactor the whole thing.

@lhofhansl
Copy link
Contributor Author

Closing, unless someone has some bright ideas.

@lhofhansl lhofhansl closed this Dec 21, 2025
@sfan5
Copy link
Member

sfan5 commented Dec 21, 2025

Now that you mention it we might already have a decent amount of unsafety if the server thread allocates an unknown nodes ID while other threads are using the same NodeDefManager 🤔

@Desour
Copy link
Member

Desour commented Dec 21, 2025

Only chance is to make it threadsafe and that will probably have a bad performance impact.

Closing, unless someone has some [...] ideas.

I've edited my comment with a 3rd idea to fix it, idk if you've seen it. :) (Only lock in case of unknown nodes, by using a 2nd data structure.) I think it could work out. Thoughts?

@lhofhansl
Copy link
Contributor Author

lhofhansl commented Dec 23, 2025

@sfan5 All NodeDefManager accessed is guarded by the envlock, no? So hopefully there, currently. is no correctness problem.

@Desour Yeah. That could work. Not sure I get the difference between dynamic and static, which ones do you consider static.

I feel like we are doing threading patchwork. We need an overall strategy for threading.

@lhofhansl lhofhansl reopened this Dec 28, 2025
@lhofhansl lhofhansl changed the title [TRY OUT] Deserialize block from disk without envlock [PoC] Deserialize block from disk without envlock Dec 28, 2025
@lhofhansl lhofhansl marked this pull request as draft December 28, 2025 15:59
@sfan5
Copy link
Member

sfan5 commented Dec 28, 2025

@sfan5 All NodeDefManager accessed is guarded by the envlock, no?

No, not at all. NodeDefManager::get is everywhere.

@lhofhansl
Copy link
Contributor Author

lhofhansl commented Dec 28, 2025

@sfan5 @Desour I reopened this as a proof of concept.
Instead of making IGameDef et al thread safe, this splits the deserialization into two parts.
Decompressing and the first part of deserialization can be done without locks.

Together with #16783 and #16739 this can now load up 115,000 blocks per second, which is close to the maximum sqlite speed on my machine.

Do not merge! :) Just a PoC, misses version checks in the emerge code, etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants