Add Lua API function to retrieve node counts for an area#16948

Open
kromka-chleba wants to merge 4 commits into luanti-org:master from kromka-chleba:node-mappings

@kromka-chleba
Contributor

@kromka-chleba kromka-chleba commented Feb 15, 2026

Goal of the PR

To add a way to get the number of nodes of each type in a given area from Lua by introducing the core.get_node_counts_in_area(pos1, pos2, nodenames) function. This allows modders to know the contents of an area without having to use ABMs/LBMs or manually iterate over the VM array. Together with #16910, this would allow modders to implement custom ABM/LBM-like systems without the limitations of ABMs/LBMs.

Does this relate to a goal in the roadmap?

Probably not.

If you have used an LLM/AI to help with code or assets, you must disclose this.

Copilot helped me implement this.

To do

This PR is Ready

  • Test in the game
  • Read and verify unit tests
  • Fix whitespace errors

How to test

Use the /test_get_node_counts command.

-- Example 1: Count a single node type
-- Use case: Check how much stone is in a mining area
local function example_single_node()
	local minp = {x = 0, y = 0, z = 0}
	local maxp = {x = 10, y = 10, z = 10}
	
	-- Count only stone nodes
	local counts = core.get_node_counts_in_area(minp, maxp, "default:stone")
	
	print("Stone count: " .. (counts["default:stone"] or 0))
end

example_single_node()

-- Example 2: Count multiple node types
-- Use case: Analyze composition of a terrain area
local function example_multiple_nodes()
	local minp = {x = 0, y = -10, z = 0}
	local maxp = {x = 20, y = 10, z = 20}
	
	-- Count several different node types
	local node_list = {
		"default:stone",
		"default:dirt",
		"default:sand",
		"air"
	}
	
	local counts = core.get_node_counts_in_area(minp, maxp, node_list)
	
	-- Display results
	print("Node composition in area:")
	for node_name, count in pairs(counts) do
		print("  " .. node_name .. ": " .. count)
	end
end

example_multiple_nodes()

@kromka-chleba kromka-chleba marked this pull request as draft February 15, 2026 14:28
Member

@Desour Desour left a comment

There is a content ID cache, used by ABMs. As of now, this PR doesn't use it but iterates over all nodes.

Exposing the content ID cache is dangerous in that it forces all future versions of Luanti to keep a cache with the exact semantics of the API.
Hence it needs to be designed in a way that allows false positives, or only returns hints. Note also that right now the cache is not updated on each set_node.
We could also go as far as to manage a content_id -> count map for each block.
==> The api design is the hard part here (and finding consensus).

@kromka-chleba
Contributor Author

kromka-chleba commented Feb 15, 2026

There is a content ID cache, used by ABMs. As of now, this PR doesn't use it but iterates over all nodes.
Exposing the content ID cache is dangerous in that it forces all future versions of Luanti to keep a cache with the exact semantics of the API.
Hence it needs to be designed in a way that allows false positives, or only returns hints. Note also that right now the cache is not updated on each set_node.

I know the cache exists but I'd like to avoid touching that for the very reasons you mentioned.
As is, the function is probably a faster version of manually iterating over the VM array in Lua, but this should be good enough. The idea behind this is that it'd be used for (1) assigning some general labels to mapchunks after scanning them and (2) once-in-a-while maintenance tasks. The modder would maintain their own cache for whatever their use case is. For example, my mod would assign general labels to mapblocks, such as has_trees, so the actual nodes in the mapblock are irrelevant after labeling.
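
A minimal sketch of the labeling idea, assuming the counts arrive as a table keyed by node name (the shape used in the examples in the PR description). The helper name label_from_counts, the has_trees label, and the node names are hypothetical and only serve as illustration.

```lua
-- Hypothetical helper: derive coarse labels for a mapblock from a
-- counts table shaped like {["default:tree"] = 12, ...}.
-- The node names listed here are illustrative assumptions.
local tree_nodes = {"default:tree", "default:pine_tree"}

local function label_from_counts(counts)
	local labels = {}
	local trees = 0
	-- Sum the counts of every node name considered a tree.
	for _, name in ipairs(tree_nodes) do
		trees = trees + (counts[name] or 0)
	end
	if trees > 0 then
		labels.has_trees = true
	end
	return labels
end
```

Once the labels are stored, the counts table can be discarded, which matches the point that the actual nodes are irrelevant after labeling.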

We could also go as far as to manage a content_id -> count map for each block.

Well, I could make it count nodes too, but I don't feel like maintaining a cache for this - I'm probably not competent enough to implement a caching system. Though I could try if you find this necessary for this feature to get merged.

@Desour
Member

Desour commented Feb 15, 2026

Ah, so what you want to add here is basically a new bulk node reading operation. (Like core.find_node_near.)

@kromka-chleba
Contributor Author

kromka-chleba commented Feb 15, 2026

Ah, so what you want to add here is basically a new bulk node reading operation. (Like core.find_node_near.)

Pretty much yes.
The initial idea was to use the cached version which is saved in the database https://github.com/luanti-org/luanti/blob/master/doc/world_format.md
Though I assume that solution wouldn't be as straightforward and that there would be cache coherence problems.

EDIT: So the idea is to simply know what's in a mapblock without fetching any extra position data which would be heavy. Though counting nodes by type sounds useful - this could be used to assign priority for mapblocks (e.g. process mapblock with more tree nodes first).
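
The prioritization mentioned in the edit could look roughly like this. It is only a sketch: the entry layout (a blockpos paired with a counts table keyed by node name) and the helper name sort_by_count are assumptions, not part of the PR.

```lua
-- Hypothetical sketch: order scanned mapblocks so that those with
-- more nodes of a given type are processed first. Each entry pairs
-- a block position with a counts table keyed by node name.
local function sort_by_count(entries, nodename)
	table.sort(entries, function(a, b)
		return (a.counts[nodename] or 0) > (b.counts[nodename] or 0)
	end)
	return entries
end

local queue = sort_by_count({
	{blockpos = {x = 0, y = 0, z = 0}, counts = {["default:tree"] = 2}},
	{blockpos = {x = 1, y = 0, z = 0}, counts = {["default:tree"] = 40}},
	{blockpos = {x = 2, y = 0, z = 0}, counts = {}},
}, "default:tree")
-- queue now starts with the block at x = 1 (40 tree nodes).
```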

@SmallJoker
Member

SmallJoker commented Feb 15, 2026

We could also go as far as to manage a content_id -> count map for each block.

I would prefer that (id + count) over the current PR implementation (id + name) as it's more helpful for VManip (which is often used for mapblock processing). We already have a LUT for node names <--> content ID. No need to have another API that does almost the same thing.

@kromka-chleba
Contributor Author

I would prefer that (id + count) over the current PR implementation (id + name) as it's more helpful for VManip (which is often used for mapblock processing). We already have a LUT for node names <--> content ID. No need to have another API that does almost the same thing.

Yeah, sounds reasonable. I'll do this.

@kromka-chleba kromka-chleba changed the title from "Add Lua API function to retrieve node-id mappings for a mapblock" to "Add Lua API function to retrieve node counts for a mapblock" Feb 15, 2026
@kromka-chleba kromka-chleba requested a review from Desour February 15, 2026 19:08
@kromka-chleba
Contributor Author

kromka-chleba commented Feb 15, 2026

Reworked this into core.get_node_content_counts which sounds more useful than the initial proposal.
Some parts of the generated unit tests are fishy so I'll have to improve them.

@kromka-chleba kromka-chleba marked this pull request as ready for review February 15, 2026 22:39
@kromka-chleba
Contributor Author

I added a test command for devtest.

Co-authored-by: kromka-chleba <65868699+kromka-chleba@users.noreply.github.com>
@sfan5 sfan5 added the Feature ✨ (PRs that add or enhance a feature), Roadmap: Needs approval (the change is not part of the current roadmap and needs to be approved by coredevs beforehand), and @ Script API labels Feb 16, 2026
@sfan5
Member

sfan5 commented Feb 16, 2026

How often do mods want all nodes and not just a specific set of nodes they're interested in (which is already possible with core.find_nodes_in_area)?

Another important API design point to consider is: why are we needlessly introducing a "dependency" on map blocks?
If a mod wants to operate on a 16x32x16 area, it should be the engine's responsibility to figure out the most efficient way to do that.

The initial idea was to use the cached version which is saved in the database https://github.com/luanti-org/luanti/blob/master/doc/world_format.md

This data ceases to exist as soon as the block is deserialized.

@kromka-chleba
Contributor Author

kromka-chleba commented Feb 16, 2026

How often do mods want all nodes and not just a specific set of nodes they're interested in (which is already possible with core.find_nodes_in_area)?

After rethinking we may mimic the behavior of core.find_nodes_in_area here, as with multiple node types this could become heavy.

Another important API design point to consider is: why are we needlessly introducing a "dependency" on map blocks?

This API is intended to be a low-level building block for ABM/LBM-like systems, so I picked mapblock for simplicity and performance. I could redo this to handle an arbitrary area though.

As for why not use core.find_nodes_in_area: the use case of this API would be getting statistics for node counts without the overhead of fetching positions which could get outdated with time. So the purpose of the API is scanning contents of mapblocks from time to time.

Copilot AI and others added 2 commits February 16, 2026 21:31
…ased signature

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kromka-chleba <65868699+kromka-chleba@users.noreply.github.com>
@kromka-chleba kromka-chleba changed the title from "Add Lua API function to retrieve node counts for a mapblock" to "Add Lua API function to retrieve node counts for an area" Feb 16, 2026
@kromka-chleba
Contributor Author

Honestly, I find this new API, which is more similar to core.find_nodes_in_area, a bit messy, especially for the use case it was made for.

@sfan5
Member

sfan5 commented Feb 17, 2026

This API is intended to be a low-level building block for ABM/LBM-like systems, so I picked mapblock for simplicity and performance. I could redo this to handle an arbitrary area though.

Yes, I know, but APIs should have the broadest possible usability (to a reasonable extent).

As for why not use core.find_nodes_in_area: the use case of this API would be getting statistics for node counts without the overhead of fetching positions which could get outdated with time

Then this could rather be an extension of core.find_nodes_in_area rather than a new function, no?

Honestly, I find this new API, which is more similar to core.find_nodes_in_area, a bit messy, especially for the use case it was made for.

And why is that?

@kromka-chleba
Contributor Author

kromka-chleba commented Feb 17, 2026

Then this could rather be an extension of core.find_nodes_in_area rather than a new function, no?

You mean as in returning node counts alongside positions (feels redundant) or returning only node counts when some argument passed to the function is true? I believe the latter contradicts the rule of single responsibility and therefore decreases clarity.

And why is that?

The initial implementation of this function returned all the data we may ever need in a single call when given a blockpos, which was pretty straightforward. On the other hand, this implementation forces the user to first construct a list of node names or groups and also pass pos_min and pos_max, which needs a separate helper if we think in terms of mapblocks (more boilerplate).
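
The separate helper mentioned above could be as small as the following sketch. It relies on mapblocks being 16x16x16 nodes; the function name mapblock_bounds is hypothetical.

```lua
-- Mapblocks are 16x16x16 nodes, so block b spans node coordinates
-- [b * 16, b * 16 + 15] on each axis.
local MAP_BLOCKSIZE = 16

-- Hypothetical helper: convert a block position into the node
-- positions of its minimum and maximum corners.
local function mapblock_bounds(blockpos)
	local minp = {
		x = blockpos.x * MAP_BLOCKSIZE,
		y = blockpos.y * MAP_BLOCKSIZE,
		z = blockpos.z * MAP_BLOCKSIZE,
	}
	local maxp = {
		x = minp.x + MAP_BLOCKSIZE - 1,
		y = minp.y + MAP_BLOCKSIZE - 1,
		z = minp.z + MAP_BLOCKSIZE - 1,
	}
	return minp, maxp
end

local minp, maxp = mapblock_bounds({x = -1, y = 0, z = 2})
-- minp = {x = -16, y = 0, z = 32}, maxp = {x = -1, y = 15, z = 47}
```

The two returned positions could then be passed as the first two arguments of core.get_node_counts_in_area, as in the examples in the PR description.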

If you believe this API (or the previous iterations of this API) is not useful then say so and I will close this PR, alternatively propose a better solution. I could also try measuring time for core.find_nodes_in_area vs this API and check if there's any actual gain.

@sfan5
Member

sfan5 commented Feb 17, 2026

You mean as in returning node counts alongside positions

[Image: screenshot of the core.find_nodes_in_area documentation with the grouped=false part highlighted]

The initial implementation of this function returned all the data we may ever need [...]

The reason I asked about wanting all nodes vs. a specific set is to have the broadest possible usability if we were to add this API. More boilerplate seems obvious, but does this fit your use case or not?
And I know you mentioned an LBM/ABM system, but I'm interested in the use cases that sit on top of that.

@kromka-chleba
Contributor Author

kromka-chleba commented Feb 17, 2026

More boilerplate seems obvious, but does this fit your use case or not?

I totally missed that grouped argument in core.find_nodes_in_area. Yes, both core.get_node_counts_in_area and core.find_nodes_in_area do fit my use case then. Using core.find_nodes_in_area over the new API should be fine provided that it isn't noticeably slower.

I'll test these to see if there's a meaningful difference.

@kromka-chleba
Contributor Author

kromka-chleba commented Feb 18, 2026

[Image: Screenshot_2026-02-18_23-30-10]

Some initial benchmarks with this command: https://github.com/kromka-chleba/minetest/blob/9e4f13fdcf9cc4260d0eb7ebc51ed4654bef3531/games/devtest/mods/benchmarks/init.lua#L352

Need to confirm this is not related to execution order and that the AI benchmark doesn't lie.
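
One way to rule out execution-order effects is to time both candidates in both orders. The sketch below is not the linked benchmark; the function names are hypothetical, and os.clock is used so that it runs in plain Lua (inside a mod, core.get_us_time() would be the more usual timer).

```lua
-- Time fn over `runs` calls and return the mean CPU time in seconds.
local function time_mean(fn, runs)
	local t0 = os.clock()
	for _ = 1, runs do
		fn()
	end
	return (os.clock() - t0) / runs
end

-- Measure both candidates in both orders (A, B, then B, A) to expose
-- warm-up or caching effects that depend on which one runs first.
local function compare_both_orders(fn_a, fn_b, runs)
	local a_first = time_mean(fn_a, runs)
	local b_second = time_mean(fn_b, runs)
	local b_first = time_mean(fn_b, runs)
	local a_second = time_mean(fn_a, runs)
	return {
		a = {first = a_first, second = a_second},
		b = {first = b_first, second = b_second},
	}
end
```

If the timings for one candidate differ a lot between its "first" and "second" slots, the order is influencing the result and the comparison should be repeated with more runs.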

@sfan5
Member

sfan5 commented Feb 18, 2026

Note that I highlighted the grouped=false part of the docs.

@kromka-chleba
Contributor Author

[Image: Screenshot_2026-02-18_23-48-47]

Well, I consistently get at least 40x faster execution time with this PR. This alone would make the new API better for the ABM/LBM-like systems use case. Also, I believe that shoving the counting behavior into core.find_nodes_in_area is bad design (doing multiple things badly, confusing interface).

@kromka-chleba
Contributor Author

Note that I highlighted the grouped=false part of the docs.

Right, will recheck then.

@sfan5
Member

sfan5 commented Feb 18, 2026

I mean I'm sure you will find out that your new function is still faster, after all it is doing less work.
My point with this is: is it worth adding another API duplicating something that already exists?

@kromka-chleba
Contributor Author

kromka-chleba commented Feb 18, 2026

I mean I'm sure you will find out that your new function is still faster, after all it is doing less work.

Still between 40-70x faster.

My point with this is: is it worth adding another API duplicating something that already exists?

I don't think 150 ms instead of 2-3 ms is acceptable just to get node counts for an area roughly the size of a mapchunk. For my use case I don't want to get node positions, I just want node counts, as fast as possible. In 150 ms we could process an area the size of a mapchunk several times with a VM. This is wasted time.

Or are you suggesting that I should rework core.find_nodes_in_area to make it faster?

@kromka-chleba
Contributor Author

Also the interface of core.find_nodes_in_area feels messy and overloaded. I honestly had trouble parsing the documentation.

So if I understand this correctly:

If grouped == true the function returns a single table whose keys are node names and whose values are lists of positions.
Example: { ["default:dirt"] = {p1, p2, ...}, ["group:tree"] = {q1, q2, ...}, ... }

If grouped == false the function returns two values:

  1. a flat list of all matching positions (scrambled, arbitrary order?), and
  2. a table of counts keyed by node name.
    Example: pos_list, counts = {p1, p2, p3, ...}, { ["default:dirt"] = 4123, ["group:tree"] = 5, ... }

So you can either get grouped position lists (no counts provided by the API) or a flat position list plus exact counts.

This API is asymmetric: grouped == true gives grouped positions but no counts (you can compute counts afterward with #list), while grouped == false gives counts but mixes all positions together (so you lose an immediate mapping from positions to node type). That makes the two modes useful for different use cases, but awkward if you need both grouped positions and counts directly — you’d have to compute counts yourself from the grouped result or rebuild groups from the flat result.

So I could reverse the logic here: why stuff 2 use cases into a single function if there are 2 other use cases not covered by it (the above example and my use case) AND the proposed API can cover a huge chunk of this functionality in just 2 ms? Time saved by piggybacking the counting feature into the node finding feature is marginal and IMO these two should be separated and grouped == false functionality should be deprecated.
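
To illustrate the "compute counts afterward with #list" workaround, here is a small sketch that derives counts from a table shaped like the grouped == true result as described above. The helper name counts_from_grouped and the sample data are hypothetical.

```lua
-- Derive per-name counts from a table shaped like the grouped == true
-- result described above: {[name] = {pos1, pos2, ...}, ...}.
local function counts_from_grouped(grouped)
	local counts = {}
	for name, positions in pairs(grouped) do
		counts[name] = #positions
	end
	return counts
end

-- Stand-in data; in a mod this table would come from
-- core.find_nodes_in_area(minp, maxp, nodenames, true).
local sample = {
	["default:dirt"] = {
		{x = 0, y = 0, z = 0},
		{x = 1, y = 0, z = 0},
	},
	["default:stone"] = {
		{x = 0, y = -1, z = 0},
	},
}

local counts = counts_from_grouped(sample)
-- counts["default:dirt"] is 2 and counts["default:stone"] is 1.
```

Note that this still pays the cost of building every position table first, which is the overhead the proposed function is meant to avoid.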

@ryvnf
Contributor

ryvnf commented Feb 20, 2026

What are you working on that requires reimplementing custom lbm-like systems? And what would it do that does not require fetching the node positions in some other way? I agree with sfan5 that something being faster (even significantly) is not enough unless it has a broad use case that is not covered by existing API functionality.

The PR could probably also use a devtest benchmarking command to measure the difference (like bulk_swap_node).

@kromka-chleba
Contributor Author

kromka-chleba commented Feb 20, 2026

What are you working on that requires reimplementing custom lbm-like systems?

I want to implement seasonal changes for Exile which requires changing soil to a dark winter variant, placing snow, etc. up to the horizon. ABMs/LBMs only work in the active block area which is smaller than the area of all loaded blocks, they're also slower because they work on single nodes, while my custom system spawns a Voxel Manipulator for the whole mapblock. They also suffer from other performance issues such as #14011
I already have a prototype of this working both with ugly hacks and with mapblock callbacks #16910

EDIT: Also until/unless #15992 gets merged, ABMs are really inflexible.

And what would it do that does not require fetching the node positions in some other way?

As mentioned above, I use VMs and need it to work for all loaded blocks. The voxel manipulator can detect positions by just iterating over the flat array. I believe passing positions to a VM could be actually slower. core.find_nodes_in_area needs 150 ms to do the job but I recall in a VM the same job can be done in 30 ms or lower.

I agree with sfan5 that something being faster (even significantly) is not enough unless it has a broad use case that is not covered by existing API functionality.

The limit for ABMs is 200 ms per global step. Meanwhile, with core.find_nodes_in_area we're getting data for only a single mapchunk in ~150 ms. Scanning mapblocks as fast as possible would help me label mapblocks efficiently to process them later.

Will ABMs be changed to work for all loaded blocks? I doubt it. Are they working for my use case? No, they're slow which is why I tried implementing this #16910
Just because some people are happy with ABMs (I'd like to meet them), doesn't mean I should be held back.

TL;DR I wouldn't be wasting my time writing these PRs if what we already have was enough for my use case.

The PR could probably also use a devtest benchmarking command to measure the difference (like bulk_swap_node).

I had it on another branch but can add it here also.

EDIT: I included the benchmark, it could be removed later or split into two benchmarks.

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kromka-chleba <65868699+kromka-chleba@users.noreply.github.com>
@ryvnf
Contributor

ryvnf commented Feb 20, 2026

I want to implement seasonal changes for Exile which requires changing soil to a dark winter variant, placing snow, etc. up to the horizon. ABMs/LBMs only work in the active block area which is smaller than the area of all loaded blocks, they're also slower because they work on single nodes, while my custom system spawns a Voxel Manipulator for the whole mapblock. They also suffer from other performance issues such as #14011 I already have a prototype of this working both with ugly hacks and with mapblock callbacks #16910

I think that use case sounds reasonable. I also saw the mapblock callback API and I agree it would be useful.

Also off-topic but I think an API to swap the node textures of clients at run time would be useful. That could for example be used for seasons by making green leaves brown ones during autumn without a performance cost. It would also work for any render distance. I have not bothered making a feature request for it because I am not working on anything that would use it. I understand your use case is broader than that (placing snow etc), but still want to throw out the idea.

@kromka-chleba
Contributor Author

Also off-topic but I think an API to swap the node textures of clients at run time would be useful.

I considered that and sfence implemented something like this: #13811
It would be good for a lot of use cases, though the advantage of the VM-based system is that we can also change node properties such as sounds and tool groups (frosty soil being harder to dig).

@ryvnf
Contributor

ryvnf commented Feb 20, 2026

I considered that and sfence implemented something like this: #13811 It would be good for a lot of use cases, though the advantage of the VM-based system is that we can also change node properties such as sounds and tool groups (frosty soil being harder to dig).

I was more thinking about something like core.override_node but at runtime, so you can swap out the textures and such without updating the map.
