Description
Godot has longer compile times than we'd like. According to previous analyses, the main cause of this is parsing of header files.
See https://benchmarks.godotengine.org/graph/build-time/ for progress on this task.
Workaround
If you're just looking to decrease your personal compile time during development, use scu_build=yes with SCons:
scons scu_build=yes
If you're looking to help decrease the compile time for everyone, keep reading.
Methodology
To most effectively reduce the amount of cross includes between our headers, we need metrics. I came up with two:
- Fresh compile cost: An estimate of the from-scratch compile time caused by the header. It is calculated by adding up self time across all compile units' time traces, plus partial responsibility for includees.
  - Optimizing this is useful for all contributors, and anyone else compiling the engine.
- Recompile cost: An estimate of each file's potential to cause recompiles. It is calculated from self time across all compile units' time traces, weighted by file size (to estimate churn potential), plus partial responsibility for includees.
  - Optimizing this is useful for regular contributors and CI.
Both are mainly based on clang's -ftime-trace output and include trees (the second one also uses file size to estimate churn potential). I've consolidated my efforts into a public tool:
https://github.com/Ivorforce/clang-project-profiler
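To make the "self time" part of both metrics concrete, here is a minimal sketch. It is not taken from the tool above: the event shape (`file`, `ts`, `dur` in microseconds) and the function names are assumptions for illustration, loosely modelled on the per-file events in a time trace. It only sums self time per header across compile units. The "partial responsibility" and file-size weighting are not modelled.

```python
from collections import defaultdict


def self_times(events):
    """Self time per file for one compile unit.

    `events` is a hypothetical list of dicts with 'file', 'ts' and 'dur'
    (microseconds). Self time is an event's duration minus the duration
    of the events nested directly inside it.
    """
    # Sort by start time; on ties, the longer (outer) event comes first.
    ordered = sorted(events, key=lambda e: (e["ts"], -e["dur"]))
    totals = defaultdict(int)
    open_events = []  # enclosing events as (end_ts, file)
    for event in ordered:
        # Close every event that ended before this one started.
        while open_events and open_events[-1][0] <= event["ts"]:
            open_events.pop()
        totals[event["file"]] += event["dur"]
        if open_events:
            # Time spent in a nested include is not the parent's self time.
            totals[open_events[-1][1]] -= event["dur"]
        open_events.append((event["ts"] + event["dur"], event["file"]))
    return dict(totals)


def total_self_times(compile_units):
    """Sum self time per file across all compile units."""
    totals = defaultdict(int)
    for events in compile_units:
        for file, micros in self_times(events).items():
            totals[file] += micros
    return dict(totals)


# Hypothetical data: a.h includes b.h and c.h, and c.h includes d.h.
unit_a = [
    {"file": "a.h", "ts": 0, "dur": 100},
    {"file": "b.h", "ts": 10, "dur": 30},
    {"file": "c.h", "ts": 50, "dur": 20},
    {"file": "d.h", "ts": 55, "dur": 5},
]
unit_b = [{"file": "b.h", "ts": 0, "dur": 10}]

totals = total_self_times([unit_a, unit_b])
for file, micros in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{file}: {micros}us")
```

A header that is cheap on its own but included by many compile units can still rank high, because its self time is counted once per unit.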
I also used betweenness centrality in an earlier version. You can find the source code for that here:
measure_includes_centrality.py
#!/usr/bin/env python3
import io
import json
import subprocess
import os
import sys
import pathlib
import shlex
from collections import deque, defaultdict
import concurrent.futures
import multiprocessing


def graph_from_compile_entry(idx: int, entry: dict):
    directory = entry.get("directory", os.getcwd())
    command = entry.get("command")
    arguments = entry.get("arguments")

    # Prefer 'arguments' if present, otherwise split 'command'
    if arguments:
        cmd = arguments
    else:
        cmd = shlex.split(command)

    # Print progress for every tenth entry
    if idx % 10 == 0:
        print(f"{idx:04d}", cmd[2])

    # Drop cmd[1] and cmd[2] (presumably '-o <output>') and ask the
    # compiler to print the include tree instead
    cmd = [cmd[0]] + cmd[3:] + ["--trace-includes"]
    result = subprocess.run(cmd, cwd=directory, check=True, capture_output=True, text=True)
    # Not sure why but it prints to stderr
    graph = parse_graph(result.stderr)
    return graph


def parse_graph(input_str: str) -> dict[str, set[str]]:
    graph: dict[str, set[str]] = defaultdict(set)
    stack: list[tuple[int, str]] = []

    for line in input_str.strip().splitlines():
        # The number of leading dots is the include depth
        depth = len(line) - len(line.lstrip('.'))
        node = line[depth:].strip()

        # Adjust the stack to the current depth
        while stack and stack[-1][0] >= depth:
            stack.pop()

        # If there is a parent node, link it
        if stack:
            graph[stack[-1][1]].add(node)

        stack.append((depth, node))

    return dict(graph)


def betweenness_centrality(graph: dict[str, set[str]], directed: bool = False) -> dict[str, float]:
    # Include isolated nodes
    all_nodes = set(graph.keys()) | {n for nbrs in graph.values() for n in nbrs}
    graph = {n: graph.get(n, set()) for n in all_nodes}

    centrality = dict.fromkeys(graph.keys(), 0.0)

    for s in graph:
        # Breadth-first search from s, counting shortest paths
        stack = []
        predecessors = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0.0)
        sigma[s] = 1.0
        distance = dict.fromkeys(graph, -1)
        distance[s] = 0
        queue = deque([s])

        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if distance[w] < 0:
                    queue.append(w)
                    distance[w] = distance[v] + 1
                if distance[w] == distance[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)

        # Accumulate dependencies in order of decreasing distance from s
        delta = dict.fromkeys(graph, 0.0)
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                delta[v] += (sigma[v] / sigma[w]) * (1.0 + delta[w])
            if w != s:
                centrality[w] += delta[w]

    # Normalise
    n = len(graph)
    if n > 2:
        scale = 1.0 / ((n - 1) * (n - 2))
        if not directed:
            scale *= 0.5
        for v in centrality:
            centrality[v] *= scale

    return centrality


def main():
    # Default filename
    filename = "compile_commands.json"
    if len(sys.argv) > 1:
        filename = sys.argv[1]

    # Read the compile_commands.json file
    try:
        with open(filename, "r") as f:
            compile_commands = json.load(f)
    except Exception as e:
        print(f"Error reading {filename}: {e}")
        sys.exit(1)

    print(f"Starting {len(compile_commands)} commands...")

    all_graph: dict[str, set[str]] = {}
    failure_count = 0
    executor = concurrent.futures.ProcessPoolExecutor(multiprocessing.cpu_count())
    futures = [executor.submit(graph_from_compile_entry, *item) for item in enumerate(compile_commands)]
    completed, not_completed = concurrent.futures.wait(futures)

    # Merge the per-object-file include graphs into one
    for future in completed:
        try:
            graph = future.result()
            for includer, targets in graph.items():
                prev_targets: set[str] = all_graph.setdefault(includer, set())
                prev_targets.update(targets)
        except Exception:
            failure_count += 1

    centrality = betweenness_centrality(all_graph)
    # Keep only project-relative files with a non-zero score
    centrality = { key: value for key, value in centrality.items() if key.startswith("./") and value > 0 }
    centrality_list = list(centrality.items())
    centrality_list.sort(key=lambda kv: kv[1], reverse=True)
    final_string = "\n".join(f"{kv[0]}: {kv[1]}" for kv in centrality_list)

    print(f"Done. {failure_count} object files failed to analyze.")
    result_path = pathlib.Path("./betweenness-centrality.txt")
    result_path.write_text(final_string)
    print(result_path.absolute())


if __name__ == "__main__":
    main()
Tracker
Below is the top 100 list of headers that should be investigated.
Please remember that not all of these files can be 'fixed'. Also, note that this list was gathered on macOS. On other systems, libcpp entries will evaluate differently.
last update: 2026-02-21
Fresh compile cost
core/object/ref_counted.h: 176s
core/object/object.h: 164s
core/variant/variant.h: 163s
core/io/resource.h: 160s
core/object/class_db.h: 135s
core/variant/binder_common.h: 103s
scene/gui/control.h: 77s
scene/main/node.h: 62s
core/string/ustring.h: 60s
core/object/gdvirtual.gen.h: 59s
scene/main/canvas_item.h: 59s
core/variant/type_info.h: 56s
core/templates/safe_refcount.h: 56s
servers/rendering/rendering_server.h: 56s
core/typedefs.h: 53s
editor/plugins/editor_plugin.h: 52s
core/object/method_bind.h: 52s
tests/test_macros.h: 49s
<libcpp>/atomic: 47s
core/math/math_funcs.h: 47s
core/io/resource_loader.h: 45s
core/templates/hashfuncs.h: 45s
scene/resources/texture.h: 44s
core/templates/cowdata.h: 44s
core/io/resource_uid.h: 43s
core/io/image.h: 43s
scene/3d/node_3d.h: 41s
core/object/worker_thread_pool.h: 40s
<libcpp>/__memory/shared_ptr.h: 39s
scene/resources/mesh.h: 39s
core/object/callable_method_pointer.h: 38s
editor/editor_node.h: 38s
core/templates/vector.h: 35s
scene/3d/camera_3d.h: 35s
<libcpp>/condition_variable: 34s
thirdparty/doctest/doctest.h: 34s
core/os/mutex.h: 34s
<libcpp>/mutex: 34s
scene/resources/material.h: 33s
core/os/thread_safe.h: 33s
scene/resources/theme.h: 32s
servers/display/display_server.h: 32s
scene/gui/container.h: 31s
<libcpp>/functional: 31s
<libcpp>/string: 31s
<libcpp>/cmath: 31s
core/object/script_language.h: 30s
core/os/os.h: 29s
core/variant/variant_internal.h: 29s
core/input/input_event.h: 28s
Recompile cost
core/variant/variant.h: 248s
core/object/object.h: 215s
core/object/ref_counted.h: 215s
core/object/class_db.h: 211s
core/io/resource.h: 190s
core/variant/binder_common.h: 154s
scene/gui/control.h: 100s
servers/rendering/rendering_server.h: 98s
scene/main/canvas_item.h: 82s
core/object/method_bind.h: 77s
core/variant/type_info.h: 74s
core/object/gdvirtual.gen.h: 74s
scene/main/node.h: 70s
core/string/ustring.h: 69s
core/object/callable_method_pointer.h: 63s
servers/display/display_server.h: 57s
core/io/image.h: 54s
core/io/resource_uid.h: 50s
core/variant/variant_internal.h: 49s
core/os/os.h: 47s
scene/3d/node_3d.h: 45s
editor/plugins/editor_plugin.h: 42s
scene/gui/container.h: 39s
scene/resources/texture.h: 39s
scene/resources/theme.h: 37s
editor/editor_node.h: 37s
core/math/vector3.h: 36s
scene/resources/mesh.h: 34s
core/templates/vector.h: 34s
core/templates/hashfuncs.h: 33s
scene/3d/camera_3d.h: 32s
core/variant/typed_array.h: 32s
core/string/string_name.h: 32s
servers/text/text_server.h: 29s
tests/test_macros.h: 29s
core/variant/method_ptrcall.h: 29s
core/math/transform_3d.h: 29s
core/math/math_funcs.h: 28s
core/templates/cowdata.h: 28s
core/io/logger.h: 27s
core/object/script_instance.h: 27s
scene/resources/font.h: 26s
core/input/input_event.h: 25s
scene/resources/material.h: 25s
core/variant/dictionary.h: 24s
scene/2d/node_2d.h: 24s
core/error/error_macros.h: 24s
core/math/face3.h: 24s
core/templates/hash_map.h: 24s
core/io/resource_loader.h: 23s
How to contribute
You can contribute by picking out any of the above files, and investigating its contents, includes, and includers. Some ideas:
- Can any include be removed? Prioritize includees that also appear on the list, otherwise your change might not be impactful.
- Can some includers be changed not to include this file? Prioritize includers that also appear on the list, otherwise your change might not be impactful.
- Can this file be split into independent headers with less contents or includes?
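As a starting point for the first two ideas, here is a minimal sketch. It assumes you already have an include graph as a dict mapping each file to the set of files it includes directly (the shape produced by `parse_graph` in the script above). The helper names and the file names in the example are hypothetical. It lists a header's includees and includers, putting those that also appear on the tracker list first.

```python
from collections import defaultdict


def includers_of(graph):
    """Invert an include graph: map each file to the files that include it."""
    includers = defaultdict(set)
    for includer, includees in graph.items():
        for includee in includees:
            includers[includee].add(includer)
    return dict(includers)


def candidates(graph, header, tracked):
    """Return (includees, includers) of `header`, tracked files first."""
    def order(files):
        # False sorts before True, so tracked files come first.
        return sorted(files, key=lambda f: (f not in tracked, f))

    includees = order(graph.get(header, set()))
    includers = order(includers_of(graph).get(header, set()))
    return includees, includers


# Hypothetical include graph and tracker list, for illustration only.
graph = {
    "widget.h": {"base.h", "util.h"},
    "panel.h": {"widget.h"},
    "dialog.h": {"widget.h", "util.h"},
}
tracked = {"base.h", "panel.h"}

includees, includers = candidates(graph, "widget.h", tracked)
print("includes:", includees)
print("included by:", includers)
```

This only covers direct includes. A header can still reach a file transitively, so recompiling remains the real check.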
If you believe there's nothing that can be improved about a header, please leave a comment saying so on this tracker.
If you manage to remove an include, recompile the engine. It is likely that some files will fail to compile because they're now missing includes. In small numbers, this is expected. However, if too many files fail to compile, it may be time to reconsider: this may mean that the include was logical after all, and that most files that include the header you were trying to improve also need the include you removed.
In some cases, you may want to guard against regressions. As a guideline, you should guard against a regression if it is illogical for the header to need to include the file / type (and you expect it to happen by accident). To do this, use STATIC_ASSERT_INCOMPLETE_TYPE in the associated .cpp file of the header. You can find some examples of how it's done in the repository.
Lines 453 to 458 in 6d33ad2
/// Enforces the requirement that a class is not fully defined.
/// This can be used to reduce include coupling and keep compile times low.
/// The check must be made at the top of the corresponding .cpp file of a header.
#define STATIC_ASSERT_INCOMPLETE_TYPE(m_keyword, m_type) \
	m_keyword m_type; \
	static_assert(!is_fully_defined_v<m_type>, #m_type " was unexpectedly fully defined. Please check the include hierarchy of '" __FILE__ "' and remove includes that resolve the " #m_keyword ".");
Finally, if you feel comfortable with your changes, submit a PR, and link back to this tracker. With enough of these kinds of PRs, hopefully we can decrease Godot's compile time to a minimum.