-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Replace inline type IDs with global constants in LLVM IR #15485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace inline type IDs with global constants in LLVM IR #15485
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great finding 🏆
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/incremental-compilation-exploration/5224/119 |
This change seems to have caused some segfaults in Athena spes: https://github.com/athena-framework/athena/actions/runs/13468376240/job/37638710966. I'm working on reducing it, but for now it can be reproduced by updating https://github.com/athena-framework/athena/blob/34ae7deb253c7f9e6ea29d7b06a1656429cdbd00/src/components/framework/spec/listeners/error_spec.cr#L57-L65 to be like: it "logs HTTPExceptions as warning", focus: true do
p ATH::Events::Exception.new ATH::Request.new("GET", "/"), MockException.new "Something went wrong"
end Then execute it via: /path/to/locally/built/compiler src/components/framework/spec/listeners/error_spec.cr Results in: #<Athena::Framework::Events::Exception:0x7ffff7100d20 @propagation_stopped=false, @request=#<Athena::Framework::Request:0x7ffff70d4d80 @action=nil, @attributes=Athena::Framework::ParameterBag(@parameters={}), @request_data=nil, @request_format=nil, @request=#<HTTP::Request:0x7ffff70d3ee0 @method="GET", @headers=HTTP::Headers{}, @body=nil, @version="HTTP/1.1", @cookies=nil, @query_params=nil, @form_params=nil, @uri=nil, @remote_address=nil, @local_address=nil, @resource="/">, @is_host_valid=true, @is_forwarded_valid=true, @scheme="http", @trusted_values_cache={}>, @response=nil, @exception=#<Invalid memory access (signal 11) at address 0x0
[0x5555559e797a] *Exception::CallStack::print_backtrace:Nil +122 in ./coverage/bin/framework
[0x555555962743] ~procProc(Int32, Pointer(LibC::SiginfoT), Pointer(Void), Nil) +275 in ./c overage/bin/framework
[0x7ffff715d1d0] ?? +140737338790352 in /usr/lib/libc.so.6
[0x555555a6abd3] *Slice(T)::new:read_only<Pointer(UInt8), Int32, Bool>:Slice(UInt8) +35 in ./coverage/bin/framework
[0x5555559e86a5] *String#to_slice:Slice(UInt8) +37 in ./coverage/bin/framework
[0x5555559ea9f8] *String#to_s<String::Builder>:Nil +24 in ./coverage/bin/framework
[0x555555a59efb] *String::Builder +27 in ./coverage/bin/framework
[0x555555b615e2] *Exception+ +34 in ./coverage/bin/framework
[0x555555b636da] *Exception+ +74 in ./coverage/bin/framework
[0x555556333adc] ?? +93825006779100 in ./coverage/bin/framework I think the issue has to do with me defining |
…g#15485) Introduction and removal of types shifts the assignment of type IDs. They are numerical IDs in sequential order, with types defined later in the source code or the compiler receiving larger IDs than their sibling types. Type IDs used to be inlined into the LLVM module, which can invalidate large portions of the object cache due to the change of type IDs. This patch does not fight against the type ID assignment. It merely stops the inlining: ```llvm @"Foo0:type_id" = external constant i32 ; Function Attrs: uwtable define ptr @"*Foo0@Reference::new:Foo0"() #0 { ; ... %1 = getelementptr inbounds %Foo0, ptr %0, i32 0, i32 0 %2 = load i32, ptr @"Foo0:type_id", align 4 store i32 %2, ptr %1, align 4 ; ... } ``` Global variables are required to be addressable in LLVM, so this creates an actual constant in the read-only section, hence the extra `load`. The actual compile-time value is now defined in `_main`: ```llvm @"Foo0:type_id" = constant i32 6 ``` With this simple trick, the object file cache can be reused more often. This will hopefully improve build times in certain scenarios, such as rapid prototyping, and IDE integrations that run the whole compiler.
The compiler places the string contents of all symbols defined in the source code into a special `:symbol_table` LLVM global variable in the main LLVM module: ```llvm @":symbol_table" = global [24 x ptr] [ ptr @"'general'", ptr @"'no_error'", ptr @"'gc'", ptr @"'sequentially_consis...'", ptr @"'monotonic'", ptr @"'acquire'", ptr @"'evloop'", ptr @"'xchg'", ptr @"'release'", ptr @"'acquire_release'", ptr @"'io_write'", ptr @"'sched'", ptr @"'io_read'", ptr @"'skip'", ptr @"'none'", ptr @"'active'", ptr @"'done'", ptr @"'unchecked'", ptr @"'sleep'", ptr @"'relaxed'", ptr @"'add'", ptr @"'sub'", ptr @"'file'", ptr @"'target'" ] ``` Every LLVM module, without exception, also declares it: ```llvm @":symbol_table" = external global [24 x ptr] ``` Since the total number of symbols is part of the variable type, adding or removing any symbol invalidates all object files in the Crystal cache, even though only `Symbol#to_s`, or rather `@[Primitive(:symbol_to_s)]`, has access to this variable. This PR removes this declaration except where the primitive accesses it. Changes to the symbol table will only affect LLVM modules that call `Symbol#to_s` (the primitive body itself is inlined). Building an empty file, replacing it with `x = :abcde`, then rebuilding the same file will now give: ``` Codegen (bc+obj): - 315/317 .o files were reused These modules were not reused: - _main (_main.o0.bc) - Crystal (C-rystal.o0.bc) ``` (`Crystal` is also invalidated because the return type of `Crystal.main` is now `Symbol` instead of `Nil`; appending `nil` to the file makes this module reusable.) Note that it is still possible to invalidate large portions of the cache from the addition or removal of symbols, because their indices are allocated sequentially, and inlined on every use. Something like #15485 but for `Symbol` values would solve that.
…ng#15486) The compiler places the string contents of all symbols defined in the source code into a special `:symbol_table` LLVM global variable in the main LLVM module: ```llvm @":symbol_table" = global [24 x ptr] [ ptr @"'general'", ptr @"'no_error'", ptr @"'gc'", ptr @"'sequentially_consis...'", ptr @"'monotonic'", ptr @"'acquire'", ptr @"'evloop'", ptr @"'xchg'", ptr @"'release'", ptr @"'acquire_release'", ptr @"'io_write'", ptr @"'sched'", ptr @"'io_read'", ptr @"'skip'", ptr @"'none'", ptr @"'active'", ptr @"'done'", ptr @"'unchecked'", ptr @"'sleep'", ptr @"'relaxed'", ptr @"'add'", ptr @"'sub'", ptr @"'file'", ptr @"'target'" ] ``` Every LLVM module, without exception, also declares it: ```llvm @":symbol_table" = external global [24 x ptr] ``` Since the total number of symbols is part of the variable type, adding or removing any symbol invalidates all object files in the Crystal cache, even though only `Symbol#to_s`, or rather `@[Primitive(:symbol_to_s)]`, has access to this variable. This PR removes this declaration except where the primitive accesses it. Changes to the symbol table will only affect LLVM modules that call `Symbol#to_s` (the primitive body itself is inlined). Building an empty file, replacing it with `x = :abcde`, then rebuilding the same file will now give: ``` Codegen (bc+obj): - 315/317 .o files were reused These modules were not reused: - _main (_main.o0.bc) - Crystal (C-rystal.o0.bc) ``` (`Crystal` is also invalidated because the return type of `Crystal.main` is now `Symbol` instead of `Nil`; appending `nil` to the file makes this module reusable.) Note that it is still possible to invalidate large portions of the cache from the addition or removal of symbols, because their indices are allocated sequentially, and inlined on every use. Something like crystal-lang#15485 but for `Symbol` values would solve that.
The compile time optimization to extract type IDs into global constants (#15485) means that types with the same name are now merged into one. This revealed a bug where two distinct types could share the same name and now would have the same ID and be considered a single type.
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/crystal-1-16-2-is-released/8013/8 |
Consider the following snippet:
We are going to compile this twice, first with a cold cache, then with
-Dbar
:These times suggest that the addition of
Bar
completely invalidates the object cache, and indeed, if you pass also--stats
to the second compilation, it would sayno previous .o files were reused
. How could this be the case when none of theFoo
s depend onBar
?This counterintuitive behavior arises from the way Crystal separates LLVM IR into LLVM modules. Each non-generic type or generic instance has its own LLVM module containing all of that type or instance's class and instance methods, and then the rest goes to a special LLVM module called
_main
. The bytecode forFoo0
can be disassembled back to LLVM IR usingllvm-dis
:The line
store i32 7, ptr %1, align 4
is whereFoo0.allocate
storesFoo0.crystal_type_id
to the newly allocated memory area; this7
corresponds to the compile-time value ofFoo0.crystal_type_id
. If we drop-Dbar
again, the same value now becomes6
.The compiler component responsible for generating these type IDs is the
Crystal::LLVMId
class; it assigns numerical IDs in sequential order, with types defined later in the source code or the compiler receiving larger IDs than their sibling types. In particular, all structs have larger type IDs than every class. (You can see this information by setting the environment variableCRYSTAL_DUMP_TYPE_ID
to 1 during compilation.) Hence, by definingBar
at the beginning of the file, we have incremented the type ID of every singleFoo
by 1, and the inlining breaks the cache.If we move
Bar
to the bottom of the file, then recompilations will be able to reuse theFoo
object files, because their type IDs remain untouched. In practice, however, the splitting of source code into separate files renders this specific workaround nearly impossible to pull off, not to mention that other constructs liketypeof
andis_a?
also inline type IDs, apart fromReference.new
. In short, if your code tries to remove theNil
from anInt32?
, its cache will get invalidated any time you add or remove a class.This PR does not fight against the type ID assignment. It merely stops the inlining:
Global variables are required to be addressable in LLVM, so this creates an actual constant in the read-only section, hence the extra
load
. The actual compile-time value is now defined in_main
:With this simple trick, the object file cache is now working as intended:
As another example, we compile an empty file with the standard prelude, then add
class Foo; end; Foo.new
to it and recompile. These are the times:For an even larger codebase, we try this modification in
src/compiler/crystal/codegen/codegen.cr
of the Crystal compiler itself:The times are:
This will hopefully improve build times in certain scenarios, such as rapid prototyping, and IDE integrations that run the whole compiler.