-
Notifications
You must be signed in to change notification settings - Fork 56
feat: add support for LLVM IR files in the linker #323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat!
@tamird reviewed 4 of 5 files at r1, all commit messages.
Reviewable status: 4 of 5 files reviewed, 10 unresolved discussions (waiting on @alessandrod)
-- commits line 2 at r1:
it would be great to include your motivation here
src/llvm/mod.rs line 144 at r1 (raw file):
linked } #[must_use]
newline between functions please
src/llvm/mod.rs line 148 at r1 (raw file):
context: &'ctx LLVMContext, module: &mut LLVMModule<'ctx>, buffer: &CStr,
why does this need a c-string? LLVMCreateMemoryBufferWithMemoryRange takes a pointer and a length, i think this can be &[u8]
src/llvm/mod.rs line 176 at r1 (raw file):
linked = unsafe { LLVMLinkModules2(module.as_mut_ptr(), temp_module) } == 0; } else { if !error_msg.is_null() {
we should return the error message.
src/llvm/mod.rs line 179 at r1 (raw file):
unsafe { LLVMDisposeMessage(error_msg) }; } if !temp_module.is_null() {
this should be impossible, no?
Cargo.toml line 43 at r1 (raw file):
rustc-build-sysroot = { workspace = true } which = { version = "8.0.0", default-features = false, features = ["real-sys", "regex"] } tempfile = "3.13"
could you keep this alphabetical please and use { version = ... } for consistency?
src/linker.rs line 514 at r1 (raw file):
// buffer used to perform file type detection let mut buf = [0u8; 1024];
could you add a comment here, or maybe make this more general so that this arbitrary size isn't needed? e.g. i wonder if BufRead would help?
src/linker.rs line 600 at r1 (raw file):
} InputType::Ir => { data.push(0); // force push null terminator
I think you can just use CString::new(data) which will internally append the nul byte.
src/linker.rs line 914 at r1 (raw file):
Some(position) => &data[position..], None => return false, };
Code quote:
// Trim whitespace from the start of the data
let trimmed = match data.iter().position(|b| !b.is_ascii_whitespace()) {
Some(position) => &data[position..],
None => return false,
};src/linker.rs line 923 at r1 (raw file):
|| trimmed.starts_with(b"target ") || trimmed.starts_with(b"define") || trimmed.starts_with(b"!llvm")
might be easier to read as [.....].iter().any(|prefix| trimmed.starts_with(prefix))
Code quote:
trimmed.starts_with(b"; ModuleID")
|| trimmed.starts_with(b"target triple")
|| trimmed.starts_with(b"target datalayout")
|| trimmed.starts_with(b"source_filename")
|| trimmed.starts_with(b"target ")
|| trimmed.starts_with(b"define")
|| trimmed.starts_with(b"!llvm")
BretasArthur1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 5 files reviewed, 10 unresolved discussions (waiting on @alessandrod and @tamird)
src/linker.rs line 514 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
could you add a comment here, or maybe make this more general so that this arbitrary size isn't needed? e.g. i wonder if BufRead would help?
What you suggest as a comment for this one?
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 5 files reviewed, 10 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/linker.rs line 514 at r1 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
What you suggest as a comment for this one?
I would avoid the arbitrary-size buffer if possible - in the case of IR you could have unbounded whitespace before the thing you look for in is_llvm_ir.
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tamird reviewed 1 of 4 files at r2, all commit messages.
Reviewable status: 2 of 5 files reviewed, 11 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/linker.rs line 915 at r2 (raw file):
fn is_llvm_ir(data: &[u8]) -> bool { let trimmed = data.trim_ascii_start(); if trimmed.is_empty() {
this check is not needed, right?
src/llvm/mod.rs line 152 at r2 (raw file):
let buffer_name = c"ir_buffer"; let buffer = buffer.to_bytes(); let mem_buffer = unsafe {
don't you need to unsafe { LLVMDisposeMemoryBuffer(buffer) }; like the function above?
src/llvm/mod.rs line 157 at r2 (raw file):
buffer.len(), buffer_name.as_ptr(), 1, // LLVM internally sets RequiresTerminator=true
I am now confused by this. Can we just set it to 0 and then we wouldn't need to create a C-string?
BretasArthur1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 5 files reviewed, 11 unresolved discussions (waiting on @alessandrod and @tamird)
src/linker.rs line 915 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
this check is not needed, right?
Now that we switched to the approach of no hardcoded buffer I need to remove this, sorry
src/llvm/mod.rs line 152 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
don't you need to
unsafe { LLVMDisposeMemoryBuffer(buffer) };like the function above?
On some tests I was doing this but, if we drop the buffer here it can't hold the internal reference to it and perform the checks, but this was with the previous approach, I need to check now!
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
I am now confused by this. Can we just set it to 0 and then we wouldn't need to create a C-string?
So because of the llvm ir parser we need to set this to one because of that hardcoded value setting null termination to true even if the buffer don't have it... This was a issue alessandro found it with llvm in debug mode, but maybe he can answer this better
cc @alessandrod
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 5 files reviewed, 11 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
So because of the llvm ir parser we need to set this to one because of that hardcoded value setting null termination to true even if the buffer don't have it... This was a issue alessandro found it with llvm in debug mode, but maybe he can answer this better
cc @alessandrod
Now that I am looking at the screenshot he posted, I think that was just because of this 1. If you change it to 0 I think you do not need a null terminator.
BretasArthur1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 5 files reviewed, 11 unresolved discussions (waiting on @alessandrod and @tamird)
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Now that I am looking at the screenshot he posted, I think that was just because of this
1. If you change it to0I think you do not need a null terminator.
That print was testing with 0
|
Done, removed the unnecessary check and also tested with |
|
hey @tamird gm! any blockers here ? |
|
@tamird wrt the memory buffer thing, I think we actually need to change all the other instances to be null terminated The RequiresNullTerminator field isn't stored anywhere, it's only used to do a check when the buffer is created https://github.com/llvm/llvm-project/blob/bde90624185ea2cead0a8d7231536e2625d78798/llvm/lib/Support/MemoryBuffer.cpp#L48 Then this is why we get the assertion in the screenshot: LLVMParseIRInContext => parseIR => parseAssembly => parseAssemblyInto
so RequiresNullTerminator=true and we hit the assertion if we don't null terminate our buffer I think in the other instances we're lucky and we're not hitting the assertion by accident |
src/linker.rs
Outdated
| let mut buf = BufReader::new(input); | ||
|
|
||
| // Peek at the buffer to determine file type | ||
| let preview = buf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is still an arbitrary size buffer (the size of the internal BufReader
buffer which is 4096 by default)
You don't need BufReader here. What you want to do is pass input to
detect_input_type instead of passing &[u8]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, just saw on the implementation
"BufReader can improve the speed of programs that make small and repeated read calls to the same file or network socket. It does not help when reading very large amounts at once, or reading just one or a few times. It also provides no advantage when reading from a source that is already in memory, like a Vec"
| ) -> Result<bool, String> { | ||
| let buffer_name = c"ir_buffer"; | ||
| let buffer = buffer.to_bytes(); | ||
| let mem_buffer = unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're leaking this you need to call LLVMDisposeMemoryBuffer before returning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
tests/ir_file_test.rs
Outdated
|
|
||
| // Corrupting IR content | ||
| let invalid_content = valid_content | ||
| .replace("define", "defXne") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to corrupt 3 things, pick one :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tamird reviewed 1 of 3 files at r4, all commit messages.
Reviewable status: 3 of 5 files reviewed, 16 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/llvm/mod.rs line 152 at r2 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
On some tests I was doing this but, if we drop the buffer here it can't hold the internal reference to it and perform the checks, but this was with the previous approach, I need to check now!
as @alessandrod said below, you are leaking here
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
That print was testing with 0
Could you please add a citation? i.e. please link to the relevant LLVM code.
src/linker.rs line 885 at r5 (raw file):
} fn detect_input_type(reader: &mut (impl Read + Seek)) -> Option<InputType> {
i think you don't need the parenthesis around Read + Seek
src/linker.rs line 887 at r5 (raw file):
fn detect_input_type(reader: &mut (impl Read + Seek)) -> Option<InputType> { let mut header = [0u8; 8]; let bytes_read = reader.read(&mut header).ok()?;
please don't use result.ok -- this discards information. why is it ok to throw away this error?
src/linker.rs line 898 at r5 (raw file):
b"\xcf\xfa\xed\xfe" => Some(InputType::MachO), _ => { reader.rewind().ok()?;
why is it ok to toss this error, and why is it ok to rewind the reader only here? in other words, the contract of this function is unclear; what will be the state of the reader when it returns?
src/linker.rs line 924 at r5 (raw file):
loop { let bytes_read = reader.read(&mut byte).ok()?;
reading one byte at a time is wildly inefficient
BretasArthur1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 5 files reviewed, 16 unresolved discussions (waiting on @alessandrod and @tamird)
src/linker.rs line 898 at r5 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
why is it ok to toss this error, and why is it ok to rewind the reader only here? in other words, the contract of this function is unclear; what will be the state of the reader when it returns?
You're right, I will refactor and self contain this logic inside of is_llvm_ir since it's for the specific funtion.
src/linker.rs line 924 at r5 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
reading one byte at a time is wildly inefficient
Working on a improvement
src/llvm/mod.rs line 152 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
as @alessandrod said below, you are leaking here
Actually I'm not sure. If I add the unsafe { LLVMDisposeMemoryBuffer(mem_buffer) }; on the success path before returning Ok(linked) the test panic's with a SIGSEGV error, probably meaning that we force the drop of the memory that it will perform the assertion. And since LLVMParseIRInContext takes ownership of the buffer, it should resolve the drop by itself.
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 5 files reviewed, 16 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/llvm/mod.rs line 152 at r2 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
Actually I'm not sure. If I add the
unsafe { LLVMDisposeMemoryBuffer(mem_buffer) };on the success path before returningOk(linked)the test panic's with a SIGSEGV error, probably meaning that we force the drop of the memory that it will perform the assertion. And sinceLLVMParseIRInContexttakes ownership of the buffer, it should resolve the drop by itself.
I think you are right which is utterly crazy because LLVMParseBitCodeInContext2 does not take ownership though LLVMParseBitCodeInContext does!
Please add a comment on the call to LLVMParseIRInContext to explain this madness.
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BretasArthur1 please address all the comments before requesting re-review. There are many outstanding comments.
@tamird reviewed all commit messages.
Reviewable status: 2 of 5 files reviewed, 17 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/linker.rs line 598 at r6 (raw file):
let data = CString::new(data).unwrap(); if !llvm::link_ir_buffer(context, module, &data) .map_err(|_| LinkerError::LinkModuleError(path.to_owned()))?
why throw away the error here?
BretasArthur1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about it @tamird, besides the most recent one, which one I missed?
Reviewable status: 2 of 5 files reviewed, 17 unresolved discussions (waiting on @alessandrod and @tamird)
Previously, tamird (Tamir Duberstein) wrote…
it would be great to include your motivation here
Done.
Cargo.toml line 43 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
could you keep this alphabetical please and use
{ version = ... }for consistency?
Done.
src/linker.rs line 600 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
I think you can just use
CString::new(data)which will internally append the nul byte.
Done.
src/linker.rs line 914 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Done.
src/linker.rs line 923 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
might be easier to read as
[.....].iter().any(|prefix| trimmed.starts_with(prefix))
Done.
src/linker.rs line 885 at r5 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
i think you don't need the parenthesis around Read + Seek
You mean, creating a generic with bounds?
src/linker.rs line 887 at r5 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
please don't use result.ok -- this discards information. why is it ok to throw away this error?
Done.
src/linker.rs line 924 at r5 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
Working on a improvement
Done.
src/llvm/mod.rs line 144 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
newline between functions please
Done.
src/llvm/mod.rs line 148 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
why does this need a c-string? LLVMCreateMemoryBufferWithMemoryRange takes a pointer and a length, i think this can be &[u8]
Done.
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Could you please add a citation? i.e. please link to the relevant LLVM code.
Done.
| ) -> Result<bool, String> { | ||
| let buffer_name = c"ir_buffer"; | ||
| let buffer = buffer.to_bytes(); | ||
| let mem_buffer = unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
tests/ir_file_test.rs
Outdated
|
|
||
| // Corrupting IR content | ||
| let invalid_content = valid_content | ||
| .replace("define", "defXne") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 5 files reviewed, 12 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
Previously, BretasArthur1 (Arthur Bretas) wrote…
Done.
Not done. Please squash to a single commit and include why, not just what.
src/linker.rs line 885 at r5 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
You mean, creating a generic with bounds?
I mean you can write &mut impl Read + Seek, I think?
src/linker.rs line 887 at r5 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
Done.
huh? not done.
src/linker.rs line 924 at r5 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
Done.
Why not use a BufReader here? Seems like you've reinvented it here. Reading a byte at a time from a BufReader is fine because it buffers internally. This function can be much simpler, I think.
src/llvm/mod.rs line 148 at r1 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
Done.
The comment explaining that LLVM has a hard-coded RequiresNullTerminator=1 (along with a citation) should be here on the function to explain the API.
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
Done.
This comment is written pretty strangely and also is anchored in the wrong place - it doesn't explain this 1, it explains why this function takes a CStr.
src/linker.rs line 927 at r6 (raw file):
.rewind() .inspect_err(|e| error!("failed to rewind reader for LLVM IR detection: {}", e)) .ok()?;
same problem here, this is discarding the error. please avoid random error prints
|
idk why, but my reviewable wasn't showing some comments, sorry abt it |
BretasArthur1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 5 files reviewed, 12 unresolved discussions (waiting on @alessandrod and @tamird)
Previously, tamird (Tamir Duberstein) wrote…
Not done. Please squash to a single commit and include why, not just what.
Ok, will be doing before the merge
src/linker.rs line 885 at r5 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
I mean you can write
&mut impl Read + Seek, I think?
we actually do need... without the parenthesis we hit ambiguous + in a type
src/linker.rs line 887 at r5 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
huh? not done.
fixing it
src/linker.rs line 924 at r5 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Why not use a BufReader here? Seems like you've reinvented it here. Reading a byte at a time from a BufReader is fine because it buffers internally. This function can be much simpler, I think.
BufReader haves that internal size limit, I was trying to avoid it. I opt to a approach were we read all the file until we found a non-whitespace character... I can use the read_to_end that may cause some overhead, but I can also use the BufReader here, but maybe it would fail if we just check the 4096 and we have more then that of whitespaces... wdyt?
src/linker.rs line 598 at r6 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
why throw away the error here?
fixing it, I'll create a new error type returning the path and the error message.
src/linker.rs line 927 at r6 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
same problem here, this is discarding the error. please avoid random error prints
I will need to change both signatures to unwind the error... I'll work on it
src/llvm/mod.rs line 148 at r1 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
The comment explaining that LLVM has a hard-coded RequiresNullTerminator=1 (along with a citation) should be here on the function to explain the API.
now, done
src/llvm/mod.rs line 152 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
I think you are right which is utterly crazy because
LLVMParseBitCodeInContext2does not take ownership thoughLLVMParseBitCodeInContextdoes!Please add a comment on the call to
LLVMParseIRInContextto explain this madness.
Done.
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
This comment is written pretty strangely and also is anchored in the wrong place - it doesn't explain this
1, it explains why this function takes a CStr.
Oh ok, addressing the fix
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI is red.
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tamird reviewed all commit messages.
@tamird dismissed @alessandrod from a discussion.
Reviewable status: 2 of 5 files reviewed, 8 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
Previously, BretasArthur1 (Arthur Bretas) wrote…
Ok, will be doing before the merge
Do it now?
src/linker.rs line 924 at r5 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
BufReader haves that internal size limit, I was trying to avoid it. I opt to a approach were we read all the file until we found a non-whitespace character... I can use the read_to_end that may cause some overhead, but I can also use the BufReader here, but maybe it would fail if we just check the 4096 and we have more then that of whitespaces... wdyt?
What internal limit? It's not a limit, it will read more data as needed. Either I've not understood you or you've not understood BufReader.
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
Oh ok, addressing the fix
Please remove this comment and set this to zero. There is absolutely no reason to pass 1 here.
src/linker.rs line 48 at r7 (raw file):
/// Linking an IR module failed. #[error("failure linking IR module `{0}`: {1}")] LinkIrModuleError(PathBuf, String),
This is a failure to parse IR, not to link it.
I am getting the feeling that this work is very rushed. Please, take the necessary time to review your own work.
src/linker.rs line 928 at r7 (raw file):
]; const MAX_PREFIX_LEN: usize = 17; // "target datalayout"
just make the string the constant and use .len()
src/llvm/mod.rs line 149 at r7 (raw file):
/// The buffer must be null-terminated (hence `&CStr`), because LLVM's IR parser /// requires `RequiresNullTerminator=1` when creating the memory buffer. /// See: https://github.com/llvm/llvm-project/blob/bde90624185ea2cead0a8d7231536e2625d78798/llvm/lib/Support/MemoryBuffer.cpp#L48
This is irrelevant? The only thing that matters is the link below to the header which has a default parameter = true and the actual call to that function which doesn't pass that parameter (you need to find that call and cite it here)
Code quote:
/// requires `RequiresNullTerminator=1` when creating the memory buffer.
/// See: https://github.com/llvm/llvm-project/blob/bde90624185ea2cead0a8d7231536e2625d78798/llvm/lib/Support/MemoryBuffer.cpp#L48|
Previously, tamird (Tamir Duberstein) wrote…
Ok, you're right. Let me took the much as needed to review this, sorry about the back and forth |
|
Previously, tamird (Tamir Duberstein) wrote…
I'll use it. but what I was talking don't apply here since we already have the buffer and can simply use BufReader to look into it |
BretasArthur1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 5 files reviewed, 8 unresolved discussions (waiting on @alessandrod and @tamird)
src/linker.rs line 928 at r7 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
just make the string the constant and use .len()
I'll remove this logic
src/llvm/mod.rs line 157 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
Please remove this comment and set this to zero. There is absolutely no reason to pass 1 here.
Done.
src/llvm/mod.rs line 149 at r7 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
This is irrelevant? The only thing that matters is the link below to the header which has a default parameter = true and the actual call to that function which doesn't pass that parameter (you need to find that call and cite it here)
Done.
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tamird reviewed all commit messages.
Reviewable status: 2 of 5 files reviewed, 9 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/linker.rs line 924 at r5 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
I'll use it. but what I was talking don't apply here since we already have the buffer and can simply use BufReader to look into it
I sent #329 to just make this whole thing simpler.
-- commits line 5 at r8:
s/it/their/
s/a extra/an extra/
s/on`/through `/
-- commits line 7 at r8:
s/Rust/rustc/
-- commits line 17 at r8:
drop this, it's just repeating code and useless without looking at the code.
Code quote:
Changes:
- Add InputType::Ir variant for .ll files
- Implement is_llvm_ir() to detect IR files by common prefixes
(; ModuleID, source_filename, target datalayout, etc.)
- Add link_ir_buffer() to parse and link IR modules using
LLVMParseIRInContext
- Add IRParseError for better error handling when IR parsing fails
- Update detect_input_type() to handle IR detection as fallback
BretasArthur1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 5 files reviewed, 9 unresolved discussions (waiting on @alessandrod and @tamird)
Previously, tamird (Tamir Duberstein) wrote…
drop this, it's just repeating code and useless without looking at the code.
Done with the fixes on the PR message
src/linker.rs line 924 at r5 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
I sent #329 to just make this whole thing simpler.
Thanks, I'll take a look and leave my review, once it's merged I can rebase and use it here.
|
Should we wait #329 to resolve this one ? cc @tamird @alessandrod |
|
Please rebase and fix broken CI. Now that we are reading the entire input into memory you shouldn't need to mess with buffered readers and do manual seeking. |
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tamird reviewed 1 of 2 files at r8, 2 of 3 files at r10, 1 of 1 files at r11, all commit messages.
Reviewable status: 4 of 5 files reviewed, 8 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
tests/ir_file_test.rs line 1 at r11 (raw file):
#![expect(unused_crate_dependencies, reason = "used in lib/bin")]
i know we mostly have integration tests, but the linker is now a library -- can this be a unit test?
src/llvm/mod.rs line 152 at r11 (raw file):
/// https://github.com/llvm/llvm-project/blob/bde90624185ea2cead0a8d7231536e2625d78798/llvm/include/llvm/Support/MemoryBuffer.h#L134 /// Called by `LLVMParseIRInContext` via `parseIR`: /// https://github.com/llvm/llvm-project/blob/bde90624185ea2cead0a8d7231536e2625d78798/llvm/lib/IRReader/IRReader.cpp#L122
what is this link to? where's the getMemBuffer call?
src/llvm/mod.rs line 155 at r11 (raw file):
/// /// Without the null terminator, LLVM hits an assertion in debug builds: /// https://github.com/llvm/llvm-project/blob/bde90624185ea2cead0a8d7231536e2625d78798/llvm/include/llvm/Support/MemoryBuffer.h#L138
wrong link? this is the same link as the first one
src/linker.rs line 545 at r11 (raw file):
module: &mut LLVMModule<'ctx>, path: &Path, data: &[u8],
can you just pass &mut Vec<u8> here? in the InputType::Ir arm below you can then write
let data = CString::new(mem::take(data)).expect();
...
and you can get rid of the unreachable while you're at it
BretasArthur1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 5 files reviewed, 8 unresolved discussions (waiting on @alessandrod and @tamird)
Previously, tamird (Tamir Duberstein) wrote…
s/it/their/
s/a extra/an extra/
s/on`/through `/
Done.
Previously, tamird (Tamir Duberstein) wrote…
s/Rust/rustc/
Done.
src/linker.rs line 545 at r11 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
can you just pass
&mut Vec<u8>here? in theInputType::Irarm below you can then writelet data = CString::new(mem::take(data)).expect(); ...and you can get rid of the
unreachablewhile you're at it
Ok
src/llvm/mod.rs line 152 at r11 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
what is this link to? where's the getMemBuffer call?
updated
src/llvm/mod.rs line 155 at r11 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
wrong link? this is the same link as the first one
deleted
|
I'm not sure why CI is red, I'll investigate tmrw |
This enables bpf-linker to recieve LLVM IR text files directly, to emit their respective bitcode without needing an extra pass through llvm-as to convert them to bitcode first. This is useful for linking IR files generated by any language frontend (not just rustc), making the linker more flexible for mixed-language BPF projects.
|
Ok, now it's good to review @tamird |
tamird
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tamird reviewed 1 of 3 files at r12, all commit messages.
Reviewable status: 3 of 5 files reviewed, 5 unresolved discussions (waiting on @alessandrod and @BretasArthur1)
src/llvm/mod.rs line 152 at r11 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
updated
OK so there are deeper reasons here than this -- it turns out that LLVM parsers rely on null termination for a performance optimization. The type safety of these interfaces is utter dog shit. Can you please document this fact that took me a day to chase down?
src/linker.rs line 524 at r13 (raw file):
ty => { info!("linking file {} type {}", path.display(), ty); match link_data(context, &mut module, &path, &mut input.to_vec(), ty) {
ah, we don't have a vector in hand in the LinkerInput::Buffer case. Instead of this, please push back on my suggestion. In this case it seems that we need to use a Cow<_, [T]>
|
Previously, tamird (Tamir Duberstein) wrote…
yep, I had to force the vec |
|
Previously, tamird (Tamir Duberstein) wrote…
wow, ok, sure |
BretasArthur1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 5 files reviewed, 5 unresolved discussions (waiting on @alessandrod and @tamird)
src/llvm/mod.rs line 152 at r11 (raw file):
Previously, BretasArthur1 (Arthur Bretas) wrote…
wow, ok, sure
see if this is descriptive enough
|
hey @tamird @alessandrod any other additions/requests here? |


Context
Recently I was testing generating IR with different languages and using LLVM to generate the bitcode, then using sbpf-linker to generate a .so file
Problem
In the process we need this extra step to generate the .bc with LLVM. I started a thread on X about adding .ll files as parameters for the sbpf-linker and then and alessandro mention the possibility of adding it for bpf-linker and here we are!
This change is