From #428 (comment) and DCL38; posted on request by @felix91gr.
Background: C supports “flexible array members”, which indicate that a struct has an unknown number of array elements as its last field:
```c
struct FlexStruct {
    int fixed_data;
    int flexible_data[];
};
```
In this case, the C programmer is obligated to write a program that keeps track, in whatever way it sees fit, of what the “actual” length of the array member is (commonly via storing the length in another field). Rust has “custom dynamically sized types (DSTs)” which are superficially similar and can be used for similar purposes,
```rust
struct DstStruct {
    fixed_data: i32,
    flexible_data: [i32],
}
```
but they are significantly different: any kind of pointer to DstStruct must know the length of the trailing slice field flexible_data, and that length is stored as part of the pointer value (a “fat pointer” or “wide pointer”). Therefore, Rust DSTs cannot currently be used in all of the situations that C FAMs can. Programmers who are porting a C program to Rust, or who are writing FFI bindings to C code, may be tempted to solve the problem by declaring a Rust struct whose dynamic size is not known to the compiler:
```rust
#[repr(C)]
struct FauxDstStruct {
    fixed_data: i32,
    flexible_data: [i32; 0],
}
```
It is possible to use this struct (whose nominal size is the size of the fixed fields plus padding) in some cases. However, one key question is still unresolved in Rust semantics: if you create a reference of type &FauxDstStruct or &mut FauxDstStruct, is it permissible to read or write any bytes of flexible_data through it? Currently, no such permission has been granted, and some models of Rust semantics reject it as UB. Code that is intended to be correct, and to be compiled with current and future Rust compilers, must therefore avoid doing this today.
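The difference between the two declarations can be checked with a short sketch. It assumes the usual representation on current compilers, where a reference to a slice-tailed DST occupies two machine words (address plus length) and a reference to a sized struct occupies one:

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct DstStruct {
    fixed_data: i32,
    flexible_data: [i32],
}

#[allow(dead_code)]
#[repr(C)]
struct FauxDstStruct {
    fixed_data: i32,
    flexible_data: [i32; 0],
}

fn main() {
    // A reference to the custom DST is a wide pointer: address plus slice length.
    assert_eq!(size_of::<&DstStruct>(), 2 * size_of::<usize>());

    // A reference to the zero-length-array struct is a thin pointer and carries
    // no length at all.
    assert_eq!(size_of::<&FauxDstStruct>(), size_of::<usize>());

    // The nominal size covers only the fixed field; the zero-length array adds
    // nothing, so the type says nothing about any trailing elements.
    assert_eq!(size_of::<FauxDstStruct>(), size_of::<i32>());
}
```

Because the type of &FauxDstStruct only describes those nominal bytes, any access past them through the reference falls into the unresolved area described above.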
For the particular audience of experienced C programmers introducing Rust into their development, it may therefore be worth having a guideline recommending that code which unsafely creates, or accesses through FFI, a variable-length structure in Rust must employ one of the following techniques:
- Use a Rust dynamically sized struct, with the correct length. (This requires that the length not change and that wide pointers are suitable for the application.)
- Or, refrain from referring to the struct using Rust reference types; use raw pointers or user-defined wrappers around raw pointers. Raw pointer types impose no assumptions about how much memory following their address is valid to access.
- Or, use “extern types” (types which have unknown, rather than dynamically known, size). This is an unstable feature and thus may reasonably be considered unacceptable to use, but the “sized hierarchy” work currently in progress will hopefully lead to stable extern types in the near future.
Following these rules will avoid the creation of programs which might be determined to have UB by Rust’s future more-nailed-down semantics (or might even have UB today).
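As an illustration of the second technique, here is a minimal sketch of a wrapper around a raw pointer. The names FlexBox, layout, new, fixed_data and get are hypothetical and chosen only for this example; the sketch also assumes the length is tracked in the wrapper rather than inside the struct. No &FauxDstStruct or &mut FauxDstStruct is ever created, so every access to the trailing elements goes through raw pointers derived from the original allocation:

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr::{addr_of, addr_of_mut, NonNull};

#[repr(C)]
struct FauxDstStruct {
    fixed_data: i32,
    flexible_data: [i32; 0],
}

/// Hypothetical owning wrapper: a raw pointer to the header plus the
/// number of trailing elements.
struct FlexBox {
    ptr: NonNull<FauxDstStruct>,
    len: usize,
}

impl FlexBox {
    /// Layout of the header followed by `len` trailing i32 values.
    fn layout(len: usize) -> Layout {
        let tail = Layout::array::<i32>(len).expect("length overflow");
        let (layout, _offset) = Layout::new::<FauxDstStruct>()
            .extend(tail)
            .expect("layout overflow");
        layout.pad_to_align()
    }

    fn new(fixed_data: i32, items: &[i32]) -> Self {
        let layout = Self::layout(items.len());
        // SAFETY: the layout is never zero-sized because of the i32 header.
        let raw = unsafe { alloc(layout) } as *mut FauxDstStruct;
        let ptr = match NonNull::new(raw) {
            Some(p) => p,
            None => handle_alloc_error(layout),
        };
        // SAFETY: all writes stay inside the allocation made above, and only
        // raw pointers are used, so no reference limits the accessible range.
        unsafe {
            addr_of_mut!((*raw).fixed_data).write(fixed_data);
            let data = addr_of_mut!((*raw).flexible_data) as *mut i32;
            for (i, &v) in items.iter().enumerate() {
                data.add(i).write(v);
            }
        }
        FlexBox { ptr, len: items.len() }
    }

    fn fixed_data(&self) -> i32 {
        // SAFETY: the header was initialized in `new`.
        unsafe { addr_of!((*self.ptr.as_ptr()).fixed_data).read() }
    }

    fn get(&self, i: usize) -> Option<i32> {
        if i >= self.len {
            return None;
        }
        // SAFETY: i < len, and all `len` elements were initialized in `new`.
        unsafe {
            let data = addr_of!((*self.ptr.as_ptr()).flexible_data) as *const i32;
            Some(data.add(i).read())
        }
    }
}

impl Drop for FlexBox {
    fn drop(&mut self) {
        // SAFETY: same pointer and layout as used for the allocation.
        unsafe { dealloc(self.ptr.as_ptr() as *mut u8, Self::layout(self.len)) }
    }
}

fn main() {
    let b = FlexBox::new(7, &[10, 20, 30]);
    assert_eq!(b.fixed_data(), 7);
    assert_eq!(b.get(2), Some(30));
    assert_eq!(b.get(3), None);
}
```

The same pattern should apply when the pointer comes from C through FFI instead of from a Rust allocation: keep it as a raw pointer (or inside such a wrapper) and project to the fields with addr_of! and addr_of_mut! rather than converting it to a reference.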
(Note that none of these considerations are relevant if unsafe code is not used. Without unsafe code, it is not possible to perform the possibly invalid access, though custom DSTs become far more difficult to create.)
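For completeness, one limited way to obtain a custom DST in safe code is an unsizing coercion from a sized, generic variant of the struct. The sketch below uses a hypothetical GenericDst type that is not part of the declarations above, and it only works when the length is known at compile time at the point of creation, which is one reason safe construction is so restrictive:

```rust
// Hypothetical generic variant of DstStruct; the last field may be unsized.
struct GenericDst<T: ?Sized> {
    fixed_data: i32,
    flexible_data: T,
}

// Accepts the dynamically sized form; the length travels in the wide pointer.
fn sum(s: &GenericDst<[i32]>) -> i32 {
    s.fixed_data + s.flexible_data.iter().sum::<i32>()
}

fn main() {
    // Start from a fully sized value with an array of known length...
    let sized = GenericDst { fixed_data: 1, flexible_data: [10, 20, 30] };
    // ...and let the compiler coerce the reference to the slice-tailed form.
    let unsized_ref: &GenericDst<[i32]> = &sized;
    assert_eq!(unsized_ref.flexible_data.len(), 3);
    assert_eq!(sum(unsized_ref), 61);

    // The same coercion applies to Box.
    let boxed: Box<GenericDst<[i32]>> =
        Box::new(GenericDst { fixed_data: 2, flexible_data: [5; 4] });
    assert_eq!(sum(&boxed), 22);
}
```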