# Struct Layout

C++ (in the Itanium ABI) extends the C layout rules, so `#[repr(C)]` alone isn't
enough. This page documents the tweaks made to Rust structs to give them the
same layout as the corresponding C++ structs.

In particular:

* C++ classes and Rust structs must have the same alignment, so that
  references can be exchanged without violating the alignment rules. This is
  usually ensured by the regular `#[repr(C)]` layout algorithm, but sometimes
  the interop tool needs to generate explicit `#[repr(align(n))]` annotations.
* C++ classes and Rust structs must have the same size, so that arrays of
  objects can be exchanged.
* Public subobjects must have the same offsets in the C++ and Rust versions of
  the structs.
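
The three requirements above can be checked mechanically on the Rust side. The following is a minimal sketch, not the interop tool's actual output: `Point`, `offset_of_field`, and `point_layout` are hypothetical names, and the mirrored C++ struct `struct Point { int32_t x; int8_t tag; };` is assumed purely for illustration.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical Rust mirror of the C++ struct
/// `struct Point { int32_t x; int8_t tag; };` (not from any real binding).
#[repr(C)]
pub struct Point {
    pub x: i32,
    pub tag: i8,
}

/// Byte offset of `field` from the start of `base`.
fn offset_of_field<T, F>(base: &T, field: &F) -> usize {
    field as *const F as usize - base as *const T as usize
}

/// Returns `(size, alignment, offset of x, offset of tag)` for `Point`.
pub fn point_layout() -> (usize, usize, usize, usize) {
    let p = Point { x: 0, tag: 0 };
    (
        size_of::<Point>(),
        align_of::<Point>(),
        offset_of_field(&p, &p.x),
        offset_of_field(&p, &p.tag),
    )
}
```

On mainstream targets, where `i32` is 4-byte aligned, `point_layout()` yields a size of 8, an alignment of 4, and offsets 0 and 4. These are the values the C++ side would have to report through `sizeof`, `alignof`, and `offsetof` for references and arrays to be exchangeable.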

## Empty Structs

In C++, an empty struct or class (e.g. `struct Empty{};`) has size `1`, while in
Rust, it has size `0`. To make the layouts match, bindings for empty structs
have a private `MaybeUninit<u8>` field.

(In C++, different array elements are guaranteed to have different addresses,
and arrays are guaranteed to be contiguous. Therefore, no object in C++
can have size `0`. Rust, like C++, has only contiguous arrays, but unlike C++,
Rust does not guarantee that distinct elements have distinct addresses.)
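
As a rough illustration of both points, the sketch below compares a plain Rust empty struct with a padded one, and inspects element addresses in an array of zero-sized values. The names `RustEmpty`, `Empty`, `_unused`, `empty_sizes`, and `zst_elements_share_address` are assumptions made for this example, not names used by the bindings generator.

```rust
use std::mem::{size_of, MaybeUninit};

/// A plain Rust empty struct: zero-sized.
pub struct RustEmpty;

/// Hypothetical binding for the C++ `struct Empty{};`.
/// The private one-byte field pads the struct to the C++ size.
#[repr(C)]
pub struct Empty {
    _unused: MaybeUninit<u8>,
}

/// Returns the sizes of `RustEmpty` and `Empty`, in that order.
pub fn empty_sizes() -> (usize, usize) {
    (size_of::<RustEmpty>(), size_of::<Empty>())
}

/// Reports whether two elements of an array of zero-sized values
/// share an address (the array stride is zero bytes).
pub fn zst_elements_share_address() -> bool {
    let arr = [RustEmpty, RustEmpty];
    std::ptr::eq(&arr[0], &arr[1])
}
```

Here `empty_sizes()` returns `(0, 1)`, and the two array elements compare equal by address, which is exactly the situation C++ rules out.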

## Potentially-overlapping objects

In C++, in some circumstances, the requirement that objects do not overlap is
relaxed: base classes and `[[no_unique_address]]` member variables can have
subsequent objects live inside of their tail padding. The most famous instance
of this is the
[empty base class optimization (EBCO)](https://en.cppreference.com/w/cpp/language/ebo):
a base class with no data members is permitted to take up zero space inside of
derived classes.

NOTE: This has other, non-layout consequences for Rust: for example, it is not
safe to obtain two `&mut` references to overlapping objects, unless they are of
size `0`. (To prevent this, classes that might be base classes are always
[`!Unpin`](unpin).)

This is impossible to represent in a C-like struct. (Indeed, it's impossible to
represent even in a C++-like struct, before the introduction of
`[[no_unique_address]]`.) Therefore, in Rust, we don't even try:
potentially-overlapping subobjects are replaced in the Rust layout by a
`[MaybeUninit<u8>; N]` field, where `N` is large enough to ensure that the next
subobject starts at the correct offset. The alignment of the struct is still
changed so that it matches the C++ alignment, but via `#[repr(align(n))]`
instead of by aligning the field.

### Example

For example, consider these two C++ classes:

```c++
#include <cstdint>

// This is a class, instead of a struct, to ensure that it is not POD for the
// purpose of layout. (The Itanium ABI disables the overlapping subobject
// optimization for POD types.)
class A {
  int16_t x_;
  int8_t y_;
};

struct B final : A {
  int8_t z;
};
```

In memory, this may be laid out like so:

```
| x_ | x_ | y_ | z |
 <------------> <->
  A subobject  | B
<------------------>
     sizeof(A)
  (also sizeof(B))
```

The correct representation for `B`, in Rust, is something like this:

```rs
use std::mem::MaybeUninit;

#[repr(C)]
#[repr(align(2))] // Match the alignment of the int16_t variable.
struct B {
    // We don't use a field of type `A`, because it would have a size of 4,
    // and Rust wouldn't permit `z` to live inside of it.
    // Nor do we align the array, for the same reason -- correct alignment must
    // be achieved via the repr(align(2)) at the top.
    __base_class_subobjects: [MaybeUninit<u8>; 3],
    pub z: i8,
}
```
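
To see why the opaque byte array is needed, the sketch below compares the layout above with a naive translation that embeds `A` as a field. `NaiveB`, `b_layout`, and `naive_b_size` are hypothetical names introduced only for this comparison, and the figures assume the C++ layout shown in the diagram above (`sizeof(B) == 4`, with `z` at offset 3).

```rust
use std::mem::{align_of, size_of, MaybeUninit};

/// Rust mirror of `A` with ordinary fields: size 4, because of tail padding.
#[allow(dead_code)]
#[repr(C)]
pub struct A {
    x_: i16,
    y_: i8,
}

/// Naive translation of `B`: `z` cannot reuse the tail padding of `base`.
#[allow(dead_code)]
#[repr(C)]
pub struct NaiveB {
    base: A,
    pub z: i8,
}

/// Translation using opaque bytes for the base class subobject.
#[repr(C)]
#[repr(align(2))]
pub struct B {
    __base_class_subobjects: [MaybeUninit<u8>; 3],
    pub z: i8,
}

/// Returns `(size, alignment, offset of z)` for `B`.
pub fn b_layout() -> (usize, usize, usize) {
    let b = B {
        __base_class_subobjects: [MaybeUninit::uninit(); 3],
        z: 0,
    };
    let z_offset = &b.z as *const i8 as usize - &b as *const B as usize;
    (size_of::<B>(), align_of::<B>(), z_offset)
}

/// Returns the size of the naive translation.
pub fn naive_b_size() -> usize {
    size_of::<NaiveB>()
}
```

With this sketch, `b_layout()` gives a size of 4, an alignment of 2, and `z` at offset 3, matching the diagram. `naive_b_size()` gives 6, because `z` is placed after the 4-byte `A` and the struct is then rounded up to its 2-byte alignment.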