# Struct Layout

C++ (in the Itanium ABI) extends the C layout rules, and so `repr(C)` isn't
enough. This page documents the tweaks to Rust structs to give them the same
layout as C++ structs.

In particular:

* C++ classes and Rust structs must have the same alignment, so that
  references can be exchanged without violating the alignment rules. This is
  usually ensured by the regular `#[repr(C)]` layout algorithm, but sometimes
  the interop tool needs to generate explicit `#[repr(align(n))]` annotations.
* C++ classes and Rust structs must have the same size, so that arrays of
  objects can be exchanged.
* Public subobjects must have the same offsets in C++ and Rust versions of the
  structs.
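The three requirements above can be checked mechanically on the Rust side. The following is a minimal sketch, not the interop tool's actual output: `Vec4` is a hypothetical binding for an assumed C++ type `struct alignas(16) Vec4 { float x, y, z, w; };`, chosen because `#[repr(C)]` alone would only give it alignment 4. The expected numbers are what a typical Itanium-ABI C++ compiler would report for that assumed type.

```rust
use std::mem::{align_of, size_of};

// Hypothetical binding for the assumed C++ type
// `struct alignas(16) Vec4 { float x, y, z, w; };`.
// Without `repr(align(16))`, `repr(C)` would give this struct alignment 4.
#[repr(C)]
#[repr(align(16))]
pub struct Vec4 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
    pub w: f32,
}

// Compile-time checks for the size and alignment requirements.
const _: () = assert!(size_of::<Vec4>() == 16);
const _: () = assert!(align_of::<Vec4>() == 16);

/// Returns the byte offset of the public field `w` inside `Vec4`.
pub fn offset_of_w() -> usize {
    let v = Vec4 { x: 0.0, y: 0.0, z: 0.0, w: 0.0 };
    let base = &v as *const Vec4 as usize;
    let field = std::ptr::addr_of!(v.w) as usize;
    field - base
}
```

If any of these values differed from the C++ side, references, arrays, or field accesses exchanged across the language boundary would read the wrong bytes.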

## Non-field data

Rust bindings introduce a `__non_field_data: [MaybeUninit<u8>; N]` field to
cover data within the object that is not part of individual fields. This
includes:

* Base classes.
* VTable pointers.
* Empty struct padding.
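As an illustration of the vtable pointer case, here is a hedged sketch of what such a binding could look like. `Shape` is a hypothetical class, assumed to be declared in C++ as `class Shape { public: virtual ~Shape(); int32_t sides; };` and compiled for a 64-bit Itanium-ABI target, where the vtable pointer occupies the first 8 bytes. The exact code the bindings generator emits may differ.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

// Hypothetical binding for the assumed C++ class
// `class Shape { public: virtual ~Shape(); int32_t sides; };`
// on a 64-bit target. The 8 bytes of non-field data stand in for the
// vtable pointer, which Rust code must never interpret directly.
#[repr(C)]
#[repr(align(8))] // the vtable pointer gives the C++ class alignment 8
pub struct Shape {
    __non_field_data: [MaybeUninit<u8>; 8],
    pub sides: i32,
}

// 8 bytes of non-field data + 4 bytes of `sides`, rounded up to alignment 8.
const _: () = assert!(size_of::<Shape>() == 16);
const _: () = assert!(align_of::<Shape>() == 8);

/// Returns the byte offset of the public field `sides` inside `Shape`.
pub fn sides_offset() -> usize {
    let s = Shape { __non_field_data: [MaybeUninit::uninit(); 8], sides: 0 };
    let base = &s as *const Shape as usize;
    let field = std::ptr::addr_of!(s.sides) as usize;
    field - base
}
```

Because the byte array has alignment 1, the struct's alignment has to come from the explicit `#[repr(align(8))]` attribute.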

### Empty Structs

One notable special case of this is empty struct padding. In C++, an empty
struct or class (e.g. `struct Empty{};`) has size `1`, while in Rust, it has
size `0`. To make the layout match up, bindings for empty structs always
enforce that the struct has a size of at least 1, via `__non_field_data`.

(In C++, different array elements are guaranteed to have different addresses,
and also, arrays are guaranteed to be contiguous. Therefore, no object in C++
can have size `0`. Rust, like C++, has only contiguous arrays, but unlike C++
Rust does not guarantee that distinct elements have distinct addresses.)
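The size difference can be observed directly in Rust. This is a small sketch under stated assumptions: `RustEmpty` is an ordinary Rust unit struct, `Empty` is a hypothetical binding for the C++ `struct Empty{};`, and `stride` is a helper introduced here only to measure the distance between adjacent array elements.

```rust
use std::mem::{size_of, MaybeUninit};

// An ordinary Rust struct with no fields is zero-sized.
pub struct RustEmpty;

// Hypothetical binding for C++ `struct Empty{};`: one byte of non-field
// data forces the size up to 1 so that it matches the C++ size.
#[repr(C)]
pub struct Empty {
    __non_field_data: [MaybeUninit<u8>; 1],
}

impl Empty {
    pub fn new() -> Self {
        Empty { __non_field_data: [MaybeUninit::uninit(); 1] }
    }
}

const _: () = assert!(size_of::<RustEmpty>() == 0);
const _: () = assert!(size_of::<Empty>() == 1);

/// Distance in bytes between the two elements of a two-element array.
pub fn stride<T>(arr: &[T; 2]) -> usize {
    let first = &arr[0] as *const T as usize;
    let second = &arr[1] as *const T as usize;
    second - first
}
```

With these definitions, `stride(&[RustEmpty, RustEmpty])` is 0 because both elements share an address, while `stride(&[Empty::new(), Empty::new()])` is 1, matching the C++ guarantee that distinct array elements have distinct addresses.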

## Potentially-overlapping objects

In C++, in some circumstances, the requirement that objects do not overlap is
relaxed: base classes and `[[no_unique_address]]` member variables can have
subsequent objects live inside of their tail padding. The most famous instance
of this is the
[empty base class optimization (EBCO)](https://en.cppreference.com/w/cpp/language/ebo):
a base class with no data members is permitted to take up zero space inside of
derived classes.

NOTE: This has other, non-layout consequences for Rust: for example, it is not
safe to obtain two `&mut` references to overlapping objects, unless they are of
size `0`. (To prevent this, classes that might be base classes are always
[`!Unpin`](unpin.md).)

Overlapping subobjects are impossible to represent in a C-like struct. (Indeed,
they are impossible to represent even in a C++-like struct, before the
introduction of `[[no_unique_address]]`.) Therefore, in Rust, we don't even try:
potentially-overlapping subobjects are replaced in the Rust layout by a
`[MaybeUninit<u8>; N]` field, where `N` is large enough to ensure that the next
subobject starts at the correct offset. The alignment of the struct is still
changed so that it matches the C++ alignment, but via `#[repr(align(n))]`
instead of by aligning the field.

### Example

For example, consider these two C++ classes:

```c++
#include <cstdint>

// This is a class, instead of a struct, to ensure that it is not POD for the
// purpose of layout. (The Itanium ABI disables the overlapping subobject
// optimization for POD types.)
class A {
  int16_t x_;
  int8_t y_;
};

struct B final : A {
  int8_t z;
};
```

In memory, this may be laid out like so:

```
| x_ | x_ | y_ | z |
 <------------> <->
   A subobject | B
<------------------>
     sizeof(A)
  (also sizeof(B))
```

The correct representation for `B`, in Rust, is something like this:

```rs
use std::mem::MaybeUninit;

#[repr(C)]
#[repr(align(2))] // match the alignment of the int16_t variable.
struct B {
    // We don't use a field of type `A`, because it would have a size of 4,
    // and Rust wouldn't permit `z` to live inside of it.
    // Nor do we align the array, for the same reason -- correct alignment must be
    // achieved via the repr(align(2)) at the top.
    __non_field_data: [MaybeUninit<u8>; 3],
    pub z: i8,
}
```
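To see why the byte array is needed, the layout above can be compared against a naive translation. The sketch below is illustrative only: `NaiveB` is a hypothetical struct that embeds `A` as an ordinary field, and the two offset helpers are introduced here just to measure where `z` ends up.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

// Rust-side stand-in for the C++ class `A` (3 bytes of data, size 4).
#[repr(C)]
pub struct A {
    x_: i16,
    y_: i8,
}

// The layout described above: `A` is replaced by 3 bytes of non-field data.
#[repr(C)]
#[repr(align(2))]
pub struct B {
    __non_field_data: [MaybeUninit<u8>; 3],
    pub z: i8,
}

// Hypothetical naive translation that embeds `A` as a regular field.
#[repr(C)]
pub struct NaiveB {
    base: A,
    pub z: i8,
}

const _: () = assert!(size_of::<A>() == 4);
const _: () = assert!(size_of::<B>() == 4 && align_of::<B>() == 2);
const _: () = assert!(size_of::<NaiveB>() == 6); // does not match C++

/// Byte offset of `z` in the correct layout.
pub fn z_offset() -> usize {
    let b = B { __non_field_data: [MaybeUninit::uninit(); 3], z: 0 };
    let base = &b as *const B as usize;
    (std::ptr::addr_of!(b.z) as usize) - base
}

/// Byte offset of `z` in the naive layout.
pub fn naive_z_offset() -> usize {
    let b = NaiveB { base: A { x_: 0, y_: 0 }, z: 0 };
    let base = &b as *const NaiveB as usize;
    (std::ptr::addr_of!(b.z) as usize) - base
}
```

In the correct layout `z` sits at offset 3, inside what would be the tail padding of `A`. In the naive layout it is pushed to offset 4 and the struct grows to 6 bytes, so neither the size nor the field offset would match the C++ class.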