# [Pre-RFC] Allow stride != size

Devin Jeanpierre, 2022-11-22

## Summary

Rust should allow for values to be placed at the next aligned position after the
previous value, ignoring the tail padding of that previous value. This requires
changing the meaning of "size", so that a value's size in memory (for the
purpose of reference semantics and layout) is not definitionally the same as the
distance between consecutive values of that type (its "stride").

## Motivation

Some other languages (C++ and Swift, in particular) can lay out values more
compactly than Rust conventionally can. This gives them better performance with
less effort, and it makes interoperating with them from Rust less than ideal.

### Optimization opportunity

Consider the difference between `(u16, u8, u8)` and `((u16, u8), u8)`. The first
can fit in 4 bytes, while the second requires 6. A `(u16, u8)` is a 4 byte value
with 1 byte of tail padding. And a `(T, u8)` can't just stuff the `u8` inside
the tail padding for `T`! If, instead, we declared that `(u16, u8)` were a **3**
byte value with alignment 2, then `((u16, u8), u8)` could be 4 bytes instead
of 6. This is not possible today.

(For backwards compatibility reasons described later, we can't literally do this
for tuples, but only for user-defined types. But this gives the gist of the
optimization opportunity this proposal supports.)
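
To make the numbers concrete, here is a minimal sketch of what Rust does today. It uses `#[repr(C)]` tuple structs instead of plain tuples, because the layout of plain tuples is not guaranteed. The names `Flat`, `Inner`, `Nested`, and `size_and_align` are made up for this illustration, and the sizes assume a target where `u16` has alignment 2.

```rust
use std::mem::{align_of, size_of};

// Stand-in for `(u16, u8, u8)`.
#[repr(C)]
pub struct Flat(pub u16, pub u8, pub u8);

// Stand-in for `(u16, u8)`: 3 bytes of data plus 1 byte of tail padding.
#[repr(C)]
pub struct Inner(pub u16, pub u8);

// Stand-in for `((u16, u8), u8)`: the trailing `u8` cannot reuse the
// tail padding of `Inner`, so it starts at offset 4.
#[repr(C)]
pub struct Nested(pub Inner, pub u8);

/// Returns `(size, alignment)` of `T` as computed by today's Rust.
pub fn size_and_align<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}
```

On such a target, `size_and_align::<Flat>()` and `size_and_align::<Inner>()` both give `(4, 2)`, while `size_and_align::<Nested>()` gives `(6, 2)`.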

By inventing the concept of a "data size", which doesn't need to be a multiple
of the alignment, we can allow fields in specially-designed types to be packed
closer together than they would be today, saving space. This is similar to the
performance benefits of `#[repr(packed)]`, but safer: all values would still be
correctly aligned, just placed more closely together.

This optimization has already been implemented in other programming languages.
Swift applies this to every type and every field: a type's size excludes tail
padding, and a neighboring value can be laid out immediately next to it when
stored in the same type, with no padding between the two. In C++, the
optimization automatically applies to base classes
(["EBO"](https://en.cppreference.com/w/cpp/language/ebo), the Empty Base
Optimization), and is opt-in on fields via the
[`[[no_unique_address]]`](https://en.cppreference.com/w/cpp/language/attributes/no_unique_address)
attribute.

For instance, here are examples in [Swift](https://godbolt.org/z/G74ejjsvc) and
in [C++](https://godbolt.org/z/4esbYrv39). These types are compact! Rust does
not work like this today.

### Interoperability with C++ and Swift

(Note that the author works on C++ interop; Swift is mentioned for
completeness.)

In fact, exactly because this optimization is already implemented in other
languages, those languages are theoretically not as compatible with Rust as they
are with each other. In C++ and Swift, writing to a pointer or reference does
not write to neighboring fields. But if that pointer or reference were passed to
Rust, and you used any Rust facility to write to it -- whether plain assignment
or `ptr::write` -- Rust could overwrite that neighboring field. Because the use
of this optimization is pervasive in both Swift and C++, interoperating with
these languages is difficult to do safely.

Concretely, consider the following C++ struct:

```c++
struct MyStruct {
  [[no_unique_address]] T1 x;
  [[no_unique_address]] T2 y;
  ...
};
```

Which is equivalent to this Swift struct:

```swift
struct MyStruct {
    let x: T1
    let y: T2
    ...
}
```

If you are working with cross-language interop, and obtain in Rust a `&mut T1`
which refers to `x`, and a `&mut T2` which refers to `y`, it may be immediately
UB, because these references can overlap in Rust: `y` may be located inside what
Rust would consider the tail padding of the `T1` reference.

For the same reason, even if you avoid aliasing, if you obtain a `&mut T1` for
`x`, and then write to it, it may partially overwrite `y` with garbage data,
causing unexpected or undefined behavior down the line.
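
The overlap can be checked with simple byte-range arithmetic. The sketch below assumes, purely for illustration, that `T1` is a `u16` followed by a `u8`, that `T2` is a single byte, and that the C++ compiler placed `y` at offset 3, inside the tail padding of `x`. The helpers `rust_ref_range` and `overlaps` are hypothetical.

```rust
use std::mem::size_of;
use std::ops::Range;

/// Assumed shape of `T1`: data in bytes 0..3, and (in today's Rust) one byte
/// of tail padding at byte 3.
#[repr(C)]
pub struct T1 {
    pub a: u16,
    pub b: u8,
}

/// Byte range covered today by a Rust reference to a `T` located at `offset`:
/// the full `size_of::<T>()`, tail padding included.
pub fn rust_ref_range<T>(offset: usize) -> Range<usize> {
    offset..offset + size_of::<T>()
}

/// True if the two half-open byte ranges share at least one byte.
pub fn overlaps(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}
```

With `x` at offset 0 and `y` at offset 3, `rust_ref_range::<T1>(0)` is `0..4` and `rust_ref_range::<u8>(3)` is `3..4`, so the two `&mut` references would alias byte 3.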

This also cannot be avoided by forbidding the use of `MyStruct`: even if you do
not directly use it from Rust, from the point of view of Swift and C++, it is
just a normal struct, and Swift and C++ codebases can freely pass around
references and pointers to its interior. Someone passing a reference to a `T1`
may have no idea whether it came from `MyStruct` (unsafe to pass to Rust) or an
array (safe). You would need to ban (or correctly handle) any C++ and Swift type
which can have tail padding, in case that padding contains another object.

(To add insult to injury, the struct `MyStruct` itself -- not just references to
fields inside it -- cannot be represented directly in Rust, either.)

And anyway, such structs are unavoidable. In Swift, this is the default
behavior, and pervasive. In C++, `[[no_unique_address]]` is permitted to be used
pervasively in the standard library, and it is impractical to only interoperate
with C++ codebases that avoid the standard library.

In order for C++ and Swift pointers/references to be safely representable in
Rust as mut references, a `&mut T1` would need to exclude the tail padding,
which means that Rust would need to separate out the concept of a type's
interior size from its array stride. And in order to represent `MyStruct` in
Rust, we would need a way to use the same layout rules that are available in
these other languages.

## Explanation

(I haven't separated this out to guide-level vs reference-level -- this is a
pre-RFC! Also, all names TBD.)

As a quick summary, the proposal is to introduce the following new traits,
functions, attributes, and behaviors:

* `std::mem::data_size_of::<T>()`, returning the size but not necessarily
  rounded to alignment / not necessarily the same as stride.
* In the memory model, pointers and references only refer to
  `data_size_of::<T>()` bytes.
* `AlignSized`, a trait for types where the data size and stride are the same.
* `#[repr(compact)]`, to mark a type as not implementing `AlignSized`, and
  thus having a potentially smaller data size.
* `#[compact]`, to mark a field as laid out using the data size instead of the
  stride.

## Data size vs stride

Semantically, Rust types would gain a new kind of size: "data size". This is the
size of the type, minus the tail padding. In fact, it's in some sense the "true"
size of the type: array stride is the data size rounded up to alignment.

Data size would be exposed via a new function `std::mem::data_size_of::<T>()`;
array stride continues to be returned by `std::mem::size_of::<T>()`.

The semantics of a write (e.g. via `ptr::write`, `mem::swap`, or assignment) are
to only write "data size" number of bytes, and a `&T` or `&mut T` would only
refer to "data size" number of bytes for the purpose of provenance and aliasing
semantics. (`&[T; 1]`, in contrast, continues to refer to `size_of::<T>()`
bytes.)
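
The relationship "stride is the data size rounded up to alignment" can already be expressed with today's `std::alloc::Layout`, which accepts sizes that are not a multiple of the alignment. The helper name `stride_of` is hypothetical, and the data size is passed in by hand because `data_size_of` does not exist yet.

```rust
use std::alloc::Layout;

/// Rounds a (hypothetical) data size up to the next multiple of `align`,
/// which is what the proposal calls the stride.
pub fn stride_of(data_size: usize, align: usize) -> usize {
    Layout::from_size_align(data_size, align)
        .expect("align must be a nonzero power of two")
        .pad_to_align()
        .size()
}
```

For example, a data size of 3 with alignment 2 gives a stride of 4, and a data size of 4 with alignment 2 is already its own stride.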

## The `AlignSized` trait and `std::array::from_ref`

It is fundamentally a backwards-incompatible change to make stride and size not
the same thing, because of functions like
[`std::array::from_ref`](https://doc.rust-lang.org/stable/std/array/fn.from_ref.html)
and
[`std::slice::from_ref`](https://doc.rust-lang.org/stable/std/slice/fn.from_ref.html).
The existence of these functions means that Rust guarantees that for an
arbitrary generic type today, that type has identical size and stride.
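
As a small illustration of why these functions pin size to stride, the following compiles today for every `T`, with no bounds beyond the implicit `Sized`. The wrapper names `as_array`, `as_slice`, and `same_size_as_array` are made up for this sketch.

```rust
use std::mem::size_of_val;

/// Reinterprets a reference to one `T` as a reference to a one-element array.
pub fn as_array<T>(value: &T) -> &[T; 1] {
    std::array::from_ref(value)
}

/// Reinterprets a reference to one `T` as a one-element slice.
pub fn as_slice<T>(value: &T) -> &[T] {
    std::slice::from_ref(value)
}

/// The array view covers exactly as many bytes as the value itself, which is
/// only sound because size and stride are the same thing today.
pub fn same_size_as_array<T>(value: &T) -> bool {
    size_of_val(as_array(value)) == size_of_val(value)
}
```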

This means that if we want to allow for data size and stride to be different,
they must not be different for any generic type as written today. Existing code
without trait bounds can call `from_ref`! So we must add a new trait,
`AlignSized: Sized`, which guarantees that the data size and the stride are the
same, and which, like `Sized`, is an implicit bound on every generic parameter.
This trait would be automatically implemented for all pre-existing types, which
retain their current layout rules.

In other words, the following two generics are equivalent:

```rs
fn foo<T>() {}
fn foo<T: Sized + AlignSized>() {}
```

... and to opt out of requiring `AlignSized`, one must explicitly remove a trait
bound:

```rs
fn foo2<T: ?AlignSized>() {}
// AlignSized requires Sized, and so this will also do it:
fn foo3<T: ?Sized>() {}
```

To opt out of implementing this trait, and to opt in to being placed closer to
neighboring types inside a compound data structure, types can mark themselves as
`#[repr(compact)]`. This causes the data size not to be rounded up to alignment:

```rs
#[repr(C, compact)]
struct MyCompactType(u16, u8);
// data_size_of::<MyCompactType>() == 3
// size_of::<MyCompactType>() == 4
```
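
Since `data_size_of` does not exist yet, the two numbers in the comments above can only be approximated today. The sketch below measures where the last field of a `repr(C)` struct ends, which is what the data size would be for this particular type. The struct `MyType` and the function `measured_data_size` are hypothetical, and the approach assumes the last declared field is also the last one in memory.

```rust
use std::mem::size_of;
use std::ptr::addr_of;

/// Same fields as `MyCompactType`, but with today's `repr(C)` only.
#[repr(C)]
pub struct MyType {
    pub a: u16,
    pub b: u8,
}

/// End offset of the last field: the number of bytes that hold actual data.
pub fn measured_data_size() -> usize {
    let v = MyType { a: 0, b: 0 };
    let base = addr_of!(v) as usize;
    let last_field = addr_of!(v.b) as usize;
    (last_field - base) + size_of::<u8>()
}
```

Here `measured_data_size()` is 3, while `size_of::<MyType>()` is 4.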

## Taking advantage of non-`AlignSized` types with `#[compact]` storage

If a field is marked `#[compact]`, then the next field is placed after the data
size of that field, not after the stride. (These can only differ for a
non-`AlignSized` type.) This provides easy control, and provides compatibility
with C++, where this behavior can be configured per-field.

It is an error to apply this attribute on non-`#[repr(C)]` types.

```rs
#[repr(C, compact)]
struct MyCompactType(u16, u8);

#[repr(C)]
struct S {
    #[compact]
    a: MyCompactType, // occupies the first 3 bytes
    b: u8,            // occupies the 4th byte
}
// data_size_of::<S>() == size_of::<S>() == 4
```

## Example

Putting everything together:

```rs
#[repr(C, compact)]
struct MyCompactType(u16, u8);
// data_size_of::<MyCompactType>() == 3
// size_of::<MyCompactType>() == 4

#[repr(C)]
struct S {
    #[compact]
    a: MyCompactType, // occupies the first 3 bytes
    b: u8,            // occupies the 4th byte
}

// data_size_of::<S>() == size_of::<S>() == 4
```

We can take `mut` references to both fields `a` and `b`, and writes to those
references will not overlap:

```rs
let mut x : S = ...;
let S {a, b} = &mut x;
*a = MyCompactType(4, 2); // writes 3 bytes
*b = 0;                   // writes 1 byte
```

If we had not applied the `repr(compact)` attribute, **or** had not applied the
`#[compact]` attribute, then `data_size_of::<S>()` would have been 6, and so
would `size_of::<S>()`. The assignment `*a = ...` would have (potentially)
written 4 bytes.
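
The three cases above (4 bytes with both attributes, 6 bytes if either is missing) can be reproduced with a small layout calculator. This is only a sketch of one reading of the proposed rules for `repr(C)` structs. `Field`, `round_up`, and `layout` are hypothetical names, and each field's data size and alignment are supplied by hand.

```rust
/// One field of a hypothetical `repr(C)` struct.
#[derive(Clone, Copy)]
pub struct Field {
    /// The proposed `data_size_of` of the field's type.
    pub data_size: usize,
    /// `align_of` of the field's type (a power of two).
    pub align: usize,
    /// Whether the field carries the proposed `#[compact]` attribute.
    pub compact: bool,
}

fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

/// Returns `(field offsets, data size, stride)` of the struct.
/// `repr_compact` says whether the struct itself is `#[repr(compact)]`.
pub fn layout(fields: &[Field], repr_compact: bool) -> (Vec<usize>, usize, usize) {
    let mut offsets = Vec::new();
    let mut end = 0;
    let mut align = 1;
    for f in fields {
        let offset = round_up(end, f.align);
        offsets.push(offset);
        // A `#[compact]` field occupies only its data size; any other field
        // occupies its full stride.
        let occupied = if f.compact { f.data_size } else { round_up(f.data_size, f.align) };
        end = offset + occupied;
        align = align.max(f.align);
    }
    let stride = round_up(end, align);
    // Without `repr(compact)`, the data size is rounded up to the stride.
    let data_size = if repr_compact { end } else { stride };
    (offsets, data_size, stride)
}
```

With `a` described as `Field { data_size: 3, align: 2, compact: true }` and `b` as a one-byte field, `layout` places `b` at offset 3 and reports a size of 4. Dropping `#[compact]` from `a`, or giving `a` a data size of 4 (no `repr(compact)`), moves `b` to offset 4 and yields 6.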

## Drawbacks

### Backwards compatibility and the `AlignSized` trait

In order to be backwards compatible, this change requires a new implicit trait
bound, applied everywhere. However, that makes this change substantially less
useful. If that became the way things worked forever, then `#[repr(compact)]`
types would be very difficult to use, as almost no generic functions would
accept them. Very few functions *actually* need `AlignSized`, but every generic
function would get it implicitly.

We could change this at an edition boundary: a later edition could drop the
implicit `AlignSized` bound on all generics, and automated migration tooling
could remove the implicit bound from any generic function which doesn't use the
bound, and add an explicit bound for everything that does. After enough
iterations, the only code with a bound on `AlignSized` would be code which
transmutes between `T` and `[T]`/`[T; 1]`. This would, however, be a long and
disruptive migration.

Alternatively, we could simply live with `repr(compact)` types being difficult
and usually not usable in generic code. They would still be useful in
non-generic code, and in cross-language interop.

### `alloc::Layout`

`std::alloc::Layout` might not work as is. Consider the following function:

```rs
use std::alloc::{Layout, LayoutError};

// `?` needs a fallible return type, so the layout is wrapped in a `Result`.
fn make_c_struct() -> Result<Layout, LayoutError> {
    Ok(Layout::from_size_align(0, 1)?
        .extend(Layout::new::<T1>())?.0
        .extend(Layout::new::<T2>())?.0
        .pad_to_align())
}
```

This function was intended to return a `Layout` that is interchangeable with
this Rust struct:

```rs
#[repr(C)]
struct S {
    x: T1,
    y: T2,
}
```
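
That equivalence can be checked in today's Rust. The sketch below makes the function and the struct generic over `T1` and `T2` (an assumption, since the original leaves them as placeholders) so that the result can be compared against `Layout::new`.

```rust
use std::alloc::{Layout, LayoutError};

#[repr(C)]
pub struct S<T1, T2> {
    pub x: T1,
    pub y: T2,
}

/// Builds the layout of `S<T1, T2>` field by field, following the
/// `repr(C)` algorithm.
pub fn make_c_struct<T1, T2>() -> Result<Layout, LayoutError> {
    let (layout, _x_offset) = Layout::from_size_align(0, 1)?.extend(Layout::new::<T1>())?;
    let (layout, _y_offset) = layout.extend(Layout::new::<T2>())?;
    Ok(layout.pad_to_align())
}
```

For instance, `make_c_struct::<u16, u8>()` returns the same `Layout` as `Layout::new::<S<u16, u8>>()`.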

In order for this to continue returning the same `Layout`, it must work the same
even if `T1` is changed to be `repr(compact)`. In other words, if `Layout::new`
is to accept `?AlignSized` types, it must use the stride as the size. The same
applies to `for_value*`.

(Alternatively, it may be okay to reject non-`AlignSized` types.)

One assumes, then, that we need `*_compact` versions of all the layout
functions, which use data size instead of stride. And then this function:

```rs
// `Layout::new_compact` is part of this proposal and does not exist today.
fn make_c_struct() -> Result<Layout, LayoutError> {
    Ok(Layout::from_size_align(0, 1)?
        .extend(Layout::new_compact::<T1>())?.0
        .extend(Layout::new::<T2>())?.0
        .pad_to_align())
}
```

would generate the same `Layout` as for the following struct:

```rs
#[repr(C)]
struct S {
    #[compact] x: T1,
    y: T2,
}
```
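
Because today's `Layout` already accepts a size that is not a multiple of the alignment, the effect of a `new_compact` can be approximated by passing the data size in by hand. This is a sketch under that assumption; `make_compact_c_struct` and its parameters are hypothetical.

```rust
use std::alloc::{Layout, LayoutError};

/// Lays out a `#[compact]` first field with the given data size and alignment,
/// followed by an ordinary field of type `T2`.
pub fn make_compact_c_struct<T2>(
    t1_data_size: usize,
    t1_align: usize,
) -> Result<Layout, LayoutError> {
    // Stand-in for the proposed `Layout::new_compact::<T1>()`.
    let t1 = Layout::from_size_align(t1_data_size, t1_align)?;
    let (layout, _x_offset) = Layout::from_size_align(0, 1)?.extend(t1)?;
    let (layout, _y_offset) = layout.extend(Layout::new::<T2>())?;
    Ok(layout.pad_to_align())
}
```

With a `T1` of data size 3 and alignment 2, followed by a `u8`, this yields a 4-byte layout. Passing the stride (4) instead of the data size yields 6 bytes, matching the non-compact struct.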

Alternatively, perhaps we could introduce separated `data_size` and `stride`
fields into the `Layout`, and have `extend` and `extend_compact`, supplementing
`from_size_align(stride, align)` with `from_data_size_stride_align(data_size,
stride, align)`.

... but this author is very interested to hear opinions about how this should
all work out.

### It's yet another (implicit) size/alignment trait

There is also some desire for
[an `Aligned` trait](https://internals.rust-lang.org/t/aligned-trait/17443) or
[a `DynSized` trait](https://github.com/rust-lang/rust/issues/43467#issuecomment-317733674).
This would be yet another one, which may require changes throughout the Rust
standard library and ecosystem to support everywhere one would ideally hope.

## Rationale and alternatives

### Alternative: manual layout

One could in theory do it all by hand.

#### User-defined padding-less references

Instead of references, one could use a `Pin`-like smart pointer type which
forbids direct writes and reads. To avoid aliasing UB, this cannot actually be
`Pin<&mut T>` etc. -- it must be a (wrapper around a) raw pointer, as one must
never actually hold a `&mut T` or even a `&T`. This must be done for *all* Swift
or C++ types which contain (what Rust would consider) tail padding, unless it is
specifically known that they are held in an array, where it's safe to use Rust
references.

Something like this:

```rs
struct PadlessRefMut<'a, T>(*mut T, PhantomData<&'a mut T>);
```
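
To give a feel for what such a wrapper involves, here is a minimal sketch. The `new` and `write_data` methods, the `T: Copy` bound, and the helper types `T1` and `Storage` are all assumptions made for this illustration. In particular, the data size has to be passed in by hand, since nothing in today's Rust can compute it generically.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ptr;

/// Acts like `&'a mut T`, but never creates a real Rust reference, so it makes
/// no claim over the tail padding of `T`.
pub struct PadlessRefMut<'a, T>(*mut T, PhantomData<&'a mut T>);

impl<'a, T: Copy> PadlessRefMut<'a, T> {
    /// # Safety
    /// `ptr` must be aligned for `T`, and the data bytes of the pointee must
    /// be valid for writes for the lifetime `'a`.
    pub unsafe fn new(ptr: *mut T) -> Self {
        PadlessRefMut(ptr, PhantomData)
    }

    /// Overwrites only the first `data_size` bytes of the pointee.
    ///
    /// # Safety
    /// `data_size` must be the data size of `T` in the foreign language.
    pub unsafe fn write_data(&mut self, value: &T, data_size: usize) {
        assert!(data_size <= size_of::<T>());
        unsafe {
            ptr::copy_nonoverlapping(value as *const T as *const u8, self.0 as *mut u8, data_size);
        }
    }
}

/// Assumed example type: 3 bytes of data, 1 byte of tail padding.
#[derive(Clone, Copy)]
#[repr(C)]
pub struct T1 {
    pub a: u16,
    pub b: u8,
}

/// Four raw bytes aligned like `T1`; byte 3 plays the role of a neighboring
/// field that lives in the tail padding.
#[repr(C, align(2))]
pub struct Storage(pub [u8; 4]);

/// Writes `value` into bytes 0..3 of `storage` and returns all four bytes.
pub fn write_without_clobbering(storage: &mut Storage, value: T1) -> [u8; 4] {
    let ptr = storage.0.as_mut_ptr() as *mut T1;
    // SAFETY: `ptr` is 2-byte aligned and its first 3 bytes are writable.
    let mut r = unsafe { PadlessRefMut::new(ptr) };
    // SAFETY: 3 is the assumed data size of `T1`.
    unsafe { r.write_data(&value, 3) };
    storage.0
}
```

Writing a `T1` this way leaves byte 3 of the storage untouched, which is the guarantee a plain `*ptr = value` would not give.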

Unfortunately, today, a generic type like `PadlessRefMut` is difficult to use:
you cannot use it as a `self` type for methods, for instance, though
[there are workarounds](https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/Extending.20.60arbitrary_self_types.60.20with.20.60UnsafeDeref.60).

Even there, various bits of the Rust ecosystem expect references: for instance,
you can't return a `PadlessRef` or `PadlessRefMut` from an `Index` or `IndexMut`
implementation. This, too, could be fixed by replacing the indexing traits (and
everything else with similar APIs) with a more general trait that uses GATs...
but we can see already that, at least right now, this type would be quite
unpleasant.

#### Layout

For emulating the layout rules of Swift and C++, you could manually lay out
structs (e.g. via a proc macro) and use the same `Pin`-like pointer type:

```rs
// instead of C++:
// `struct Foo {[[no_unique_address]] T1 x; [[no_unique_address]] T2 y; }`
#[repr(C, align( /* max(align_of::<T1>(), align_of::<T2>()) */ ... ))]
struct Foo {
    // These arrays are not of size size_of::<T1>() etc., but rather the same
    // as the proposed data_size_of::<T1>().
    x: [u8; SIZE_OF_T1_DATA],
    y: [u8; SIZE_OF_T2_DATA],
}

impl Foo {
    fn x_mut(&mut self) -> PadlessRefMut<'_, T1> {
        PadlessRefMut::new((&mut self.x).as_mut_ptr() as *mut _)
    }
    // etc.
}
```

This is especially easy to do when writing a bindings generator, since you can
automatically query the other language to find the struct layout, and
automatically generate the corresponding Rust. But otherwise, it's quite a
pain -- one would hope, perhaps, for a proc macro to automate this, similar to
how Rust automatically infers layout for paddingful structs and types.
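
For a concrete instance of the manual approach, assume `T1` has data size 3 and alignment 2, and `T2` is a single byte. A bindings generator could then emit something like the following sketch, where the accessors return raw pointers that a real binding would wrap in a `PadlessRefMut`-like type. All names and numbers here are illustrative assumptions.

```rust
use std::mem::{align_of, size_of};

/// Manual stand-in for the C++ struct `Foo`, with `y` stored directly after
/// the 3 data bytes of `x`.
#[repr(C, align(2))]
pub struct Foo {
    x: [u8; 3],
    y: [u8; 1],
}

impl Foo {
    pub fn new() -> Self {
        Foo { x: [0; 3], y: [0; 1] }
    }

    /// Raw pointer to the storage of `x`, valid for exactly 3 bytes.
    pub fn x_ptr(&mut self) -> *mut u8 {
        self.x.as_mut_ptr()
    }

    /// Raw pointer to the storage of `y`.
    pub fn y_ptr(&mut self) -> *mut u8 {
        self.y.as_mut_ptr()
    }
}

/// Returns `(size, alignment, offset of y)` for `Foo`.
pub fn foo_layout() -> (usize, usize, usize) {
    let mut foo = Foo::new();
    let y_offset = foo.y_ptr() as usize - foo.x_ptr() as usize;
    (size_of::<Foo>(), align_of::<Foo>(), y_offset)
}
```

Here `foo_layout()` returns `(4, 2, 3)`: the whole struct fits in 4 bytes, with `y` at offset 3.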

#### Conclusion: manual layout is unpleasant

Almost nothing is impossible in Rust, including this. But it does mean virtually
abandoning Rust in a practical sense: Rust's references cannot exclude tail
padding, so we use raw pointers instead. Rust's layout rules cannot omit
padding, and so we replace the layout algorithm with a pile of manually placed
`u8` arrays and manually specified alignment. And the result integrates poorly
with the rest of the Rust ecosystem, where most things expect conventional
references, and things that don't or can't use references are difficult to work
with.

### Alternative: `repr(packed)`, but with aligned fields

We could replicate the layout of C++ and Swift structs, but make them very
unsafe to use, similar to `repr(packed)`. One would still, like `repr(packed)`,
avoid taking or using references to fields inside such structs, and these are
still going to be difficult to work with as a result.

## Prior art

### Languages with this feature

**Swift:** Swift implicitly employs this layout strategy for all types and all
fields. A type has three size-related properties: its "size", meaning the
literal size taken up by its fields, not including padding; its "stride",
meaning the difference between addresses of consecutive elements in an array;
and its alignment.

**C++:** Unlike Swift, C++ does not separate out size and stride into separate
concepts. Instead, it claims that array stride and size are the same thing, as
they are in Rust and C, but that objects can live inside the tail padding of
other objects and that you are simply mutably aliasing into the tail padding in
a way which the language defines the behavior for. C++ nominally allows this for
the tail padding of all types, but only when they are stored in certain places:
objects may be placed inside the tail padding of the previous object when that
previous object is a subobject in the same struct (not, for instance, a separate
local variable), and it is either a base class subobject (so-called "EBO"), or a
`[[no_unique_address]]` data member ("field"). In practice, however, the
compiler is free to not reuse the tail padding for some types. In the
[Itanium ABI](https://itanium-cxx-abi.github.io/cxx-abi/abi.html), C-like
structs ("POD" types, with
[an Itanium-ABI-specific definition of "POD"](https://itanium-cxx-abi.github.io/cxx-abi/abi.html#POD))
do not allow their tail padding to be reused.

### Papers and blog posts

* I worked around this in Crubit, a C++/Rust bindings generator. The design is
  here: https://github.com/google/crubit/blob/main/docs/unpin.md . tl;dr: if
  we assume that the only source of this layout phenomenon is base classes,
  then only non-`final` classes needed to get the uncomfortable `Pin`-like
  API. Unfortunately, this does not work if `[[no_unique_address]]` becomes
  pervasive.

## Unresolved questions

- What do we do about `std::alloc::Layout`?
- What's the long term future of the `AlignSized` bound?
- Clearly, for compatibility reasons if nothing else, Rust types must not have
  reusable tail padding unless specially marked. But what about fields: should
  it be opt-in per field (like C++), or automatic (like Swift)? In this doc,
  it's assumed to be opt-in per field for `repr(C)` (for C++-compatibility),
  and automatic for `repr(Rust)`.
- How free should Rust be to represent fields compactly in `repr(Rust)` types?
- Is `repr(C)` allowed to use this new layout strategy with specially marked
  fields using a new attribute, or do we need a new `repr`? The documentation
  is
  [very prescriptive](https://doc.rust-lang.org/std/mem/fn.size_of.html#size-of-reprc-items).
- This is part of a family of issues with interop, where Rust reference
  semantics do not match other languages' reference semantics. (The other
  prominent member of the family is "aliasing".) Part of the reason for
  wanting to use Rust references is simply the raw ergonomics: generic APIs
  take and return `&T`, self types require `Deref` (which requires
  reference-compatible semantics), etc. It is worth asking: rather than
  modifying references, does this cross the line to where we should instead
  make it more pleasant to use pointers that cannot safely deref?
- "Language lawyering": how does this interact with existing features? For
  example, is a `repr(transparent)` type also `repr(compact)`? (I *believe*
  the answer should be yes.)
- TODO: better names for everything. For example, `repr(compact)`, "data size"
  and `data_size_of`. `AlignSized` especially.
- How much of the standard library should be updated to `?AlignSized`?