# [Pre-RFC] Allow stride != size
				2
				3	## Summary
				4
				5	Rust should allow for values to be placed at the next aligned position after the
				6	previous value, ignoring the tail padding of that previous field. This requires
				7	changing the meaning of "size", so that a value's size in memory (for the
				8	purpose of reference semantics and layout) is not definitionally the same as the
				9	distance between consecutive values of that type (its "stride").
				10
				11	## Motivation
				12
Some other languages (C++ and Swift, in particular) can lay out values more
compactly than Rust conventionally can, leading to better performance with
greater convenience, and to less-than-ideal interoperability with Rust.
				16
				17	### Optimization opportunity
				18
				19	Consider the difference between `(u16, u8, u8)` and `((u16, u8), u8)`. The first
				20	can fit in 4 bytes, while the second requires 6. A `(u16, u8)` is a 4 byte value
				21	with 1 byte of tail padding. And a `(T, u8)` can't just stuff the `u8` inside
				22	the tail padding for `T`! If, instead, we declared that `(u16, u8)` were a 3
				23	byte value with alignment 2, then `((u16, u8), u8)` could be 4 bytes instead of
				24	6. This is not possible today.
				25
				26	(For backwards compatibility reasons described later, we can't literally do this
				27	for tuples, but only for user-defined types. But this gives the gist of the
				28	optimization opportunity this proposal supports.)
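
The size difference can be observed today. The sketch below uses `#[repr(C)]`
structs instead of tuples, because tuple layout is unspecified. The struct names
and the `size_and_align` helper are made up for illustration, and the sizes in
the comments assume a platform where `u16` has alignment 2.

```rust
use std::mem::{align_of, size_of};

/// Stand-in for `(u16, u8)`: 3 bytes of data plus 1 byte of tail padding.
#[repr(C)]
pub struct Inner(pub u16, pub u8);

/// Stand-in for `((u16, u8), u8)`: the trailing `u8` cannot be placed in
/// `Inner`'s tail padding, so it starts at offset 4.
#[repr(C)]
pub struct Nested(pub Inner, pub u8);

/// Stand-in for `(u16, u8, u8)`: all three fields are laid out back to back.
#[repr(C)]
pub struct Flat(pub u16, pub u8, pub u8);

/// Returns `(size, alignment)` of `T` as Rust computes them today.
pub fn size_and_align<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}
```

Under those assumptions, `size_and_align::<Flat>()` is `(4, 2)` while
`size_and_align::<Nested>()` is `(6, 2)`.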
				29
				30	By inventing the concept of a "data size", which doesn't need to be a multiple
				31	of the alignment, we can allow fields in specially-designed types to be packed
				32	closer together than they would be today, saving space. This is similar to the
				33	performance benefits of `#[repr(packed)]`, but safer: all values would still be
				34	correctly aligned, just placed more closely together.
				35
				36	This optimization has already been implemented in other programming languages.
				37	Swift applies this to every type and every field: a type's size excludes tail
				38	padding, and a neighboring value can be laid out immediately next to it when
				39	stored in the same type, with no padding between the two. In C++, the
				40	optimization automatically applies to base classes
				41	(["EBO"](https://en.cppreference.com/w/cpp/language/ebo), the Empty Base
				42	Optimization), and is opt-in on fields via the
				43	[`[[no_unique_address]]`](https://en.cppreference.com/w/cpp/language/attributes/no_unique_address)
				44	attribute.
				45
For example, see this demonstration in
[Swift](https://godbolt.org/z/G74ejjsvc) and in
[C++](https://godbolt.org/z/4esbYrv39). These types are compact! Rust does
not work like this today.
				49
				50	### Interoperability with C++ and Swift
				51
(Note that the author works on C++ interop; Swift is mentioned for
completeness.)
				54
				55	In fact, exactly because this optimization is already implemented in other
				56	languages, those languages are theoretically not as compatible with Rust as they
				57	are with each other. In C++ and Swift, writing to a pointer or reference does
				58	not write to neighboring fields. But if that pointer or reference were passed to
				59	Rust, and you used any Rust facility to write to it -- whether it were vanilla
				60	assignment or `ptr::write` -- Rust could overwrite that neighboring field.
				61	Because the use of this optimization is pervasive in both Swift and C++,
				62	interoperating with these languages is difficult to do safely.
				63
				64	Concretely, consider the following C++ struct:
				65
```c++
struct MyStruct {
  [[no_unique_address]] T1 x;
  [[no_unique_address]] T2 y;
  ...
};
```
				73
				74	Which is equivalent to this Swift struct:
				75
```swift
struct MyStruct {
  let x: T1
  let y: T2
  ...
}
```
				83
				84	If you are working with cross-language interop, and obtain in Rust a `&mut T1`
				85	which refers to `x`, and a `&mut T2` which refers to `y`, it may be immediately
				86	UB, because these references can overlap in Rust: `y` may be located inside what
				87	Rust would consider the tail padding of the `T1` reference.
				88
				89	For the same reason, even if you avoid aliasing, if you obtain a `&mut T1` for
				90	`x`, and then write to it, it may partially overwrite `y` with garbage data,
				91	causing unexpected or undefined behavior down the line.
				92
				93	This also cannot be avoided by forbidding the use of `MyStruct`: even if you do
				94	not directly use it from Rust, from the point of view of Swift and C++, it is
				95	just a normal struct, and Swift and C++ codebases can freely pass around
				96	references and pointers to its interior. Someone passing a reference to a `T1`
				97	may have no idea whether it came from `MyStruct` (unsafe to pass to Rust) or an
				98	array (safe). You would need to ban (or correctly handle) any C++ and Swift type
				99	which can have tail padding, in case that padding contains another object.
				100
(To add insult to injury, the struct `MyStruct` itself -- not just references to
fields inside it -- cannot be represented directly in Rust, either.)
				103
				104	And anyway, such structs are unavoidable. In Swift, this is the default
				105	behavior, and pervasive. In C++, `[[no_unique_address]]` is permitted to be used
				106	pervasively in the standard library, and it is impractical to only interoperate
				107	with C++ codebases that avoid the standard library.
				108
				109	In order for C++ and Swift pointers/references to be safely representable in
				110	Rust as mut references, a `&mut T1` would need to exclude the tail padding,
				111	which means that Rust would need to separate out the concept of a type's
				112	interior size from its array stride. And in order to represent `MyStruct` in
				113	Rust, we would need a way to use the same layout rules that are available in
				114	these other languages.
				115
				116	## Explanation
				117
				118	(I haven't separated this out to guide-level vs reference-level -- this is a
				119	pre-RFC! Also, all names TBD.)
				120
As a quick summary, the proposal is to introduce the following new traits,
functions, attributes, and behaviors:

* `std::mem::data_size_of::<T>()`, returning the size but not necessarily
  rounded to alignment / not necessarily the same as stride.
* In the memory model, pointers and references only refer to
  `data_size_of::<T>()` bytes.
* `AlignSized`, a trait for types where the data size and stride are the same.
* `#[repr(compact)]`, to mark a type as not implementing `AlignSized`, and
  thus having a potentially smaller data size.
* `#[compact]`, to mark a field as laid out using the data size instead of the
  stride.
				133
				134	## Data size vs stride
				135
				136	Semantically, Rust types would gain a new kind of size: "data size". This is the
				137	size of the type, minus the tail padding. In fact, it's in some sense the "true"
				138	size of the type: array stride is the data size rounded up to alignment.
				139
				140	Data size would be exposed via a new function `std::mem::data_size_of::<T>()`;
				141	array stride continues to be returned by `std::mem::size_of::<T>()`.
				142
				143	The semantics of a write (e.g. via `ptr::write`, `mem::swap`, or assignment) are
				144	to only write "data size" number of bytes, and a `&T` or `&mut T` would only
				145	refer to "data size" number of bytes for the purpose of provenance and aliasing
				146	semantics. (`&[T; 1]`, in contrast, continues to refer to `size_of::<T>()`
				147	bytes.)
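
The relationship between data size and stride can be checked by hand in today's
Rust. This is a minimal sketch, not the proposed API: `pair_data_size` is a
hypothetical, hand-written stand-in for `data_size_of::<Pair>()`, computed as
the end of the last field, and `round_up` is an illustrative helper.

```rust
use std::mem::size_of;

#[repr(C)]
pub struct Pair {
    pub a: u16,
    pub b: u8,
}

/// Hand-computed stand-in for the proposed `data_size_of::<Pair>()`:
/// the offset of the last field plus that field's size.
pub fn pair_data_size() -> usize {
    let p = Pair { a: 0, b: 0 };
    let base = &p as *const Pair as usize;
    let last = &p.b as *const u8 as usize;
    (last - base) + size_of::<u8>()
}

/// Rounds `n` up to the next multiple of `align` (which must be a power of two).
pub fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) & !(align - 1)
}
```

Rounding `pair_data_size()` up to `align_of::<Pair>()` gives back
`size_of::<Pair>()`, which is what "array stride is the data size rounded up to
alignment" means for this type.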
				148
				149	## The `AlignSized` trait and `std::array::from_ref`
				150
				151	It is fundamentally a backwards-incompatible change to make stride and size not
				152	the same thing, because of functions like
				153	[`std::array::from_ref`](https://doc.rust-lang.org/stable/std/array/fn.from_ref.html)
				154	and
				155	[`std::slice::from_ref`](https://doc.rust-lang.org/stable/std/slice/fn.from_ref.html).
				156	The existence of these functions means that Rust guarantees that for an
				157	arbitrary generic type today, that type has identical size and stride.
				158
This means that if we want to allow for data size and stride to be different,
they must not be different for any generic type as written today. Existing code
without trait bounds can call `from_ref`! So we must add an implicit bound on a
new trait, `AlignSized: Sized`, which is applied to every generic parameter by
default (like `Sized`) and guarantees that the data size and the stride are the
same. This trait would be automatically implemented for all pre-existing types,
which retain their current layout rules.
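
To make the guarantee concrete, here is a small sketch of the kind of
unconstrained generic code that exists today. The wrapper function names are
invented for illustration; `std::slice::from_ref` and `std::array::from_ref`
are the real standard library functions.

```rust
use std::mem::{size_of, size_of_val};

/// Views a single value as a one-element slice. No trait bounds are needed.
pub fn as_slice_of_one<T>(value: &T) -> &[T] {
    std::slice::from_ref(value)
}

/// Views a single value as a one-element array. No trait bounds are needed.
pub fn as_array_of_one<T>(value: &T) -> &[T; 1] {
    std::array::from_ref(value)
}

/// Checks that the one-element slice spans exactly `size_of::<T>()` bytes,
/// i.e. that a `&T` can stand in for a `&[T]` of length 1.
pub fn slice_covers_whole_stride<T>(value: &T) -> bool {
    size_of_val(as_slice_of_one(value)) == size_of::<T>()
}
```

If a `&T` only covered the data size, these conversions would hand out a slice
that extends past the referent, which is why they need the `AlignSized` bound.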
				165
				166	In other words, the following two generics are equivalent:
				167
				168	```rs
				169	fn foo<T>() {}
				170	fn foo<T: Sized + AlignSized>() {}
				171	```
				172
				173	... and to opt out of requiring `AlignSized`, one must explicitly remove a trait
				174	bound:
				175
				176	```rs
				177	fn foo2<T: ?AlignSized>() {}
				178	// AlignSized requires Sized, and so this will also do it:
				179	fn foo3<T: ?Sized>() {}
				180	```
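
The same opt-out pattern already exists for `Sized`, which gives a feel for how
`?AlignSized` might read in practice. The sketch below only uses today's
`?Sized`; the function names are made up for illustration.

```rust
use std::mem::size_of_val;

/// `T` carries the implicit `Sized` bound, so only sized types are accepted.
/// Calling `byte_len_sized("abc")` would not compile, because `str` is unsized.
pub fn byte_len_sized<T>(value: &T) -> usize {
    size_of_val(value)
}

/// `?Sized` removes the implicit bound, so `str` and `[u8]` are accepted too.
pub fn byte_len_relaxed<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}
```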
				181
				182	To opt out of implementing this trait, and to opt in to being placed closer to
				183	neighboring types inside a compound data structure, types can mark themselves as
				184	`#[repr(compact)]`. This causes the data size not to be rounded up to alignment:
				185
				186	```rs
				187	#[repr(C, compact)]
				188	struct MyCompactType(u16, u8);
				189	// data_size_of::<MyCompactType>() == 3
				190	// size_of::<MyCompactType>() == 4
				191	```
				192
				193	## Taking advantage of non-`AlignSized` types with `#[compact]` storage
				194
				195	If a field is marked `#[compact]`, then the next field is placed after the data
				196	size of that field, not after the stride. (These can only differ for a
				197	non-`AlignSized` type.) This provides easy control, and provides compatibility
				198	with C++, where this behavior can be configured per-field.
				199
				200	It is an error to apply this attribute on non-`#[repr(C)]` types.
				201
```rs
#[repr(C, compact)]
struct MyCompactType(u16, u8);

#[repr(C)]
struct S {
    #[compact]
    a: MyCompactType, // occupies the first 3 bytes
    b: u8, // occupies the 4th byte
}
// data_size_of::<S>() == size_of::<S>() == 4
```
				214
				215	## Example
				216
				217	Putting everything together:
				218
```rs
#[repr(C, compact)]
struct MyCompactType(u16, u8);
// data_size_of::<MyCompactType>() == 3
// size_of::<MyCompactType>() == 4

#[repr(C)]
struct S {
    #[compact]
    a: MyCompactType, // occupies the first 3 bytes
    b: u8, // occupies the 4th byte
}

// data_size_of::<S>() == size_of::<S>() == 4
```
				234
				235	We can take `mut` references to both fields `a` and `b`, and writes to those
				236	references will not overlap:
				237
				238	```rs
				239	let mut x : S = ...;
				240	let S {a, b} = &mut x;
				241	*a = MyCompactType(4, 2); // writes 3 bytes
				242	*b = 0; // writes 1 byte
				243	```
				244
If we had not applied the `repr(compact)` attribute, or had not applied the
`#[compact]` attribute, then `data_size_of::<S>()` would have been 6, and so
would `size_of::<S>()`. The assignment `*a = ...` would have (potentially)
written 4 bytes.
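
The arithmetic behind these numbers can be sketched as a small layout
calculator. This is one reading of the rules described above, not a
specification: `TypeLayout` and `layout_repr_c` are hypothetical names, and the
calculator only models `repr(C)` structs whose fields are laid out in order.

```rust
/// The three layout properties this proposal distinguishes.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct TypeLayout {
    pub data_size: usize,
    pub stride: usize,
    pub align: usize,
}

fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

/// Lays out a `repr(C)` struct.
/// `fields` pairs each field's layout with whether it is marked `#[compact]`;
/// `repr_compact` says whether the struct itself is `#[repr(compact)]`.
pub fn layout_repr_c(fields: &[(TypeLayout, bool)], repr_compact: bool) -> TypeLayout {
    let mut end = 0; // end of the space reserved for the fields so far
    let mut align = 1;
    for &(field, compact) in fields {
        let offset = round_up(end, field.align);
        // A `#[compact]` field reserves only its data size, not its stride.
        let reserved = if compact { field.data_size } else { field.stride };
        end = offset + reserved;
        align = align.max(field.align);
    }
    let stride = round_up(end, align);
    // Only a `repr(compact)` struct keeps an unrounded data size.
    let data_size = if repr_compact { end } else { stride };
    TypeLayout { data_size, stride, align }
}
```

Feeding it `MyCompactType` (a `u16` and a `u8`, `repr_compact = true`) yields a
data size of 3 and a stride of 4. Using that as a `#[compact]` field followed by
a `u8` yields 4 and 4; dropping either attribute yields 6 and 6.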
				249
				250	## Drawbacks
				251
				252	### Backwards compatibility and the `AlignSized` trait
				253
				254	In order to be backwards compatible, this change requires a new implicit trait
				255	bound, applied everywhere. However, that makes this change substantially less
				256	useful. If that became the way things worked forever, then `#[repr(compact)]`
				257	types would be very difficult to use, as almost no generic functions would
				258	accept them. Very few functions actually need `AlignSized`, but every generic
				259	function would get it implicitly.
				260
				261	We could change this at an edition boundary: a later edition could drop the
				262	implicit `AlignSized` bound on all generics, and automated migration tooling
				263	could remove the implicit bound from any generic function which doesn't use the
				264	bound, and add an explicit bound for everything that does. After enough
				265	iterations, the only code with a bound on `AlignSized` would be code which
				266	transmutes between `T` and `[T]`/`[T; 1]`. Though this would be a disruptive and
				267	long migration.
				268
				269	Alternatively, we could simply live with `repr(compact)` types being difficult
				270	and usually not usable in generic code. They would still be useful in
				271	non-generic code, and in cross-language interop.
				272
				273	### `alloc::Layout`
				274
				275	`std::alloc::Layout` might not work as is. Consider the following function:
				276
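```rs
use std::alloc::{Layout, LayoutError};

// `from_size_align` and `extend` are fallible, so `?` needs a `Result` return.
fn make_c_struct() -> Result<Layout, LayoutError> {
    Ok(Layout::from_size_align(0, 1)?
        .extend(Layout::new::<T1>())?.0
        .extend(Layout::new::<T2>())?.0
        .pad_to_align())
}
```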
				285
				286	This function was intended to return a `Layout` that is interchangeable with
				287	this Rust struct:
				288
```rs
#[repr(C)]
struct S {
    x: T1,
    y: T2,
}
```
				296
				297	In order for this to continue returning the same `Layout`, it must work the same
				298	even if `T1` is changed to be `repr(compact)`. In other words, if `Layout::new`
				299	is to accept `?AlignSized` types, it must use the stride as the size. The same
				300	applies to `for_value*`.
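
The equivalence that must be preserved can be checked in today's Rust. In this
sketch, `c_struct_layout` is a generic variant of the function above and
`Example` is a made-up struct; both names are illustrative only.

```rust
use std::alloc::{Layout, LayoutError};

/// Builds the layout of a `#[repr(C)]` struct with a `T1` field followed by a
/// `T2` field, using only the existing `Layout` API.
pub fn c_struct_layout<T1, T2>() -> Result<Layout, LayoutError> {
    let (layout, _offset_x) = Layout::from_size_align(0, 1)?.extend(Layout::new::<T1>())?;
    let (layout, _offset_y) = layout.extend(Layout::new::<T2>())?;
    Ok(layout.pad_to_align())
}

/// The struct whose layout `c_struct_layout::<u32, u8>()` should reproduce.
#[repr(C)]
pub struct Example {
    pub x: u32,
    pub y: u8,
}
```

Today `c_struct_layout::<u32, u8>()` equals `Layout::new::<Example>()`. Existing
code relies on that, so `Layout::new` has to keep reporting the stride.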
				301
				302	(Alternatively, it may be okay to reject non-`AlignSized` types.)
				303
				304	One assumes, then, that we need `*_compact` versions of all the layout
				305	functions, which use data size instead of stride. And then:
				306
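```rs
use std::alloc::{Layout, LayoutError};

// `new_compact` is the hypothetical data-size-based counterpart of `new`.
fn make_c_struct() -> Result<Layout, LayoutError> {
    Ok(Layout::from_size_align(0, 1)?
        .extend(Layout::new_compact::<T1>())?.0
        .extend(Layout::new::<T2>())?.0
        .pad_to_align())
}
```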
				315
				316	Would generate the same `Layout` as for the following struct:
				317
```rs
#[repr(C)]
struct S {
    #[compact] x: T1,
    y: T2,
}
```
				325
Alternatively, perhaps we could introduce separate `data_size` and `stride`
fields into the `Layout`, and have `extend` and `extend_compact`, supplementing
`from_size_align(stride, align)` with `from_data_size_stride_align(data_size,
stride, align)`.
				330
				331	... but this author is very interested to hear opinions about how this should
				332	all work out.
				333
				334	### It's yet another (implicit) size/alignment trait
				335
				336	There is also some desire for
				337	[an `Aligned` trait](https://internals.rust-lang.org/t/aligned-trait/17443) or
				338	[a `DynSized` trait](https://github.com/rust-lang/rust/issues/43467#issuecomment-317733674).
				339	This would be yet another one, which may require changes throughout the Rust
				340	standard library and ecosystem to support everywhere one would ideally hope.
				341
				342	## Rationale and alternatives
				343
				344	### Alternative: manual layout
				345
				346	One could in theory do it all by hand.
				347
				348	#### User-defined padding-less references
				349
Instead of references, one could use a `Pin`-like smart pointer type which
forbids direct writes and reads. To avoid aliasing UB, this cannot actually be
`Pin<&mut T>` etc. -- it must be a (wrapper around a) raw pointer, as one must
never actually hold a `&mut T` or even a `&T`. This must be done for all Swift
or C++ types which contain (what Rust would consider) tail padding, unless it is
specifically known that they are held in an array, where it's safe to use Rust
references.
				357
				358	Something like this:
				359
				360	```rs
				361	struct PadlessRefMut<'a, T>(*mut T, PhantomData<&'a mut T>);
				362	```
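
As a rough sketch of how such a type could behave, the version below stores the
data size next to the raw pointer and copies only that many bytes on a write.
The `data_size` field, the `new` and `write` methods, the `T: Copy` bound, and
the `Pair`/`Storage` demo types are all assumptions made for illustration; none
of them are part of an existing API.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ptr;

/// A mutable pointer to a `T` that never touches `T`'s tail padding.
pub struct PadlessRefMut<'a, T> {
    ptr: *mut T,
    data_size: usize, // hypothetical: supplied by the caller or a generator
    _marker: PhantomData<&'a mut T>,
}

impl<'a, T: Copy> PadlessRefMut<'a, T> {
    /// # Safety
    /// `ptr` must be aligned for `T` and valid for writes of `data_size`
    /// bytes for all of `'a`, with no other access to those bytes.
    pub unsafe fn new(ptr: *mut T, data_size: usize) -> Self {
        assert!(data_size <= size_of::<T>());
        PadlessRefMut { ptr, data_size, _marker: PhantomData }
    }

    /// Overwrites the pointee, copying only the first `data_size` bytes.
    pub fn write(&mut self, value: T) {
        // SAFETY: `new` requires the destination to be valid for `data_size`
        // bytes, and `value` is a live local at least that large.
        unsafe {
            ptr::copy_nonoverlapping(
                &value as *const T as *const u8,
                self.ptr as *mut u8,
                self.data_size,
            );
        }
    }
}

#[repr(C)]
#[derive(Clone, Copy)]
pub struct Pair {
    pub a: u16,
    pub b: u8,
}

/// Four bytes aligned like `Pair`: a `Pair`'s data, then a neighboring byte.
#[repr(C, align(2))]
struct Storage([u8; 4]);

/// Writes a `Pair` into the first 3 bytes and returns the resulting storage.
pub fn write_beside_neighbor() -> [u8; 4] {
    let mut storage = Storage([0, 0, 0, 0xAB]);
    {
        // SAFETY: `storage` is 2-aligned and its first 3 bytes are writable.
        let mut pair_ref =
            unsafe { PadlessRefMut::new(storage.0.as_mut_ptr() as *mut Pair, 3) };
        pair_ref.write(Pair { a: 0x0102, b: 7 });
    }
    storage.0
}
```

After `write_beside_neighbor()`, the fourth byte still holds `0xAB`. A plain
`*r = Pair { .. }` through a `&mut Pair` would be allowed to clobber it.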
				363
				364	Unfortunately, today, a generic type like `PadlessRefMut` is difficult to use:
				365	you cannot use it as a `self` type for methods, for instance, though
				366	[there are workarounds](https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/Extending.20.60arbitrary_self_types.60.20with.20.60UnsafeDeref.60).
				367
				368	Even there, various bits of the Rust ecosystem expect references: for instance,
				369	you can't return a `PadlessRef` or `PadlessRefMut` from an `Index` or `IndexMut`
				370	implementation. This, too, could be fixed by replacing the indexing traits (and
				371	everything else with similar APIs) with a more general trait that uses GATs...
				372	but we can see already that, at least right now, this type would be quite
				373	unpleasant.
				374
				375	#### Layout
				376
				377	For emulating the layout rules of Swift and C++, you could manually lay out
				378	structs (e.g. via a proc macro) and use the same `Pin`-like pointer type:
				379
```rs
// instead of C++:
// `struct Foo {[[no_unique_address]] T1 x; [[no_unique_address]] T2 y; }`
#[repr(C, align( /* max(align_of::<T1>(), align_of::<T2>()) */ ... ))]
struct Foo {
    // These arrays are not of size size_of::<T1>() etc., but rather the same
    // as the proposed data_size_of::<T1>().
    x: [u8; SIZE_OF_T1_DATA],
    y: [u8; SIZE_OF_T2_DATA],
}

impl Foo {
    fn x_mut(&mut self) -> PadlessRefMut<'_, T1> {
        PadlessRefMut::new(self.x.as_mut_ptr() as *mut _)
    }
    // etc.
}
```
				397
(This is especially easy to do when writing a bindings generator, since you can
automatically query the other language for the struct layout, and
automatically generate the corresponding Rust.) But otherwise, it's quite a
pain -- one would hope, perhaps, for a proc macro to automate this, similar to
how Rust automatically infers layout for paddingful structs and types.
				403
				404	#### Conclusion: manual layout is unpleasant
				405
				406	Almost nothing is impossible in Rust, including this. But it does mean virtually
				407	abandoning Rust in a practical sense: Rust's references cannot exclude tail
				408	padding, so we use raw pointers instead. Rust's layout rules cannot omit
				409	padding, and so we replace the layout algorithm with a pile of manually placed
				410	`u8` arrays and manually specified alignment. And the result integrates poorly
				411	with the rest of the Rust ecosystem, where most things expect conventional
				412	references, and things that don't or can't use references are difficult to work
				413	with.
				414
				415	### Alternative: `repr(packed)`, but with aligned fields
				416
				417	We could replicate the layout of C++ and Swift structs, but make them very
				418	unsafe to use, similar to `repr(packed)`. One would still, like `repr(packed)`,
				419	avoid taking or using references to fields inside such structs, and these are
				420	still going to be difficult to work with as a result.
				421
				422	## Prior art
				423
				424	### Languages with this feature
				425
Swift: Swift implicitly employs this layout strategy for all types and all
fields. A type has three size-related properties: its "size", meaning the
literal size taken up by its fields, not including tail padding; its "stride",
meaning the difference between addresses of consecutive elements in an array;
and its alignment.
				431
				432	C++: Unlike Swift, C++ does not separate out size and stride into separate
				433	concepts. Instead, it claims that array stride and size are the same thing, as
				434	they are in Rust and C, but that objects can live inside the tail padding of
				435	other objects and that you are simply mutably aliasing into the tail padding in
				436	a way which the language defines the behavior for. C++ nominally allows this for
				437	the tail padding of all types, but only when they are stored in certain places:
				438	objects may be placed inside the tail padding of the previous object when that
				439	previous object is a subobject in the same struct (not, for instance, a separate
				440	local variable), and it is either a base class subobject (so-called "EBO"), or a
				441	`[[no_unique_address]]` data member ("field"). In practice, however, the
				442	compiler is free to not reuse the tail padding for some types. In the
				443	[Itanium ABI](https://itanium-cxx-abi.github.io/cxx-abi/abi.html), C-like
				444	structs ("POD" types, with
				445	[an Itanium-ABI-specific definition of "POD"](https://itanium-cxx-abi.github.io/cxx-abi/abi.html#POD))
				446	do not allow their tail padding to be reused.
				447
				448	### Papers and blog posts
				449
* I worked around this in Crubit, a C++/Rust bindings generator. The design is
  here: https://github.com/google/crubit/blob/main/docs/unpin.md . tl;dr: if
  we assume that the only source of this layout phenomenon is base classes,
  then only non-`final` classes need to get the uncomfortable `Pin`-like
  API. Unfortunately, this does not work if `[[no_unique_address]]` becomes
  pervasive.
				456
				457	## Unresolved questions
				458
				459	- What do we do about `std::alloc::Layout`?
				460	- What's the long term future of the `AlignSized` bound?
				461	- Clearly, for compatibility reasons if nothing else, Rust types must not have
				462	reusable tail padding unless specially marked. But what about fields: should
				463	it be opt-in per field (like C++), or automatic (like Swift)? In this doc,
				464	it's assumed to be opt-in per field for `repr(C)` (for C++-compatibility),
				465	and automatic for `repr(Rust)`.
				466	- How free should Rust be to represent fields compactly in `repr(Rust)` types?
				467	- Is `repr(C)` allowed to use this new layout strategy with specially marked
				468	fields using a new attribute, or do we need a new `repr`? The documentation
				469	is
				470	[very prescriptive](https://doc.rust-lang.org/std/mem/fn.size_of.html#size-of-reprc-items).
- This is part of a family of issues with interop, where Rust reference
  semantics do not match other languages' reference semantics. (The other
  prominent member of the family is "aliasing".) Part of the reason for
  wanting to use Rust references is simply the raw ergonomics: generic APIs
  take and return `&T`, self types require `Deref` (which requires
  reference-compatible semantics), etc. It is worth asking: rather than
  modifying references, does this cross the line to where we should instead
  make it more pleasant to use pointers that cannot safely deref?
				479	- "Language lawyering": how does this interact with existing features? For
				480	example, is a `repr(transparent)` type also `repr(compact)`? (I believe
				481	the answer should be yes.)
- TODO: better names for everything. For example, `repr(compact)`, "data size"
  and `data_size_of`. `AlignSized` especially.
				484	- How much of the standard library should be updated to `?AlignSized`?