Rust interop design documentation.
PiperOrigin-RevId: 448656933
diff --git a/docs/design.md b/docs/design.md
new file mode 100644
index 0000000..628b893
--- /dev/null
+++ b/docs/design.md
@@ -0,0 +1,649 @@
+## Introduction
+
+This document describes the high-level design choices of Crubit, a C++/Rust
+Bidirectional Interop Tool.
+
+## C++/Rust interop goal
+
+**The primary goal of C++/Rust interop tooling is to enable Rust to be used
+side-by-side with C++ in large existing codebases.**
+
+In the short term we would
+like to focus on codebases that roughly follow the Google C++ style guide to improve the interop
+fidelity. Other, more diverse codebases are possible prospective users
+in the long term, and their needs will be addressed by customization and
+extension points.
+
+## C++/Rust interop requirements
+
+In support of the interop goal, we identify the following requirements:
+
+1. **Enable using existing C++ libraries from Rust with high fidelity**
+ * **High fidelity means that interop will make C++ APIs available in Rust,
+ even when those API projections would not be idiomatic, ergonomic,
+ or safe** in Rust, to facilitate cheap, small step incremental migration
+ workflow. Based on the experience of other cross-language
+ interoperability systems and language migrations (for example,
+ Objective-C/Swift, Java/Kotlin, JavaScript/TypeScript), we believe that
+ working in a mixed C++/Rust codebase would be significantly harder if
+ some C++ APIs were not available in Rust.
+ * **Interop will bridge C++ constructs to Rust constructs only
+ when the semantics match closely**. Bridging large semantic gaps creates
+ a risk of making C++ APIs unusable in Rust, as well as a risk of
+ creating performance problems. For example, interop will not bridge
+ destructive Rust moves and non-destructive C++ moves; instead it will
+ make C++ move constructors and move assignment operators available to
+ use in Rust code. As another example, interop will not bridge C++
+ templates and Rust generics by default.
+ * Interop should be **performant**, as close to having no runtime cost
+ as possible. The performance costs of the interop should be documented,
+ and where possible, intuitive to the user.
+ * Interop should be **ergonomic and safe**, as long as ergonomic and
+ safety accommodations do not hurt performance or fidelity.
+ Where a tradeoff is possible, the interop will choose performance
+ and fidelity over ergonomics; the user will be allowed to override
+ this choice.
+ * **Enable owners of the C++ API to control their Rust API projection**,
+ for example, with attributes in C++ headers and by extending generated
+ bindings with a manually implemented overlay. Such an overlay will
+ wrap or extend generated bindings to improve ergonomics and safety.
+2. **Enable using Rust libraries from C++**
+ * However, using C++ libraries from Rust has a higher priority than
+ using Rust libraries from C++.
+3. **Put little to no barriers to entry**
+ * **Ideally, no boilerplate code** needs to be written in order to
+ start using a C++ library from Rust. Adding some extra information
+ can make the generated bindings more ergonomic to use.
+ * The amount of **duplicated API information is minimized**.
+ * **Future evolution of C++ APIs should be minimally hindered
+ by the presence of Rust users**.
+
+## Proposal and high-level design
+
+**We propose to develop our own C++/Rust interop tooling.** There are no
+existing tools that satisfy all of our requirements. Modifying an existing
+tool to fulfill these requirements would take more effort than building a new
+tool from scratch or might require forking its codebase given that some existing
+tools have goals that conflict with our goals.
+
+See the "alternatives considered" section for a discussion of existing tools.
+
+### Source of information about C++ API
+
+**Interop tooling will read C++ headers**, as they contain the information
+needed to generate Rust API projections and the necessary glue code. Interop
+tooling that is used during builds will not read C++ source files,
+to maintain the principle that C++ API information is only located in headers,
+and that a C++ library can't break the build of its dependencies by
+changing source files.
+
+Some interop-adjacent tools (e.g., large-scale refactoring tools that seed the initial set of
+lifetime annotations) will also read C++ sources. These tools will not be
+used during builds.
+
+**Pros**
+
+* **Minimal barrier to entry**: minimal amount of manual work is
+ required to start using a C++ library from Rust.
+ * Encourages leaf projects to start incrementally adopting Rust in
+ new code, or incrementally rewriting C++ targets in Rust.
+* **C++ API information is located only in headers**,
+ regardless of the language that the API consumer is
+ written in (C++ or Rust).
+* **Interop tooling that generates Rust API projections
+ from a C++ header can get exactly the same information that the C++
+ compiler has** when processing a translation unit that uses one of the APIs
+ declared within that header.
+ * Interop tooling can generate the most performant calls to C++ APIs,
+ without C++-side thunks that translate the C++ ABI into a C ABI.
+ * Interop tooling can autodetect implementation details that are critical
+ for interop but are not a part of the API surface (for example, the size
+ and alignment of C++ classes that have private data members).
+ * In alternative solutions, users need to repeat these implementation
+ details in sidecar files. Interop can verify that the specified
+ information is correct through static assertions in generated C++ code,
+ but the overall user experience is inferior.
+
+**Cons**
+
+* **Having to read C++ headers makes interop tooling more complex.**
+* **The Rust projection of the C++ API is only visible in machine-generated
+ files.**
+ * These are not trivially accessible.
+ * There is a limit on how readable these files can be made.
+ * We can mitigate these issues by building tooling that shows the Rust
+ view of a C++ header (for example in Code Search, or in editors as
+ an alternative go-to-definition target).
+
+### Customizability
+
+Interop tooling will be sufficiently customizable to accommodate the unique
+needs of different C++ libraries in the codebase. Interop should be customizable
+enough to accommodate existing codebases. C++ API owners can:
+
+* **Guide how interop tooling generates Rust API projections
+ from C++ headers**. For example, headers can provide:
+ * Custom Rust names for C++ function overloads
+ (instead of applying the general interop strategy for
+ function overloads),
+ * Custom Rust names for overloaded C++ operators,
+ * Custom Rust lifetimes for pointers and references mentioned in
+ the C++ API,
+ * Nullability information for pointers in the C++ API,
+ * Assertions (verified at compile time) and promises (not verified
+ by tooling) that certain C++ types are trivially relocatable.
+* **Provide custom logic to bridge types**, for example, mapping C++
+ `absl::StatusOr` to Rust `Result`.
+* **Provide API overlays** that improve the automatically generated Rust API.
+ * For example, the overlays could inject additional methods into
+ automatically generated Rust types or hide some of the generated methods.
+
+More intrusive customization techniques will be useful for template and
+macro-heavy libraries where the baseline
+import rules just won't work. We believe customizability will be an essential
+enabler for providing high-fidelity interop.
+
+### Source of additional information that customizes C++ API projection into Rust
+
+Where C++ headers don't already provide all information necessary for interop
+tooling to generate a Rust API projection, we will add such information to C++
+headers whenever possible. If it is not desirable to edit a certain C++ header,
+extra information can be stored in a sidecar file.
+
+Examples of additional information that interop tooling will need:
+
+* **Nullability annotations.** C++ APIs often expose pointers that are
+ documented or assumed by convention to be never null, but can't be
+ refactored to references due to language limitations
+ (for example, `std::vector<MyProtobuf *>`). If C++ headers don't provide
+ nullability information for pointers in a machine-readable form, interop
+ tooling has to conservatively mark all C++ pointers as nullable in the Rust
+ API projection. The Rust compiler will then force users to write unnecessary
+ (and untestable) null checks.
+* **Lifetimes of references and pointers** in C++ headers are not described
+ in a machine-readable way (and sometimes are not even documented in prose).
+ Lifetime information is essential to generate safe and idiomatic Rust APIs
+ from C++ headers.
+
+#### Additional information is stored in C++ headers
+
+**Pros**
+
+* **Additional information needed for C++/Rust interop will be expressed
+ as annotations on existing syntactic elements in C++.**
+ * The annotations are located in the most logical place.
+ * The annotations are more likely to be noticed and updated by
+ C++ API owners.
+ * API owners retain full control over how the API looks in Rust.
+* **C++ users may find lifetime and nullability annotations useful.** For
+ example, information about lifetimes is highly important to C++ and
+ Rust users alike.
+* **C++ API definitions are only written once,** minimizing duplication
+ and maintenance burden.
+
+**Cons**
+
+* **Annotations that benefit Rust users can bother C++ API owners** who don't
+ care about Rust. Especially at the beginning of integrating Rust into an existing codebase, C++ API
+ owners can push back on adding annotations.
+ * To encourage adoption of annotations, we can develop tooling for C++
+ that uses lifetime and nullability annotations to find bugs in C++ code.
+ * The pushback is likely to be short-term: if Rust takes off in a C++ codebase,
+ C++ library owners in that codebase will need to care about Rust users
+ and how their API looks in Rust.
+* **There may be headers that we cannot (or would not want to) change**,
+ for example, headers in third-party code, headers that are open-sourced,
+ or when first-party owners are not cooperating.
+ * We can apply the sidecar strategy to these headers.
+
+#### Additional information is stored in sidecar files
+
+Additional information needed for C++/Rust interop can be stored in sidecar
+files, similarly to Swift APINotes, CLIF etc. If sidecar files get sufficiently
+broad adoption (for example, if annotating third-party code turns out to be sufficiently
+important that optimizing C++/Rust interop ergonomics there
+would be worth it), it would make sense to write sidecar files in a Rust-like
+language, as that provides the most natural way to define Rust APIs.
+
+**Pros**
+
+* **Sidecar files enable more broad adoption of annotations** by providing
+ additional interop information without modifying C++ headers. Sidecar files
+ will allow us to annotate headers in third-party code, headers that can't
+ adopt annotations for technical reasons, or headers owned by first-party
+ owners who are not cooperating.
+
+**Cons**
+
+* Like in the [Use Rust code to customize API projection into Rust](#use-rust-code-to-customize-api-projection-into-rust)
+ alternative, **some part of C++ API information is duplicated**,
+ which is a burden for the C++ API owners.
+* The projection of C++ APIs to Rust is defined in a new language.
+ * C++ API owners and Rust users will have to learn this language.
+ * If we expect wide adoption of sidecar files, we will need to create
+ tooling to parse, edit, and run LSCs against this language.
+* **Annotations in sidecar files are more prone to become out of sync with
+ the C++ code.** When making changes to C++ code, engineers are less likely
+ to notice and update the annotations in sidecar files.
+ * Presubmits can catch some cases of desynchronization between C++ headers
+ and sidecar filles. However, presubmit errors that remind engineers
+ to edit more files create an inferior user experience.
+* **Sidecar files create extra friction to modify the code.** Where previously
+ one had to edit only a C++ header and a C++ source file, now one also likely
+ needs to update a sidecar file.
+ * When engineers realize that they need to update a sidecar file,
+ opening another file and finding the right place to update creates
+ extra friction to modify code.
+ * Once engineers understand the extra maintenance burden associated
+ with sidecar files that tend to go out of sync with headers, they will
+ be less likely to adopt annotations in the first place.
+
+### Glue code generation
+
+C++/Rust interop tooling will generate executable glue code and type definitions
+in Rust and in C++ (not just merely `extern "C"` function declarations) in order
+to achieve the following goals:
+
+* **Enable instantiating C++ templates from Rust, and
+ monomorphizing Rust generics from C++. Enable Rust types to participate in
+ C++ inheritance hierarchies.**
+ * For example, imagine Rust code using an object of type
+ `std::vector<MyProtobuf>`, while C++ code in the same program is
+ never instantiating this type. The Bazel `rust_library` target that mentions this
+ type must therefore be responsible for instantiating this template and
+ linking the resulting executable code into the final program. We propose
+ that this instantiation happens in an automatically generated "glue" C++
+ translation unit that is a part of that `rust_library`.
+* **Enable automatically wrapping C++ code to be more ergonomic in Rust.**
+ For example:
+ * `extern "C"` functions in Rust are necessarily unsafe (it is a language
+ rule). We would like the vast majority of C++ API projections into Rust
+ to be safe. In the current Rust language, we can achieve that only by
+ wrapping the unsafe `extern "C"` function in a safe function marked with
+ `#[inline(always)]`.
+ * C++ API owners can provide rules for automatic type bridging,
+ for example, mapping C++ `absl::StatusOr` to Rust `Result`.
+ This conversion necessitates generation of a Rust wrapper function
+ around a C++ entry point that takes advantage of such type bridging.
+* **Provide stable locations (C++ modules, Rust crates) that "own" the types
+ from the language point of view.**
+ * For example, when we project a C++ type into Rust, its Rust definition
+ must be located in a Rust crate. Furthermore, all Rust users of this
+ type must observe it as being defined in the same crate (otherwise,
+ two identical type definitions in different crates are unrelated types
+ in Rust).
+ * When we project a Rust type into C++ we could repeat its C++
+ definition in C++ code any number of times (for example, in every C++
+ user of a Rust type). This is technically fine because C++ allows
+ the same type to be defined multiple types within a program.
+ Nevertheless, such duplication is error-prone.
+
+### Glue code is generated as C++ and Rust source code
+
+Interop tooling will generate glue code as C++ and Rust source files, which
+are then compiled with an unmodified compiler for that language. The alternative
+is to generate LLVM IR or object files with machine code directly from interop
+tooling.
+
+**Pros**
+
+* **It is easy to inject customizations provided by API owners into generated
+ source code.**
+ * The customizations will be written in the target language, making it
+ (hopefully) intuitive to write them.
+* **Generated source code can be easily inspected by compiler engineers**
+ while debugging interop problems and compiler bugs.
+* **Generated source code can be inspected and understood by interop users,**
+ who are not compiler experts.
+ * LLVM IR wouldn't be meaningful to them.
+* **Generated source code is processed by the regular toolchain like any other
+ code in the project.**
+ * It automatically benefits from all performance optimizations and
+ sanitizers that are newly implemented in Clang and Rust compilers.
+* **We avoid adding a new tool that generates unique LLVM IR patterns.**
+ * We avoid making the job of the C++ toolchain maintainers harder.
+
+**Cons**
+
+* **Interop tooling will be limited to generating LLVM IR and machine code
+ that Clang and Rust compilers can generate.**
+
+### Glue code and API projections will assume implementation details of the target execution environment
+
+To provide the most ergonomic and performant interop, C++/Rust interop tooling
+will allow the target codebase to opt into assuming various implementation
+details of the target execution environment. For example:
+
+* When calling C++ from Rust, interop tooling can either wrap C++ functions in
+ thunks with a C calling convention, or call C++ entry points directly.
+ Thunks cause code bloat and can collectively add up to become a performance
+ problem, so it is desirable to call C++ entry points from Rust directly.
+ Interop tooling can do that only if it may assume a specific target platform
+ and C++ ABI.
+
+Implementation details of the target execution environment that are considered
+stable enough will be reflected in API projections, for example:
+
+* The C++ standard does not specify sizes of integer types
+ (`short`, `int`, `long` etc.) To map them to Rust, interop tooling will need
+ to assume a size that they have on the platform that targets in
+ practice. The alternative would be to create target-agnostic integer types
+ (for example, `Int` in Swift is a strong typedef for `Int32` on 32-bit
+ targets, and `Int64` on 64-bit targets), but this makes it harder to
+ provide idiomatic, transparent, high-performance interop.
+* The C++ standard does not specify whether standard library types like
+ `std::vector` are trivially relocatable; it is an implementation detail.
+ Universal interop tooling would have to conservatively assume
+ non-trivially-relocatable types. Interop tooling specific to certain
+ environments can rely on libc++ providing a trivially-relocatable
+ `std::vector` and project it into Rust in a much more ergonomic way.
+
+**Pros**
+
+* **Interop tooling will generate the most performant code sequences**
+ to call foreign language functions.
+ * If interop tooling generates portable code, it would
+ have some overhead. The overhead can be eliminated by C++
+ and Rust optimizers at least in some cases, but at the cost of increased
+ build times. For example, eliminating thunks would require
+ turning on LTO, which is not fast, and usually only used for release builds.
+ It is much preferable to not generate thunks in the first place,
+ if the target platform does not need them.
+* **Ergonomics of API projections will be improved.**
+ * For example, whether a C++ type is trivially relocatable or not is an
+ implementation detail in C++, transparent to C++ users of that type, but
+ it makes a huge ergonomic difference in the Rust API projection.
+
+**Cons**
+
+* **C++ code will have additional evolution constraints.**
+ * For example, changing a type from trivially relocatable to non-trivially
+ relocatable is a non-API-breaking change for C++ users, but it would
+ break Rust users.
+* **It would be more difficult to switch internal environments to a different
+ C++ standard library.**
+* **Code that is deployed in environments that have
+ incompatible implementation details won't be able to use this C++/Rust
+ interop system.**
+ * Alternatively, these executables would have to bring a suitable
+ execution environment with them (e.g., a copy of libc++).
+
+### Interop tooling should be maintainable and evolvable for a long time
+
+We should design and implement C++/Rust interop tooling in
+such a way that we can maintain and evolve it for more than a decade. If Rust
+becomes tightly integrated into an existing C++ project, specific requirements for interop
+and API projection rules will keep changing. The more Rust adoption we will
+have, the more library and team-specific interop customizations we will have to
+support, and the more it will make sense for the performance team to tweak
+generated code to implement sweeping optimizations. These kinds of changes should
+be readily possible, and they should not create conflicts of interest between
+diferent users of the interop tooling.
+
+### Interop tooling should facilitate C++ to Rust migration
+
+C++/Rust interop tooling should try to create a favorable environment for
+migrating C++ code to Rust. Specifically, projections of C++ APIs into Rust
+should be implementable in Rust. This way, a C++ library can be converted from
+C++ into Rust transparently for its users, as its public API won't change.
+
+## Alternatives Considered: Design decisions
+
+### Repeat C++ API completely in a separate IDL
+
+Instead of reading C++ headers in the interop tooling, we would require the user
+to repeat the C++ API in some other form, for example, in a Rust-based IDL like
+in the cxx crate, or in sidecar files in a completely new format.
+
+**Pros**
+
+* **Interop tooling can be simpler if it does not have to read C++ headers**.
+ But even under this alternative approach, tooling might want to read C++
+ headers, nullifying this advantage. For example, tooling might want to
+ automatically generate an initial Rust snippet or to suggest in presubmits
+ to adjust the Rust code that mirrors a C++ API when that C++ API changes.
+* The **most natural way to define Rust APIs** is by using Rust code or
+ Rust-like syntax in sidecar files.
+* **Available Rust APIs are defined in easily accessible checked-in files.**
+* **API definitions written by a human might have higher quality,
+ on average.**
+
+**Cons**
+
+* **A big part of the C++ API needs to be duplicated** to reliably match the
+ Rust code with the C++ declarations. The initial code can be generated by
+ tooling, but it has to be kept in sync. This is a burden for the C++ API
+ owners, potentially a bigger one than allowing annotations in C++ headers.
+ * There is a risk that C++ API owners might refuse to own IDL files.
+* The need to create a sidecar file creates a **barrier to start using C++
+ libraries from Rust.**
+ * While the duplication overhead is justifiable for widely-used libraries,
+ it is relatively high for libraries with few users and binaries,
+ making it less likely that leaf teams will start adopting Rust.
+* **When the C++ API is changed, the Rust definitions become out-of-sync
+ with it.** Tooling needs to detect this, and the Rust definitions need to be
+ changed (either manually or tool-assisted).
+* There is no effective way to verify Rust binding code at the presubmit time
+ of a C++ library other than building downstream projects.
+* **Mapping Rust API definitions to the original C++ API definitions is more
+ complicated and error-prone**. For example, how would we target a specific
+ overload of a function or constructor?
+* There is a **risk that individual teams will build team-specific tooling
+ that generates IDL files** from C++ headers or generates both IDL files and
+ C++ headers from a single source. These solutions are unlikely to scale to
+ existing large codebases and will likely only work for that specific team.
+
+### Use Rust code to customize API projection into Rust
+
+An alternative to storing additional information in C++ headers is to put it
+into Rust code. For example, the cxx crate requires users to re-state the C++
+API in Rust syntax, adding information about lifetimes and nullability. The pros
+and cons of this choice are the same as when defining a special IDL that repeats
+the C++ API completely (see above).
+
+### Generate glue code in binary formats
+
+Instead of generating glue code as textual sources, interop tooling could use
+Clang and LLVM APIs to emit object files with C++ glue code and use Rust
+compiler APIs to generate rmeta and rlib files with Rust glue code.
+
+**Pros**
+
+* **More flexibility in the code that can be generated.** Controlling LLVM IR
+ generation allows interop tooling to generate code that an unmodified
+ compiler can't generate from textual source code. For example, the Rust
+ language does not have any constructs that map to `linkonce_odr` functions
+ in LLVM IR; if the interop tooling embedded the Rust compiler as a library
+ and had more control over how it generates the IR, we could make that
+ happen.
+
+**Cons**
+
+* Injecting customizations provided by API owners is harder.
+* LLVM, Clang, and Rust compiler APIs are not stable. The format of Rust
+ metadata files is not stable either. The larger the API subset we consume
+ from Clang and Rust, the more difficult it becomes to maintain the tooling.
+* To generate object files the interop tooling has to ensure that its
+ Clang/LLVM version and configuration is identical with the Clang compiler
+ used to build other C++ code.
+ * We can solve this problem, but it makes the system more fragile,
+ compared to using existing C++ and Rust compilers to compile generated
+ sources.
+* From time to time LLVM introduces bugs that cause miscompilations. If interop tooling
+ embeds LLVM, we would be adding another tool that toolchain engineers will need to
+ look into when debugging a miscompilation. We would be making the job of
+ C++ toolchain maintainers harder.
+
+## Alternatives Considered: Existing tools
+
+### bindgen
+
+[bindgen](https://rust-lang.github.io/rust-bindgen/) **automatically generates
+Rust bindings from C and C++ headers**, which it consumes using libclang. The
+generated **bindings are pure Rust code** that interfaces with C and C++ using
+Rust’s [built-in FFI for C](https://doc.rust-lang.org/nomicon/ffi.html)
+(`#[repr(C)]` to indicate that a struct should use C memory layout and
+`extern "C"` to indicate that a function should use a C calling convention). C++
+functions are handled by generating a Rust `extern "C"` function that has the
+same ABI as the C++ function and attaching a `link_name` attribute with
+the mangled name.
+
+See [here](https://manishearth.github.io/blog/2021/02/22/integrating-rust-and-c-plus-plus-in-firefox/)
+for an in-depth description of the use of bindgen in Stylo,
+a Rust component in Firefox.
+
+**Pros**
+
+* **The oldest and the most mature** of the existing C++ interop tools.
+
+**Cons**
+
+* **Deficiencies in safety and ergonomics**, for example:
+ * References are imported as pointers. No lifetimes, no null-safety.
+ * Constructors and destructors are not called automatically.
+ * Overloads are distinguished by a numbered suffix in Rust.
+ These numbers clutter the source code and are hard to remember,
+ as they have no meaning. Adding overloads can change the numbering
+ and hence break Rust callers.
+* It is **impossible to use C++ inline functions and templates**
+ from Rust because of bindgen’s architecture[^1]. The architecture is
+ unlikely to change, and therefore, this is a dealbreaker.
+
+**Evaluation**
+
+bindgen could be used in a project that has
+very limited C++ interop needs. However, creating safe and ergonomic wrappers
+for the generated bindings would require additional effort. Our vision and goals
+for C++ interop are very different from what bindgen provides.
+
+### cxx
+
+[cxx](https://cxx.rs/) generates **Rust bindings for C++ APIs and vice versa**
+from an **interface definition language (IDL) included inline in Rust
+source code.** cxx generates Rust and C++ source code from IDL definitions.
+To check that the IDL definitions match the actual C++ API, cxx inserts static
+assertions[^2] into the generated C++ code; it does not, however, read the C++
+headers itself. cxx contains built-in bindings for various Rust and C++ standard
+library types that are not customizable.
+
+As far as we understand, cxx has the following design constraints and goals:
+
+* **Ship a stable product for its intended audience.**
+ * As a consequence, improvements such as integrating move semantics are
+ not going to be accepted soon. We understand that cxx is not a
+ vehicle for experimentation. cxx maintainers would prefer
+ us to first show that our ideas work in a fork of cxx or in a different
+ system, such as autocxx, and that our improvements pull their weight
+ given the added complexity.
+* **Remain simple and transparent.** There is a limit on the amount of
+ complexity that will be tolerated.
+ * There is a chance that improvements such as modeling C++ move semantics
+ or various attempts at eliminating thunks will not be ever accepted in
+ upstream cxx.
+* **Non-goal: Automatically provide high fidelity interop.**
+ * cxx is designed for the use case of an executable where C++ and Rust
+ parts communicate through a narrow interface.
+* **Non-goal: Automatically provide the most performant interop in as many
+ cases as possible.** For example:
+ * cxx does not attempt to eliminate C++-side thunks. Instead, using LTO
+ is recommended.
+ * cxx considers it acceptable to allocate all objects of "opaque" types
+ on the heap. Users who find these heap allocations unacceptable for
+ performance reasons are expected to implement a different C++ entry
+ point that does not hit this limitation and bind it to Rust instead of
+ the original C++ API. Heap allocation is acceptable for many C++ classes
+ in most environments, but the exceptions are important enough for us
+ that this is a major restriction.
+
+**Pros**
+
+* **Mature and ergonomic enough today for mixing C++ and Rust in existing codebases
+ with limited C++ interop needs.**
+* We avoid being on a tech island.
+
+**Cons**
+
+* cxx’s stability goal makes it **hard to experiment with how the Rust API
+ looks.**
+* **Our goals are unlikely to align well with the goals of the intended
+ user audience of cxx.** We would be pulling cxx in directions that make
+ it a worse product for its current users.
+* **Almost no customizability**. Users who are not satisfied with what cxx
+ does are expected to wrap the target C++ API in a different C++ API that
+ is more friendly to cxx.
+* cxx tries to be compatible with most standard C++ implementations found
+ in the real world, so it **cannot take advantage of unique guarantees
+ provided by the target execution environment.**
+
+**Evaluation**
+
+cxx could be used in projects with limited C++/Rust interop
+requirements. However, we would not be able to implement many interop features
+that we consider essential (for example, move semantics, templates).
+
+### autocxx
+
+[autocxx](https://github.com/google/autocxx) **automatically generates Rust
+bindings from C++ headers**. As the name implies, it automatically generates
+IDL definitions for cxx, which then produces the actual bindings. In addition,
+autocxx generates its own Rust and C++ code to extend the Rust API beyond what
+cxx itself would provide, for example to support passing POD types by value.
+autocxx consumes C++ headers indirectly by first running bindgen on them and
+then parsing the Rust code output by bindgen.
+
+autocxx’s [design goals](https://www.chromium.org/Home/chromium-security/memory-safety/rust-and-c-interoperability)
+are similar to our own in this document.
+
+We did a case study on using an existing project's C++ API from Rust
+using autocxx.
+
+**Pros**
+
+* **Low barrier to entry**: Bindings are generated from C++ headers,
+ no need to write duplicate API definitions.
+* **Ergonomic mappings** for many C++ constructs.
+* **Open to contributions that change the generated Rust APIs** or
+ make architectural changes.
+
+**Cons**
+
+* **Relatively new and immature.**
+* **Cannot (yet) consume complex headers without errors.**
+ We’ve managed to import some actual Spanner headers, but there are still
+ enough outstanding issues that we can’t yet do anything useful with Spanner.
+* **Architecture can make modifications difficult.** autocxx is built on
+ top of two other tools, bindgen and cxx, and the interfaces between these
+ components can make it harder to make a modification than it would be in a
+ monolithic tool. Specifically:
+ * autocxx uses bindgen to generate a description of the C++ API that it
+ can parse easily (as opposed to trying to parse C++ headers either
+ directly or using Clang APIs). Since bindgen was not intended for this
+ purpose, its output lacks some information that autocxx needs,
+ so autocxx [has forked](https://crates.io/crates/autocxx-bindgen)
+ bindgen to adapt it to its needs. The forked version emits additional
+ information about the C++ API in the form of attributes attached
+ to various API elements.
+ * bindgen in turn is built on the libclang API, which doesn’t surface all
+ of the functionality available through Clang’s C++ API. Adding features
+ to libclang requires additional effort and has a 6 month lead time to appear in a stable release (to become eligible to be used from bindgen).
+ * When errors occur, it can be hard to figure out which of the components
+ is responsible.
+ * Adding features can require touching multiple components,
+ which requires commits to multiple repositories.
+
+**Evaluation**
+
+We initially intended to use autocxx to prototype various interop ideas and
+potentially as a basis for a field trial. We still believe this would be
+feasible, but after trying to modify autocxx and its bindgen fork
+during an internal C++/Rust interop study, we feel that autocxx’s complex
+architecture is enough of an impediment that we could achieve our goals with
+less total effort by creating an interop tool from scratch that consists of
+a single codebase and uses the Clang C++ API to directly interface with Clang.
+
+[^1]: Doing so would require either generating C++ source code or interfacing
+deeply enough with Clang to generate object code for inline functions and
+template instantiation.
+
+[^2]: And tricks such as suitable type conversions that force the C++ compiler
+to perform appropriate checks at compile time.