Replace Salsa with a home-grown alternative that works with rustc.

This is an adaptation of Łukasz Anforowicz's CL at unknown commit.

(See the followup unknown commit for actually using it in cc_bindings_from_rs with rustc.)

Alternatives considered:

* We could continue to use Salsa in rs_bindings_from_cc, but not cc_bindings_from_rs.
    * Strictly adds complexity compared to using the same library for both.
* Could patch Salsa to add the feature?
    * Upstream engagement didn't really work out earlier (https://github.com/salsa-rs/salsa/issues/424), and I personally am especially bad at it. (I still need to write that RFC to stabilize slice reference layout...)
    * Changing Salsa to do this is difficult, obviously -- or Salsa would already do it. Łukasz also already tried it, and I took a poke through the code around that time too. It isn't trivial at all -- certainly substantially more difficult than this CL.
    * Even if we did it, the improvements would go into Salsa 2022, which is unreleased.
* Could use Salsa in cc_bindings_from_rs by using an intermediate IR, which is all `'static`, similar to rs_bindings_from_cc.
    * A lot of work: we use an IR for rs_bindings_from_cc for FFI reasons, and it's very nice and has many neat properties... but it's still a substantial bit of extra code, for relatively little direct engineering benefit. If we do an IR, we should have a strong architectural/strategic need for the rustc/IR separation, not a _tactical_ need like "`Ty` isn't `'static`".
    * Carries many other downsides, as well, like performance costs, maintenance burden, and new opportunities for error due to mismatches.
* Could wrap the rustc types in something that is `'static`. For example, instead of `TyCtxt`, use a `StaticTyCtxt` which is sort of like a weak reference to the `TyCtxt` (replaced with `None` when the original `&TyCtxt` in the moral equivalent of `main()` goes out of scope).
    * I prototyped this approach first. It works fine for `TyCtxt` actually, but works very poorly for `Ty`, as `Ty` does not write down anywhere what `TyCtxt` it comes from, so it is difficult to make it sound. Requires more code just to do this than to rewrite Salsa.
    * If we abandon soundness and e.g. use raw pointers everywhere, it becomes easy, but now we have abandoned soundness and run the risk of UB. Maybe that's fine -- we of course did write some of this in C++ for rs_bindings_from_cc -- but it's worth taking some effort to avoid this.

The easiest thing, requiring the least work, and deleting the most code, is to reimplement the parts of Salsa we care about in ~150 lines of code.
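
To make the idea concrete, here is a minimal sketch of the kind of memoization involved. It is not the code in this CL: `MemoTable`, `get_or_compute`, and `demo_memo` are hypothetical names invented for illustration, and it assumes a single-threaded cache with no invalidation. It shows a query cache whose keys may borrow non-`'static` data and whose values need no `Eq`.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;
use std::hash::Hash;
use std::rc::Rc;

/// Hypothetical memo table for a single query (illustration only).
/// `K` needs `Eq + Hash` but not `'static`; `V` needs no `Eq`.
pub struct MemoTable<K, V> {
    cache: RefCell<HashMap<K, Rc<V>>>,
}

impl<K: Eq + Hash, V> MemoTable<K, V> {
    pub fn new() -> Self {
        MemoTable { cache: RefCell::new(HashMap::new()) }
    }

    /// Returns the cached value for `key`, computing it at most once.
    pub fn get_or_compute(&self, key: K, compute: impl FnOnce(&K) -> V) -> Rc<V> {
        // The borrow ends with this statement, so `compute` may itself
        // call back into the same table without a `RefCell` panic.
        let cached = self.cache.borrow().get(&key).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        let value = Rc::new(compute(&key));
        self.cache.borrow_mut().insert(key, Rc::clone(&value));
        value
    }
}

/// Runs the same query twice on a borrowed (non-`'static`) key and
/// reports both results plus how many times the query body ran.
pub fn demo_memo() -> (usize, usize, u32) {
    let owned = String::from("hello"); // lives only for this call
    let table: MemoTable<&str, usize> = MemoTable::new();
    let calls = Cell::new(0u32);
    let a = *table.get_or_compute(owned.as_str(), |s| {
        calls.set(calls.get() + 1);
        s.len()
    });
    let b = *table.get_or_compute(owned.as_str(), |s| {
        calls.set(calls.get() + 1);
        s.len()
    });
    (a, b, calls.get())
}
```

In this sketch the second call is served from the cache, so the query body runs only once.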

I was very reluctant to do this CL, and it was the very last approach I tried. Reusing existing libraries is great. In fact, one of the reasons I was especially happy with using Salsa is that we _did_ roll our own memoization in the C++ side of rs_bindings_from_cc, and doing so led to bugs, because we were able to hand-tune how memoization works... to our own detriment (e.g. caching a bad result, and then fixing it later). Bad. However, if we hew close to the principles behind libraries like Salsa, then perhaps we can get the benefits of this approach, without cursing ourselves:

* Less churn: Salsa is under development, with known backwards-incompatible changes on the way. Keeping up to date with unstable third-party libraries is always a challenge, and we can avoid it here.
* Flexibility: Any features we want, such as not requiring `Eq` on return values, or having non-`'static` inputs, we can add.

---

### Future features

Some possible future features to add, along these lines:

Use `&'tcx T` instead of `Rc<T>` -- instead of returning an `Rc<T>` and cloning it, return a `T`, and tell `query_group` to return that `T` by reference.

Something like:

```rust
query_group! {
  pub trait BindingsGenerator<'a> {
    #[return_by_reference]
    fn foo(&self, x: i32) -> &'a ExpensiveObject;
  }
  struct Database;
}
// actual implementation returns by value, the `&'a` is injected by the automatically generated implementation.
fn foo(db: &dyn BindingsGenerator<'_>, x: i32) -> ExpensiveObject {
  ...
}
```

This can avoid the expense of an `Rc<T>` and, more importantly, the annoyance/inconvenience.

Buuuuuuuut it would need some unsafe code, I think, or else some tricks, anyway. (Rust doesn't know we never mutate it in the hash map!)

(In principle I think salsa could do this too, it just needs to leak its cached values forever. (Which I think it already does by default.))
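
One possible "trick" that needs no unsafe code is the leak-forever idea, sketched below. This is only an illustration under stated assumptions, not a proposed implementation: `LeakyMemo`, `get_or_compute`, `demo_by_reference`, and this definition of `ExpensiveObject` are all hypothetical. Each cached value is leaked with `Box::leak`, so it is never freed and never moves, which is what makes handing out a plain `&'a V` sound.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical by-reference memo table (illustration only).
/// Values are leaked, trading memory for a plain `&'a V` return type.
pub struct LeakyMemo<'a, K, V> {
    cache: RefCell<HashMap<K, &'a V>>,
}

impl<'a, K: Eq + Hash, V: 'a> LeakyMemo<'a, K, V> {
    pub fn new() -> Self {
        LeakyMemo { cache: RefCell::new(HashMap::new()) }
    }

    /// The query body returns `V` by value; callers get `&'a V`.
    pub fn get_or_compute(&self, key: K, compute: impl FnOnce(&K) -> V) -> &'a V {
        // Copy the shared reference out so the `RefCell` borrow ends here.
        let cached = self.cache.borrow().get(&key).copied();
        if let Some(hit) = cached {
            return hit;
        }
        // Leak the value: it is never dropped, so the reference stays valid.
        let value: &'a V = Box::leak(Box::new(compute(&key)));
        self.cache.borrow_mut().insert(key, value);
        value
    }
}

/// Stand-in for a costly-to-build result type.
pub struct ExpensiveObject {
    pub id: i32,
}

/// Queries the same key twice; the second closure is never run.
pub fn demo_by_reference() -> (i32, bool) {
    let memo: LeakyMemo<'static, i32, ExpensiveObject> = LeakyMemo::new();
    let first = memo.get_or_compute(7, |x| ExpensiveObject { id: *x });
    let second = memo.get_or_compute(7, |_| ExpensiveObject { id: -1 });
    // Both calls return a reference to the same leaked object.
    (second.id, std::ptr::eq(first, second))
}
```

The obvious cost is that cached values are never reclaimed, which is probably acceptable only for a short-lived process such as a bindings generator run.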

Not sure what else.

PiperOrigin-RevId: 645301898
Change-Id: I2deacb94ddc512bc9aaa0f1e5ef48937dfb7a93c
4 files changed
tree: c68674ffab5306a50a18b8fafa20fb12301c4f89
README.md

Crubit: C++/Rust Bidirectional Interop Tool

NOTE: Crubit currently expects deep integration with the build system, and is difficult to deploy to environments dissimilar to Google's monorepo. We do not have our tooling set up to accept external contributions at this time.

Crubit is a bidirectional bindings generator for C++ and Rust, with the goal of integrating the C++ and Rust ecosystems.

Status

Support for calling FFI-friendly C++ from Rust is in progress.

Support for calling Rust from C++ will arrive in 2024H2.

Example

Consider the following C++ function:

extern "C" bool IsGreater(int lhs, int rhs);

This function, if present in a header file which is processed by Crubit, becomes callable from Rust as if it were defined as:

pub fn IsGreater(lhs: ffi::c_int, rhs: ffi::c_int) -> bool {...}

Note: There are some temporary restrictions on the API shape. For example, functions that are not extern "C", or that accept a type like std::string, can't be called from Rust directly via Crubit. These restrictions will be relaxed over time.

Getting Started

Here are some resources for getting started with Crubit:

  • Rust Bindings for C++ Libraries is a detailed walkthrough on how to use C++ from Rust using Crubit.

  • The examples/cpp/ directory has copy-pastable examples of calling C++ from Rust, together with snapshots of what the generated Rust interface looks like.

Building Crubit

$ apt install clang lld bazel
$ git clone git@github.com:google/crubit.git
$ cd crubit
$ bazel build --linkopt=-fuse-ld=/usr/bin/ld.lld //rs_bindings_from_cc:rs_bindings_from_cc_impl

Using a prebuilt LLVM tree

$ git clone https://github.com/llvm/llvm-project
$ cd llvm-project
$ CC=clang CXX=clang++ cmake -S llvm -B build -DLLVM_ENABLE_PROJECTS='clang' -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=install
$ cmake --build build -j
$ # wait...
$ cmake --install build
$ cd ../crubit
$ LLVM_INSTALL_PATH=../llvm-project/install bazel build //rs_bindings_from_cc:rs_bindings_from_cc_impl