commit | cf57d036c2e1b608ca902267fbbdfeb7ee5aa166 | |
---|---|---|
author | ron-stripe <ron@stripe.com> | Thu Aug 26 06:07:30 2021 -0700 |
committer | Copybara-Service <copybara-worker@google.com> | Thu Aug 26 06:09:10 2021 -0700 |
tree | 7f28af6e78cab0c0d906b8d41ba7ea8bcdb2257e | |
parent | 071ed6467b332ce9186e3b9c2315e217afd69f07 | |
Remote: Allow disk cache with remote exec

This allows usage of remote execution and a local disk cache. Also update the output when using disk and remote cache to indicate which is used.

The initial PR creating a combined disk and grpc cache mentioned issues with findMissingDigests for DiskAndRemoteCacheClient, but I noticed a couple of oddities and I'm not sure if I'm missing something. [https://cs.opensource.google/bazel/bazel/+/5f4d6995db1eb6a9d35dc163c0150283e830aa3d]

The original concern was that we should ignore the disk cache and only pay attention to the remote cache when using DiskAndRemoteCacheClient with remote exec. This makes sense, as bazel has to ensure the blobs are available remotely. However, the code in findMissingDigests returns the union of all missing digests (both disk and remote). If it short-circuited the check by only checking the disk cache and only returned the disk-cache result if non-empty, I would understand the concern. But the current code returns the union of the disk-cache result and the remote-cache result. Additionally, the current code for DiskCacheClient#findMissingBlobs unconditionally returns _all_ digests as missing. So in essence, the current code is always returning _all_ digests as missing. This seems like a bug due to optimizations. [I'm fixing this in the DiskAndRemoteCacheClient by only calling the remote for findMissing when doing remoteExec]

A test to resolve the concern would be to run a remote action (which would populate both disk and remote cache).
* verify it is in the disk cache.
* clear the remote cache of both action cache and blobs
* clear disk cache of blobs, but not action cache
* run the action again.

Another concern is that the current code won't upload to the remote caches (specifically the disk cache) when the remote_exec happens. The general assumption is that most remote_exec engines will populate the remote cache themselves, so currently there isn't a call to `remoteExecutionService.uploadOutputs` when doing remote exec. It is only there for local spawns. Since the remote disk cache with remote exec is currently disabled, this hasn't come up. The next time the action is attempted, it will be found in the remote_cache and pulled down. With the current PR, the disk_cache will get populated when the ActionResult is pulled from the remote_cache on a future run. I think that is okay.

Testing: sh_tests were added to go through scenarios of having the ActionCache prepopulated for disk and remote and not at all.

Closes #13852.

PiperOrigin-RevId: 393106615
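The findMissingDigests reasoning in the message above can be illustrated with a small model. This is a simplified Python sketch, not Bazel's actual Java code: the class and method names (`RemoteCache`, `DiskCache`, `DiskAndRemoteCache`, `find_missing`, `find_missing_union`) are hypothetical stand-ins, digests are plain strings, and the behaviour when remote exec is off is assumed to stay as the union described in the message.

```python
class RemoteCache:
    """Hypothetical stand-in for the remote (gRPC) cache."""

    def __init__(self, blobs=()):
        self.blobs = set(blobs)

    def find_missing(self, digests):
        # Report only the digests the remote side does not hold.
        return {d for d in digests if d not in self.blobs}


class DiskCache:
    """Hypothetical stand-in for the local disk cache."""

    def __init__(self, blobs=()):
        self.blobs = set(blobs)

    def find_missing(self, digests):
        # Models the behaviour described in the commit message:
        # every digest is unconditionally reported as missing.
        return set(digests)


class DiskAndRemoteCache:
    """Hypothetical combined client wrapping both caches."""

    def __init__(self, disk, remote):
        self.disk = disk
        self.remote = remote

    def find_missing_union(self, digests):
        # Behaviour before the change: union of both results, which
        # always equals "all digests" because of DiskCache.find_missing.
        return self.disk.find_missing(digests) | self.remote.find_missing(digests)

    def find_missing(self, digests, remote_exec):
        # Behaviour after the change, as described in the message:
        # with remote exec, only the remote cache is consulted.
        if remote_exec:
            return self.remote.find_missing(digests)
        # Assumption: without remote exec the union is kept.
        return self.find_missing_union(digests)


if __name__ == "__main__":
    cache = DiskAndRemoteCache(DiskCache({"a"}), RemoteCache({"a", "b"}))
    wanted = ["a", "b", "c"]
    print(sorted(cache.find_missing_union(wanted)))              # ['a', 'b', 'c']
    print(sorted(cache.find_missing(wanted, remote_exec=True)))  # ['c']
```

In this model the union path reports every digest as missing regardless of what either cache holds, while the remote-only path reports just the blob that the remote side lacks, which is what remote execution needs to know before uploading inputs.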
{Fast, Correct} - Choose two
Build and test software of any size, quickly and reliably.
Speed up your builds and tests: Bazel rebuilds only what is necessary. With advanced local and distributed caching, optimized dependency analysis and parallel execution, you get fast and incremental builds.
One tool, multiple languages: Build and test Java, C++, Android, iOS, Go, and a wide variety of other language platforms. Bazel runs on Windows, macOS, and Linux.
Scalable: Bazel helps you scale your organization, codebase, and continuous integration solution. It handles codebases of any size, in multiple repositories or a huge monorepo.
Extensible to your needs: Easily add support for new languages and platforms with Bazel's familiar extension language. Share and re-use language rules written by the growing Bazel community.
Follow our tutorials:
See CONTRIBUTING.md