# The Bazel Code Base

This document is a description of the code base and how Bazel is structured. It
is intended for people willing to contribute to Bazel, not for end-users.

## Introduction

The code base of Bazel is large (~350 KLOC production code and ~260 KLOC test
code) and no one is familiar with the whole landscape: everyone knows their
particular valley very well, but few know what lies over the hills in every
direction.

In order for people midway upon the journey not to find themselves within a
forest dark with the straightforward pathway being lost, this document tries to
give an overview of the code base so that it's easier to get started with
working on it.

The public version of the source code of Bazel lives on GitHub at
http://github.com/bazelbuild/bazel . This is not the “source of truth”; it’s
derived from a Google-internal source tree that contains additional
functionality that is not useful outside Google. The long term goal is to make
GitHub the source of truth.

Contributions are accepted through the regular GitHub pull request mechanism,
and manually imported by a Googler into the internal source tree, then
re-exported back out to GitHub.

## Client/server architecture

The bulk of Bazel resides in a server process that stays in RAM between builds.
This allows Bazel to maintain state between builds.

This is why the Bazel command line has two kinds of options: startup and
command. Consider a command line like this:

```
bazel --host_jvm_args=-Xmx8G build -c opt //foo:bar
```

Some options (`--host_jvm_args=`) come before the name of the command to be run
and some come after it (`-c opt`); the former kind is called a "startup option"
and affects the server process as a whole, whereas the latter kind, the "command
option", only affects a single command.
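The split between the two kinds of options can be illustrated with a small Python sketch. This is a toy model, not the logic of the C++ client: it assumes that every startup option starts with `-` and carries its value in the same token, and the function name `split_command_line` is made up for this example.

```python
def split_command_line(argv):
    """Split a Bazel-style argument list into its three parts.

    Returns (startup_options, command, command_arguments).

    Simplifying assumption: every startup option starts with "-" and carries
    its value in the same token (e.g. --flag=value), so the first token that
    does not start with "-" is taken to be the command name.
    """
    for index, arg in enumerate(argv):
        if not arg.startswith("-"):
            return list(argv[:index]), arg, list(argv[index + 1:])
    # No command name found: everything is a startup option.
    return list(argv), None, []


startup, command, rest = split_command_line(
    ["--host_jvm_args=-Xmx8G", "build", "-c", "opt", "//foo:bar"])
print("startup options:", startup)
print("command:", command)
print("command options and targets:", rest)
```

In this model, a change in the first list would require a different server process, while the command and the remaining arguments are what gets sent to the server for a single invocation.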

Each server instance has a single associated source tree ("workspace") and each
workspace usually has a single active server instance. This can be circumvented
by specifying a custom output base (see the "Directory layout" section for more
information).

Bazel is distributed as a single ELF executable that is also a valid .zip file.
When you type `bazel`, this ELF executable, which is implemented in C++ (the
"client"), gets control. It sets up an appropriate server process using the
following steps:

1. Checks whether it has already extracted itself. If not, it does that. This
   is where the implementation of the server comes from.
2. Checks whether there is an active server instance that works: it is running,
   it has the right startup options and uses the right workspace directory. It
   finds the running server by looking at the directory `$OUTPUT_BASE/server`
   where there is a lock file with the port the server is listening on.
3. If needed, kills the old server process.
4. If needed, starts up a new server process.

After a suitable server process is ready, the command that needs to be run is
communicated to it over a gRPC interface, then the output of Bazel is piped back
to the terminal. Only one command can be running at the same time. This is
implemented using an elaborate locking mechanism with parts in C++ and parts in
Java. There is some infrastructure for running multiple commands in parallel,
since the inability to run e.g. `bazel version` in parallel with another command
is somewhat embarrassing. The main blocker is the life cycle of `BlazeModule`s
and some state in `BlazeRuntime`.

At the end of a command, the Bazel server transmits the exit code the client
should return. An interesting wrinkle is the implementation of `bazel run`: the
job of this command is to run something Bazel just built, but it can't do that
from the server process because it doesn't have a terminal. So instead it tells
the client what binary it should `exec()` and with what arguments.

When one presses Ctrl-C, the client translates it to a Cancel call on the gRPC
connection, which tries to terminate the command as soon as possible. After the
third Ctrl-C, the client sends a SIGKILL to the server instead.

The source code of the client is under `src/main/cpp` and the protocol used to
communicate with the server is in `src/main/protobuf/command_server.proto`.

The main entry point of the server is `BlazeRuntime.main()` and the gRPC calls
from the client are handled by `GrpcServerImpl.run()`.

## Directory layout

Bazel creates a somewhat complicated set of directories during a build. A full
description is available
[here](https://docs.bazel.build/versions/master/output_directories.html).

The "workspace" is the source tree Bazel is run in. It usually corresponds to
something you checked out from source control.

Bazel puts all of its data under the "output user root". This is usually
`$HOME/.cache/bazel/_bazel_${USER}`, but can be overridden using the
`--output_user_root` startup option.

The "install base" is where Bazel is extracted to. This is done automatically
and each Bazel version gets a subdirectory based on its checksum under the
install base. It's at `$OUTPUT_USER_ROOT/install` by default and can be changed
using the `--install_base` command line option.

The "output base" is the place where the Bazel instance attached to a specific
workspace writes to. Each output base has at most one Bazel server instance
running at any time. It's usually at
`$OUTPUT_USER_ROOT/<checksum of the path to the workspace>`. It can be changed
using the `--output_base` startup option, which is, among other things, useful
for getting around the limitation that only one Bazel instance can be running
in any workspace at any given time.
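As a rough illustration of how the default output base depends only on the output user root and the workspace path, here is a minimal Python sketch. The text above only says "checksum"; the use of an MD5 hex digest here is an assumption made for illustration, and `default_output_base` is a hypothetical helper, not a function in the Bazel client.

```python
import hashlib
import posixpath


def default_output_base(output_user_root, workspace_path):
    """Derive a per-workspace directory under the output user root.

    Assumption: an MD5 hex digest stands in for the "checksum of the path to
    the workspace"; the real client may compute the checksum differently.
    """
    checksum = hashlib.md5(workspace_path.encode("utf-8")).hexdigest()
    return posixpath.join(output_user_root, checksum)


root = "/home/alice/.cache/bazel/_bazel_alice"
print(default_output_base(root, "/home/alice/src/project-a"))
print(default_output_base(root, "/home/alice/src/project-b"))
```

Two checkouts at different paths end up with different output bases, which is why each one can have its own server, and why passing a different `--output_base` lets a second server run against the same workspace.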

The output base contains, among other things:

* The fetched external repositories at `$OUTPUT_BASE/external`.
* The exec root, i.e. a directory that contains symlinks to all the source
  code for the current build. It's located at `$OUTPUT_BASE/execroot`. During
  the build, the working directory is
  `$EXECROOT/<name of main repository>`. We are planning to change this to
  `$EXECROOT`, although it's a long term plan because it's a very incompatible
  change.
* Files built during the build.

## The process of executing a command

Once the Bazel server gets control and is informed about a command it needs to
execute, the following sequence of events happens:

1. `BlazeCommandDispatcher` is informed about the new request. It decides
   whether the command needs a workspace to run in (almost every command except
   for ones that don't have anything to do with source code, e.g. `version` or
   `help`) and whether another command is running.

2. The right command is found. Each command must implement the interface
   `BlazeCommand` and must have the `@Command` annotation (this is a bit of an
   antipattern; it would be nice if all the metadata a command needs were
   described by methods on `BlazeCommand`).

3. The command line options are parsed. Each command has different command line
   options, which are described in the `@Command` annotation.

4. An event bus is created. The event bus is a stream for events that happen
   during the build. Some of these are exported to outside of Bazel under the
   aegis of the Build Event Protocol in order to tell the world how the build
   goes.

5. The command gets control. The most interesting commands are those that run a
   build: build, test, run, coverage and so on. This functionality is
   implemented by `BuildTool`.

6. The set of target patterns on the command line is parsed and wildcards like
   `//pkg:all` and `//pkg/...` are resolved. This is implemented in
   `AnalysisPhaseRunner.evaluateTargetPatterns()` and reified in Skyframe as
   `TargetPatternPhaseValue`.

7. The loading/analysis phase is run to produce the action graph (a directed
   acyclic graph of commands that need to be executed for the build).

8. The execution phase is run. This means running every action required to
   build the requested top-level targets.

## Command line options

The command line options for a Bazel invocation are described in an
`OptionsParsingResult` object, which in turn contains a map from "option
classes" to the values of the options. An "option class" is a subclass of
`OptionsBase` and groups command line options together that are related to each
other. For example:

1. Options related to a programming language (`CppOptions` or `JavaOptions`).
   These should be a subclass of `FragmentOptions` and are eventually wrapped
   into a `BuildOptions` object.
2. Options related to the way Bazel executes actions (`ExecutionOptions`).

These options are designed to be consumed in the analysis phase (either
through `RuleContext.getFragment()` in Java or `ctx.fragments` in Starlark).
Some of them (for example, whether to do C++ include scanning or not) are read
in the execution phase, but that always requires explicit plumbing since
`BuildConfiguration` is not available then. For more information, see the
section “Configurations”.

**WARNING:** We like to pretend that `OptionsBase` instances are immutable and
use them that way (e.g. as part of `SkyKey`s). This is not the case and
modifying them is a really good way to break Bazel in subtle ways that are hard
to debug. Unfortunately, making them actually immutable is a large endeavor.
(Modifying a `FragmentOptions` immediately after construction before anyone else
gets a chance to keep a reference to it and before `equals()` or `hashCode()` is
called on it is okay.)

Bazel learns about option classes in the following ways:

1. Some are hard-wired into Bazel (`CommonCommandOptions`).
2. From the `@Command` annotation on each Bazel command.
3. From `ConfiguredRuleClassProvider` (these are command line options related
   to individual programming languages).
4. Starlark rules can also define their own options (see
   [here](https://docs.bazel.build/versions/master/skylark/config.html)).

Each option (excluding Starlark-defined options) is a member variable of a
`FragmentOptions` subclass that has the `@Option` annotation, which specifies
the name and the type of the command line option along with some help text.

The Java type of the value of a command line option is usually something simple
(a string, an integer, a Boolean, a label, etc.). However, we also support
options of more complicated types; in this case, the job of converting from the
command line string to the data type falls to an implementation of
`com.google.devtools.common.options.Converter`.
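The idea of a converter, i.e. a function from the raw command line string to a typed value, can be sketched in a few lines of Python. This is only a toy model of the concept: the table `OPTION_CONVERTERS`, the helper names and the three options in it are chosen for illustration and do not reflect how Bazel's Java `Converter` interface or `@Option` annotation are actually declared.

```python
def convert_boolean(text):
    """Convert a command line string to a bool, rejecting anything unclear."""
    lowered = text.lower()
    if lowered in ("1", "true", "yes"):
        return True
    if lowered in ("0", "false", "no"):
        return False
    raise ValueError("not a boolean: %r" % text)


def convert_assignment(text):
    """Convert 'name=value' to a (name, value) pair: a 'complicated' type."""
    name, separator, value = text.partition("=")
    if not separator or not name:
        raise ValueError("expected name=value, got %r" % text)
    return (name, value)


# Hypothetical option table: option name -> converter for its value.
OPTION_CONVERTERS = {
    "jobs": int,
    "keep_going": convert_boolean,
    "define": convert_assignment,
}


def parse_option(arg):
    """Parse a single '--name=value' token into (name, typed value)."""
    if not arg.startswith("--"):
        raise ValueError("not an option: %r" % arg)
    name, separator, raw_value = arg[2:].partition("=")
    if not separator:
        raise ValueError("expected --name=value, got %r" % arg)
    return name, OPTION_CONVERTERS[name](raw_value)


for token in ["--jobs=8", "--keep_going=true", "--define=mode=fast"]:
    print(parse_option(token))
```

Keeping the conversion separate from the option declaration means the same converter can be reused by every option that shares a value type.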

## The source tree, as seen by Bazel

Bazel is in the business of building software, which happens by reading and
interpreting the source code. The totality of the source code Bazel operates on
is called "the workspace" and it is structured into repositories, packages and
rules. A description of these concepts for the users of Bazel is available
[here](https://docs.bazel.build/versions/master/build-ref.html).

### Repositories

A "repository" is a source tree on which a developer works; it usually
represents a single project. Bazel's ancestor, Blaze, operated on a monorepo,
i.e. a single source tree that contains all source code used to run the build.
Bazel, in contrast, supports projects whose source code spans multiple
repositories. The repository from which Bazel is invoked is called the “main
repository”; the others are called “external repositories”.

A repository is marked by a file called `WORKSPACE` (or `WORKSPACE.bazel`) in
its root directory. This file contains information that is "global" to the whole
build, for example, the set of available external repositories. It works like a
regular Starlark file, which means that one can `load()` other Starlark files.
This is commonly used to pull in repositories that are needed by a repository
that's explicitly referenced (we call this the "`deps.bzl` pattern").

Code of external repositories is symlinked or downloaded under
`$OUTPUT_BASE/external`.

When running the build, the whole source tree needs to be pieced together; this
is done by `SymlinkForest`, which symlinks every package in the main repository
to `$EXECROOT` and every external repository to either `$EXECROOT/external` or
`$EXECROOT/..` (the former of course makes it impossible to have a package
called `external` in the main repository; that's why we are migrating away from
it).

### Packages

Every repository is composed of packages, i.e. a collection of related files and
a specification of the dependencies. These are specified by a file called
`BUILD` or `BUILD.bazel`. If both exist, Bazel prefers `BUILD.bazel`; the reason
why `BUILD` files are still accepted is that Bazel’s ancestor, Blaze, used this
file name. However, it turned out to be a commonly used path segment, especially
on Windows, where file names are case-insensitive.

Packages are independent of each other: changes to the BUILD file of a package
cannot cause other packages to change. The addition or removal of BUILD files
_can_ change other packages, since recursive globs stop at package boundaries
and thus the presence of a BUILD file stops the recursion.

The evaluation of a BUILD file is called "package loading". It's implemented in
the class `PackageFactory`, works by calling the Starlark interpreter and
requires knowledge of the set of available rule classes. The result of package
loading is a `Package` object. It's mostly a map from a string (the name of a
target) to the target itself.

A large chunk of complexity during package loading is globbing: Bazel does not
require every source file to be explicitly listed and instead can run globs
(e.g. `glob(["**/*.java"])`). Unlike the shell, it supports recursive globs that
descend into subdirectories (but not into subpackages). This requires access to
the file system and since that can be slow, we implement all sorts of tricks to
make it run in parallel and as efficiently as possible.
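The rule that recursive globs descend into subdirectories but not into subpackages can be made concrete with a small Python model. It works on an in-memory set of paths instead of the real file system, only supports a "files ending in this suffix, at any depth" pattern, and the names `recursive_glob` and `is_package` are invented for this sketch; it is not how `LegacyGlobber` or its Skyframe counterpart is written.

```python
import posixpath

BUILD_FILE_NAMES = ("BUILD", "BUILD.bazel")


def is_package(files, directory):
    """A directory is a package if it contains a BUILD or BUILD.bazel file."""
    return any(posixpath.join(directory, name) in files
               for name in BUILD_FILE_NAMES)


def recursive_glob(files, package, suffix):
    """Model of glob(["**/*" + suffix]) evaluated in `package`.

    `files` is a collection of workspace-relative paths and `package` is a
    non-empty package path. Files that live in a subpackage are skipped.
    """
    files = set(files)
    prefix = package + "/"
    matches = []
    for path in sorted(files):
        if not path.startswith(prefix) or not path.endswith(suffix):
            continue
        relative = path[len(prefix):]
        # Walk from the file's directory up to (but excluding) the package
        # itself; any BUILD file on the way marks a package boundary.
        directory = posixpath.dirname(relative)
        crosses_boundary = False
        while directory:
            if is_package(files, posixpath.join(package, directory)):
                crosses_boundary = True
                break
            directory = posixpath.dirname(directory)
        if not crosses_boundary:
            matches.append(relative)
    return matches


tree = {"foo/BUILD", "foo/A.java", "foo/sub/B.java", "foo/sub/deep/D.java",
        "foo/pkg/BUILD", "foo/pkg/C.java"}
print(recursive_glob(tree, "foo", ".java"))
# Adding a BUILD file to foo/sub turns it into a package of its own, which
# changes what the glob in package foo sees.
print(recursive_glob(tree | {"foo/sub/BUILD"}, "foo", ".java"))
```

The second call shows the effect described earlier: adding a BUILD file changes the contents of a different package even though that package's own BUILD file is untouched.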

Globbing is implemented in the following classes:

* `LegacyGlobber`, a fast and blissfully Skyframe-unaware globber
* `SkyframeHybridGlobber`, a version that uses Skyframe and reverts back to
  the legacy globber in order to avoid “Skyframe restarts” (described below)

The `Package` class itself contains some members that are exclusively used to
parse the WORKSPACE file and which do not make sense for real packages. This is
a design flaw because objects describing regular packages should not contain
fields that describe something else. These include:

* The repository mappings
* The registered toolchains
* The registered execution platforms

Ideally, there would be more separation between parsing the WORKSPACE file and
parsing regular packages so that `Package` does not need to cater for the needs
of both. This is unfortunately difficult to do because the two are intertwined
quite deeply.

### Labels, Targets and Rules

Packages are composed of targets, which have the following types:

1. **Files:** things that are either the input or the output of the build. In
   Bazel parlance, we call them _artifacts_ (discussed elsewhere). Not all
   files created during the build are targets; it’s common for an output of
   Bazel not to have an associated label.
2. **Rules:** these describe steps to derive their outputs from their inputs.
   They are generally associated with a programming language (e.g.
   `cc_library`, `java_library` or `py_library`), but there are some
   language-agnostic ones (e.g. `genrule` or `filegroup`).
3. **Package groups:** discussed in the [Visibility](#visibility) section.

The name of a target is called a _Label_. The syntax of labels is
`@repo//pac/kage:name`, where `repo` is the name of the repository the Label is
in, `pac/kage` is the directory its BUILD file is in and `name` is the path of
the file (if the label refers to a source file) relative to the directory of the
package. When referring to a target on the command line, some parts of the label
can be omitted:

1. If the repository is omitted, the label is taken to be in the main
   repository.
2. If the package part is omitted (e.g. `name` or `:name`), the label is taken
   to be in the package of the current working directory. Relative paths
   containing uplevel references (`..`) are not allowed.
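The label syntax and the two omission rules above can be summarized in a short Python sketch. It is a simplified model, not Bazel's `Label` class: the empty string stands for the main repository, `current_package` stands for the package of the current working directory, and no validation of individual characters is attempted.

```python
import posixpath
from typing import NamedTuple


class Label(NamedTuple):
    repo: str      # "" stands for the main repository in this sketch
    package: str
    name: str


def parse_label(text, current_package=""):
    """Parse '@repo//pac/kage:name', allowing the shortened forms."""
    repo = ""
    if text.startswith("@"):
        repo, separator, rest = text[1:].partition("//")
        if not separator:
            raise ValueError("expected '//' after the repository name")
        text = "//" + rest
    if text.startswith("//"):
        package, separator, name = text[2:].partition(":")
        if not separator:
            # '//foo/bar' is the usual shorthand for '//foo/bar:bar'.
            name = posixpath.basename(package)
    else:
        # Package part omitted: resolve against the current package.
        package = current_package
        name = text[1:] if text.startswith(":") else text
    if ".." in name.split("/"):
        raise ValueError("uplevel references are not allowed: %r" % name)
    return Label(repo, package, name)


print(parse_label("@repo//pac/kage:name"))
print(parse_label("//pac/kage:name"))
print(parse_label(":name", current_package="pac/kage"))
```

All three calls describe a target called `name` in package `pac/kage`; only the first one names a repository other than the main one.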

A kind of a rule (e.g. "C++ library") is called a "rule class". Rule classes may
be implemented either in Starlark (the `rule()` function) or in Java (so called
“native rules”, type `RuleClass`). In the long term, every language-specific
rule will be implemented in Starlark, but some legacy rule families (e.g. Java
or C++) are still in Java for the time being.

Starlark rule classes need to be imported at the beginning of BUILD files using
the `load()` statement, whereas Java rule classes are "innately" known by Bazel,
by virtue of being registered with the `ConfiguredRuleClassProvider`.

A rule class contains information such as:

1. Its attributes (e.g., `srcs`, `deps`): their types, default values,
   constraints, etc.
2. The configuration transitions and aspects attached to each attribute, if any
3. The implementation of the rule
4. The transitive info providers the rule "usually" creates

**Terminology note:** In the code base, we often use “Rule” to mean the target
created by a rule class. But in Starlark and in user-facing documentation,
“Rule” should be used exclusively to refer to the rule class itself; the target
is just a “target”. Also note that despite `RuleClass` having “class” in its
name, there is no Java inheritance relationship between a rule class and targets
of that type.

## Skyframe

The evaluation framework underlying Bazel is called Skyframe. Its model is that
everything that needs to be built during a build is organized into a directed
acyclic graph with edges pointing from any piece of data to its dependencies,
that is, other pieces of data that need to be known to construct it.

The nodes in the graph are called `SkyValue`s and their names are called
`SkyKey`s. Both are deeply immutable, i.e. only immutable objects should be
reachable from them. This invariant almost always holds, and in case it doesn't
(e.g. for the individual options classes `BuildOptions`, which is a member of
`BuildConfigurationValue` and its `SkyKey`) we try really hard not to change
them or to change them only in ways that are not observable from the outside.
From this it follows that everything that is computed within Skyframe (e.g.
configured targets) must also be immutable.

The most convenient way to observe the Skyframe graph is to run `bazel dump
--skyframe=detailed`, which dumps the graph, one `SkyValue` per line. It's best
to do it for tiny builds, since it can get pretty large.

Skyframe lives in the `com.google.devtools.build.skyframe` package. The
similarly-named package `com.google.devtools.build.lib.skyframe` contains the
implementation of Bazel on top of Skyframe. More information about Skyframe is
available [here](https://bazel.build/designs/skyframe.html).

Generating a new `SkyValue` involves the following steps:

1. Running the associated `SkyFunction`
2. Declaring the dependencies (i.e. `SkyValue`s) that the `SkyFunction` needs
   to do its job. This is done by calling the various overloads of
   `SkyFunction.Environment.getValue()`.
3. If a dependency is not available, Skyframe signals that by returning null
   from `getValue()`. In this case, the `SkyFunction` is expected to yield
   control to Skyframe by returning null, then Skyframe evaluates the
   dependencies that haven't been evaluated yet and calls the `SkyFunction`
   again, thus going back to (1).
4. Constructing the resulting `SkyValue`

A consequence of this is that if not all dependencies are available in (3), the
function needs to be completely restarted and thus computation needs to be
re-done. This is obviously inefficient. We work around this in a number of ways:

1. Declaring dependencies of `SkyFunction`s in groups so that if a function
   has, say, 10 dependencies, it only needs to restart once instead of ten
   times.
2. Splitting `SkyFunction`s so that one function does not need to be restarted
   many times. This has the side effect of interning data into Skyframe that
   may be internal to the `SkyFunction`, thus increasing memory use.
3. Using caches "behind the back of Skyframe" to keep state (e.g. the state of
   actions being executed in `ActionExecutionFunction.stateMap`). In the
   extreme, this ends up resulting in writing code in continuation-passing
   style (e.g. action execution), which does not help readability.

Of course, these are all just workarounds for the limitations of Skyframe, which
are mostly a consequence of the fact that Java doesn't support lightweight
threads and that we routinely have hundreds of thousands of in-flight Skyframe
nodes.
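The restart protocol and the benefit of requesting dependencies in groups can be demonstrated with a toy evaluator in Python. Everything here is a simplification made for illustration: the evaluator is single-threaded and recursive, keys are `(kind, argument)` tuples, `None` plays the role of Java's null, and none of the names below correspond to real Skyframe classes.

```python
class Environment:
    """Hands out already-computed values and records the ones still missing."""

    def __init__(self, graph):
        self._graph = graph
        self.missing = []

    def get_value(self, key):
        if key in self._graph:
            return self._graph[key]
        self.missing.append(key)
        return None  # signals "not available yet, please restart later"


def evaluate(key, functions, graph, runs):
    """Evaluate `key`, restarting its function until all deps are present."""
    while key not in graph:
        env = Environment(graph)
        runs[key] = runs.get(key, 0) + 1
        value = functions[key[0]](key, env)
        if value is not None:
            graph[key] = value
        elif not env.missing:
            raise RuntimeError("function returned None without missing deps")
        else:
            for dep in env.missing:
                evaluate(dep, functions, graph, runs)
    return graph[key]


LEAF_VALUES = {"a": 1, "b": 2, "c": 3}


def leaf(key, env):
    return LEAF_VALUES[key[1]]


def sum_one_by_one(key, env):
    # Gives up at the first missing dependency: one restart per dependency.
    total = 0
    for name in key[1]:
        value = env.get_value(("leaf", name))
        if value is None:
            return None
        total += value
    return total


def sum_grouped(key, env):
    # Requests every dependency before checking: at most one restart.
    values = [env.get_value(("leaf", name)) for name in key[1]]
    if any(value is None for value in values):
        return None
    return sum(values)


FUNCTIONS = {"leaf": leaf, "one_by_one": sum_one_by_one, "grouped": sum_grouped}

for kind in ("one_by_one", "grouped"):
    key = (kind, ("a", "b", "c"))
    runs = {}
    result = evaluate(key, FUNCTIONS, {}, runs)
    print(kind, "result:", result, "function runs:", runs[key])
```

With three dependencies and an empty graph, the one-by-one variant is run four times and the grouped variant twice, which is the effect that workaround (1) is after.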

## Starlark

Starlark is the domain-specific language people use to configure and extend
Bazel. It's conceived as a restricted subset of Python that has far fewer types,
more restrictions on control flow, and most importantly, strong immutability
guarantees to enable concurrent reads. It is not Turing-complete, which
discourages some (but not all) users from trying to accomplish general
programming tasks within the language.

Starlark is implemented in the `com.google.devtools.build.lib.syntax` package.
It also has an independent Go implementation
[here](https://github.com/google/starlark-go). The Java implementation used in
Bazel is currently an interpreter.

Starlark is used in four contexts:

1. **The BUILD language.** This is where new targets are declared. Starlark code
   running in this context only has access to the contents of the BUILD file
   itself and Starlark files loaded by it.
2. **Rule definitions.** This is how new rules (e.g. support for a new
   language) are defined. Starlark code running in this context has access to
   the configuration and data provided by its direct dependencies (more on this
   later).
3. **The WORKSPACE file.** This is where external repositories (code that's not
   in the main source tree) are defined.
4. **Repository rule definitions.** This is where new external repository types
   are defined. Starlark code running in this context can run arbitrary code on
   the machine where Bazel is running, and reach outside the workspace.

The dialects available for BUILD and .bzl files are slightly different because
they express different things. A list of differences is available
[here](https://docs.bazel.build/versions/master/skylark/language.html#differences-between-build-and-bzl-files).

More information about Starlark is available
[here](https://docs.bazel.build/versions/master/skylark/language.html).

## The loading/analysis phase

The loading/analysis phase is where Bazel determines what actions are needed to
build a particular rule. Its basic unit is a "configured target", which is,
quite sensibly, a (target, configuration) pair.

It's called the "loading/analysis phase" because it can be split into two
distinct parts, which used to be serialized, but they can now overlap in time:

1. Loading packages, that is, turning BUILD files into the `Package` objects
   that represent them
2. Analyzing configured targets, that is, running the implementation of the
   rules to produce the action graph

Each configured target in the transitive closure of the configured targets
requested on the command line must be analyzed bottom-up, i.e. leaf nodes first,
then up to the ones on the command line. The inputs to the analysis of a single
configured target are:

1. **The configuration.** This is "how" to build that rule: for example, the
   target platform, but also things like command line options the user wants to
   be passed to the C++ compiler.
2. **The direct dependencies.** Their transitive info providers are available
   to the rule being analyzed. They are called that because they provide a
   "roll-up" of the information in the transitive closure of the configured
   target, e.g. all the .jar files on the classpath or all the .o files that
   need to be linked into a C++ binary.
3. **The target itself.** This is the result of loading the package the target
   is in. For rules, this includes its attributes, which is usually what
   matters.
4. **The implementation of the configured target.** For rules, this can either
   be in Starlark or in Java. All non-rule configured targets are implemented
   in Java.

The output of analyzing a configured target is:

1. The transitive info providers that configured targets that depend on it can
   access
2. The artifacts it can create and the actions that produce them

The API offered to Java rules is `RuleContext`, which is the equivalent of the
`ctx` argument of Starlark rules. Its API is more powerful, but at the same
time, it's easier to do Bad Things™, for example to write code whose time or
space complexity is quadratic (or worse), to make the Bazel server crash with a
Java exception or to violate invariants (e.g. by inadvertently modifying an
`Options` instance or by making a configured target mutable).

The algorithm that determines the direct dependencies of a configured target
lives in `DependencyResolver.dependentNodeMap()`.

### Configurations

Configurations are the "how" of building a target: for what platform, with what
command line options, etc.

The same target can be built for multiple configurations in the same build. This
is useful, for example, when the same code is used both for a tool that's run
during the build and for the target code while we are cross-compiling, or when
we are building a fat Android app (one that contains native code for multiple
CPU architectures).

Conceptually, the configuration is a `BuildOptions` instance. However, in
practice, `BuildOptions` is wrapped by `BuildConfiguration` that provides
additional sundry pieces of functionality. It propagates from the top of the
dependency graph to the bottom. If it changes, the build needs to be
re-analyzed.

This results in anomalies like having to re-analyze the whole build if e.g. the
number of requested test runs changes, even though that only affects test
targets (we have plans to "trim" configurations so that this is not the case,
but it's not ready yet).

When a rule implementation needs part of the configuration, it needs to declare
it in its definition using
`RuleClass.Builder.requiresConfigurationFragments()`. This is both to avoid
mistakes (e.g. Python rules using the Java fragment) and to facilitate
configuration trimming so that e.g. if Python options change, C++ targets don't
need to be re-analyzed.

The configuration of a rule is not necessarily the same as that of its "parent"
rule. The process of changing the configuration in a dependency edge is called a
"configuration transition". It can happen in two places:

1. On a dependency edge. These transitions are specified in
   `Attribute.Builder.cfg()` and are functions from a `Rule` (where the
   transition happens) and a `BuildOptions` (the original configuration) to one
   or more `BuildOptions` (the output configuration).
2. On any incoming edge to a configured target. These are specified in
   `RuleClass.Builder.cfg()`.

The relevant classes are `TransitionFactory` and `ConfigurationTransition`.

Configuration transitions are used, for example:

1. To declare that a particular dependency is used during the build and it
   should thus be built in the execution architecture
2. To declare that a particular dependency must be built for multiple
   architectures (e.g. for native code in fat Android APKs)

If a configuration transition results in multiple configurations, it's called a
_split transition._
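Viewed as plain functions, the two example uses above look roughly like the following Python sketch. It models a configuration as a dictionary and a transition as a function from one configuration to a list of configurations; the `Rule` argument mentioned above is left out for brevity, and the keys `cpu`, `host_cpu` and `fat_apk_cpus` are illustrative names rather than a description of the fields of `BuildOptions`.

```python
def exec_transition(options):
    """Build the dependency for the machine the build itself runs on."""
    # Return a modified copy: the incoming configuration is never mutated.
    return [dict(options, cpu=options["host_cpu"])]


def fat_apk_transition(options):
    """Build the dependency once per requested CPU architecture."""
    return [dict(options, cpu=cpu) for cpu in options["fat_apk_cpus"]]


def is_split(transition, options):
    """A transition that yields more than one configuration is a split."""
    return len(transition(options)) > 1


top_level = {
    "cpu": "k8",
    "host_cpu": "darwin",
    "fat_apk_cpus": ["armeabi-v7a", "arm64-v8a"],
}

print([o["cpu"] for o in exec_transition(top_level)])
print([o["cpu"] for o in fat_apk_transition(top_level)])
print("split:", is_split(fat_apk_transition, top_level))
print("original cpu is unchanged:", top_level["cpu"])
```

Returning fresh copies instead of editing the input mirrors the earlier warning about treating options objects as immutable.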

Configuration transitions can also be implemented in Starlark (documentation
[here](https://docs.bazel.build/versions/master/skylark/config.html)).

### Transitive info providers

Transitive info providers are a way (and the _only_ way) for configured targets
to tell things to the configured targets that depend on them. The reason why
"transitive" is in their name is that this is usually some sort of roll-up of
the transitive closure of a configured target.

There is generally a 1:1 correspondence between Java transitive info providers
and Starlark ones (the exception is `DefaultInfo` which is an amalgamation of
`FileProvider`, `FilesToRunProvider` and `RunfilesProvider` because that API was
deemed to be more Starlark-ish than a direct transliteration of the Java one).
Their key is one of the following things:

1. A Java Class object. This is only available for providers that are not
   accessible from Starlark. These providers are subclasses of
   `TransitiveInfoProvider`.
2. A string. This is legacy and heavily discouraged since it's susceptible to
   name clashes. Such transitive info providers are direct subclasses of
   `build.lib.packages.Info`.
3. A provider symbol. This can be created from Starlark using the `provider()`
   function and is the recommended way to create new providers. The symbol is
   represented by a `Provider.Key` instance in Java.

New providers implemented in Java should be implemented using `BuiltinProvider`.
`NativeProvider` is deprecated (we haven't had time to remove it yet) and
`TransitiveInfoProvider` subclasses cannot be accessed from Starlark.

### Configured targets

Configured targets are implemented as `RuleConfiguredTargetFactory`. There is a
subclass for each rule class implemented in Java. Starlark configured targets
are created through `SkylarkRuleConfiguredTargetUtil.buildRule()`.

Configured target factories should use `RuleConfiguredTargetBuilder` to
construct their return value. It consists of the following things:

1. Their `filesToBuild`, i.e. the hazy concept of "the set of files this rule
   represents". These are the files that get built when the configured target
   is on the command line or in the `srcs` of a genrule.
2. Their runfiles, regular and data.
3. Their output groups. These are various "other sets of files" the rule can
   build. They can be accessed using the `output_group` attribute of the
   `filegroup` rule in BUILD and using the `OutputGroupInfo` provider in Java.

584### Runfiles
585
Some binaries need data files to run. A prominent example is tests that need
input files. This is represented in Bazel by the concept of "runfiles". A
"runfiles tree" is a directory tree of the data files for a particular binary.
It is created in the file system as a symlink tree with individual symlinks
pointing to the files in the source or output trees.
591
592A set of runfiles is represented as a `Runfiles` instance. It is conceptually a
593map from the path of a file in the runfiles tree to the `Artifact` instance that
594represents it. It's a little more complicated than a single `Map` for two
595reasons:
596
597* Most of the time, the runfiles path of a file is the same as its execpath.
598 We use this to save some RAM.
599* There are various legacy kinds of entries in runfiles trees, which also need
600 to be represented.
601
Runfiles are collected using `RunfilesProvider`: an instance of this class
represents the runfiles that a configured target (e.g. a library) and its
transitive closure need. They are gathered like a nested set (in fact, they are
implemented using nested sets under the hood): each target unions the runfiles
of its dependencies, adds some of its own, then sends the resulting set upwards
in the dependency graph. A `RunfilesProvider` instance contains two `Runfiles`
instances, one for when the rule is depended on through the "data" attribute and
one for every other kind of incoming dependency. This is because a target
sometimes presents different runfiles when depended on through a data attribute
than otherwise. This is undesired legacy behavior that we haven't gotten around
to removing yet.
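
The collection scheme can be modeled in a few lines. This is a minimal Python
sketch, not Bazel's Java implementation: `RunfilesProviderModel`, `collect` and
the `data_only` parameter are hypothetical names, and plain dicts stand in for
the nested sets.

```python
class RunfilesProviderModel:
    """Hypothetical stand-in for RunfilesProvider: holds two runfiles maps."""

    def __init__(self, default_runfiles, data_runfiles):
        self.default_runfiles = default_runfiles  # runfiles path -> artifact
        self.data_runfiles = data_runfiles        # seen through "data" edges


def collect(own, deps=(), data_deps=(), data_only=None):
    """Unions the runfiles of the dependencies, then adds the target's own."""
    merged = {}
    for dep in deps:
        merged.update(dep.default_runfiles)
    for dep in data_deps:
        merged.update(dep.data_runfiles)
    merged.update(own)
    data = dict(merged)
    data.update(data_only or {})  # extra files shown only to "data" dependents
    return RunfilesProviderModel(merged, data)


lib = collect({"pkg/lib.so": "bazel-out/bin/pkg/lib.so"})
tool = collect({"tools/gen": "bazel-out/bin/tools/gen"},
               data_only={"tools/gen.cfg": "tools/gen.cfg"})
test = collect({"pkg/test": "bazel-out/bin/pkg/test"},
               deps=[lib], data_deps=[tool])
print(sorted(test.default_runfiles))
```

The test target sees `tools/gen.cfg` only because it reaches the tool through a
data edge; a regular dependent of the tool would not.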
613
614Runfiles of binaries are represented as an instance of `RunfilesSupport`. This
615is different from `Runfiles` because `RunfilesSupport` has the capability of
616actually being built (unlike `Runfiles`, which is just a mapping). This
617necessitates the following additional components:
618
619* **The input runfiles manifest.** This is a serialized description of the
620 runfiles tree. It is used as a proxy for the contents of the runfiles tree
621 and Bazel assumes that the runfiles tree changes if and only if the contents
622 of the manifest change.
623* **The output runfiles manifest.** This is used by runtime libraries that
624 handle runfiles trees, notably on Windows, which sometimes doesn't support
625 symbolic links.
* **The runfiles middleman.** In order for a runfiles tree to exist, one needs
  to build the symlink tree and the artifacts the symlinks point to. In order
  to decrease the number of dependency edges, the runfiles middleman can be
  used to represent all of these.
630* **Command line arguments** for running the binary whose runfiles the
631 `RunfilesSupport` object represents.
632
633### Aspects
634
635Aspects are a way to "propagate computation down the dependency graph". They are
636described for users of Bazel
637[here](https://docs.bazel.build/versions/master/skylark/aspects.html). A good
638motivating example is protocol buffers: a `proto_library` rule should not know
639about any particular language, but building the implementation of a protocol
640buffer message (the “basic unit” of protocol buffers) in any programming
641language should be coupled to the `proto_library` rule so that if two targets in
642the same language depend on the same protocol buffer, it gets built only once.
643
644Just like configured targets, they are represented in Skyframe as a `SkyValue`
645and the way they are constructed is very similar to how configured targets are
646built: they have a factory class called `ConfiguredAspectFactory` that has
647access to a `RuleContext`, but unlike configured target factories, it also knows
648about the configured target it is attached to and its providers.
649
650The set of aspects propagated down the dependency graph is specified for each
651attribute using the `Attribute.Builder.aspects()` function. There are a few
652confusingly-named classes that participate in the process:
653
6541. `AspectClass` is the implementation of the aspect. It can be either in Java
655 (in which case it's a subclass) or in Starlark (in which case it's an
656 instance of `SkylarkAspectClass`). It's analogous to
657 `RuleConfiguredTargetFactory`.
6582. `AspectDefinition` is the definition of the aspect; it includes the
659 providers it requires, the providers it provides and contains a reference to
660 its implementation, i.e. the appropriate `AspectClass` instance. It's
661 analogous to `RuleClass`.
6623. `AspectParameters` is a way to parametrize an aspect that is propagated down
663 the dependency graph. It's currently a string to string map. A good example
664 of why it's useful is protocol buffers: if a language has multiple APIs, the
665 information as to which API the protocol buffers should be built for should
666 be propagated down the dependency graph.
6674. `Aspect` represents all the data that's needed to compute an aspect that
668 propagates down the dependency graph. It consists of the aspect class, its
669 definition and its parameters.
6705. `RuleAspect` is the function that determines which aspects a particular rule
671 should propagate. It's a `Rule` -> `Aspect` function.
672
673A somewhat unexpected complication is that aspects can attach to other aspects;
674for example, an aspect collecting the classpath for a Java IDE will probably
675want to know about all the .jar files on the classpath, but some of them are
676protocol buffers. In that case, the IDE aspect will want to attach to the
677(`proto_library` rule + Java proto aspect) pair.
678
679The complexity of aspects on aspects is captured in the class
680`AspectCollection`.
681
682### Platforms and toolchains
683
Bazel supports multi-platform builds, that is, builds where there may be
multiple architectures where build actions run and multiple architectures for
which code is built. These architectures are referred to as _platforms_ in Bazel
parlance (full documentation
[here](https://docs.bazel.build/versions/master/platforms.html)).
689
690A platform is described by a key-value mapping from _constraint settings_ (e.g.
691the concept of "CPU architecture") to _constraint values_ (e.g. a particular CPU
692like x86\_64). We have a "dictionary" of the most commonly used constraint
693settings and values in the `@platforms` repository.
694
The concept of _toolchain_ comes from the fact that depending on what platforms
the build is running on and what platforms are targeted, one may need to use
different compilers; for example, a particular C++ toolchain may run on a
specific OS and be able to target some other OSes. Bazel must determine the C++
compiler that is used based on the selected execution and target platforms
(documentation for toolchains
[here](https://docs.bazel.build/versions/master/toolchains.html)).
702
In order to do this, toolchains are annotated with the set of execution and
target platform constraints they support. To make this possible, the definition
of a toolchain is split into two parts:

1. A `toolchain()` rule that describes the set of execution and target
   constraints a toolchain supports and tells what kind (e.g. C++ or Java) of
   toolchain it is (the latter is represented by the `toolchain_type()` rule).
2. A language-specific rule that describes the actual toolchain (e.g.
   `cc_toolchain()`).
712
713This is done in this way because we need to know the constraints for every
714toolchain in order to do toolchain resolution and language-specific
715`*_toolchain()` rules contain much more information than that, so they take more
716time to load.
717
718Execution platforms are specified in one of the following ways:
719
1. In the WORKSPACE file using the `register_execution_platforms()` function
2. On the command line using the `--extra_execution_platforms` command line
   option
723
724The set of available execution platforms is computed in
725`RegisteredExecutionPlatformsFunction` .
726
727The target platform for a configured target is determined by
728`PlatformOptions.computeTargetPlatform()` . It's a list of platforms because we
729eventually want to support multiple target platforms, but it's not implemented
730yet.
731
732The set of toolchains to be used for a configured target is determined by
733`ToolchainResolutionFunction`. It is a function of:
734
* The set of registered toolchains (in the WORKSPACE file and the
  configuration)
* The desired execution and target platforms (in the configuration)
* The set of toolchain types that are required by the configured target (in
  `UnloadedToolchainContextKey`)
* The set of execution platform constraints of the configured target (the
  `exec_compatible_with` attribute) and the configuration
  (`--experimental_add_exec_constraints_to_targets`), in
  `UnloadedToolchainContextKey`
744
745Its result is an `UnloadedToolchainContext`, which is essentially a map from
746toolchain type (represented as a `ToolchainTypeInfo` instance) to the label of
747the selected toolchain. It's called "unloaded" because it does not contain the
748toolchains themselves, only their labels.
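
The shape of this computation can be sketched as follows. This is a simplified
Python model, not the logic of `ToolchainResolutionFunction`: the dict layout,
`matches` and `resolve_toolchains` are hypothetical, and we assume that the
first execution platform for which every required toolchain type resolves wins,
with registration order as the priority among toolchains.

```python
def matches(constraints, platform):
    """A platform matches if it has every required setting -> value pair."""
    return all(platform.get(s) == v for s, v in constraints.items())


def resolve_toolchains(required_types, toolchains, exec_platforms, target):
    """Returns (execution platform, {toolchain type: label}) or None."""
    for exec_name, exec_platform in exec_platforms:
        selected = {}
        for toolchain_type in required_types:
            for tc in toolchains:  # assumed: registration order is priority
                if (tc["type"] == toolchain_type
                        and matches(tc["exec_compatible_with"], exec_platform)
                        and matches(tc["target_compatible_with"], target)):
                    selected[toolchain_type] = tc["label"]
                    break
        if len(selected) == len(required_types):
            return exec_name, selected  # labels only, hence "unloaded"
    return None


linux_x86 = {"cpu": "x86_64", "os": "linux"}
android_arm = {"cpu": "arm64", "os": "android"}
toolchains = [
    {"type": "cc", "label": "//tc:cc_for_x86",
     "exec_compatible_with": {"os": "linux"},
     "target_compatible_with": {"cpu": "x86_64"}},
    {"type": "cc", "label": "//tc:cc_for_arm",
     "exec_compatible_with": {"os": "linux"},
     "target_compatible_with": {"cpu": "arm64"}},
]
result = resolve_toolchains(
    ["cc"], toolchains, [("//platforms:linux_x86", linux_x86)], android_arm)
print(result)
```

Note that the result maps toolchain types to labels, mirroring why the context
is called "unloaded".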
749
750Then the toolchains are actually loaded using `ResolvedToolchainContext.load()`
751and used by the implementation of the configured target that requested them.
752
753We also have a legacy system that relies on there being one single "host"
754configuration and target configurations being represented by various
755configuration flags, e.g. `--cpu` . We are gradually transitioning to the above
756system. In order to handle cases where people rely on the legacy configuration
757values, we have implemented
758"[platform mappings](https://docs.google.com/document/d/1Vg_tPgiZbSrvXcJ403vZVAGlsWhH9BUDrAxMOYnO0Ls)"
759to translate between the legacy flags and the new-style platform constraints.
760Their code is in `PlatformMappingFunction` and uses a non-Starlark "little
761language".
762
763### Constraints
764
765Sometimes one wants to designate a target as being compatible with only a few
766platforms. Bazel has (unfortunately) multiple mechanisms to achieve this end:
767
768* Rule-specific constraints
769* `environment_group()` / `environment()`
770* Platform constraints
771
Rule-specific constraints are mostly used within Google for Java rules; they are
on their way out and they are not available in Bazel, but the source code may
contain references to them. The attribute that governs this is called
`constraints=`.
776
777#### environment_group() and environment()
778
779These rules are a legacy mechanism and are not widely used.
780
All build rules can declare which "environments" they can be built for, where an
"environment" is an instance of the `environment()` rule.
783
784There are various ways supported environments can be specified for a rule:
785
7861. Through the `restricted_to=` attribute. This is the most direct form of
787 specification; it declares the exact set of environments the rule supports
788 for this group.
7892. Through the `compatible_with=` attribute. This declares environments a rule
790 supports in addition to "standard" environments that are supported by
791 default.
7923. Through the package-level attributes `default_restricted_to=` and
793 `default_compatible_with=`.
4. Through default specifications in `environment_group()` rules. Every
   environment belongs to a group of thematically related peers (e.g. "CPU
   architectures", "JDK versions" or "mobile operating systems"). The
   definition of an environment group includes which of these environments
   should be supported by "default" if not otherwise specified by the
   `restricted_to=` / `compatible_with=` attributes. A rule with no such
   attributes inherits all defaults.
8015. Through a rule class default. This overrides global defaults for all
802 instances of the given rule class. This can be used, for example, to make
803 all `*_test` rules testable without each instance having to explicitly
804 declare this capability.
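
The interaction between `restricted_to=`, `compatible_with=` and the group
defaults can be summarized in a small Python sketch. It covers a single
environment group and ignores package-level and rule class defaults;
`supported_environments` is a hypothetical helper, not a Bazel function.

```python
def supported_environments(group_members, group_defaults,
                           restricted_to=(), compatible_with=()):
    """Computes the environments of one group that a rule supports."""
    members = set(group_members)
    restricted = members & set(restricted_to)
    if restricted:
        # restricted_to declares the exact set for this group.
        return restricted
    # compatible_with adds to the defaults of the group.
    return set(group_defaults) | (members & set(compatible_with))


cpus = {"//env:x86", "//env:arm", "//env:ppc"}
defaults = {"//env:x86"}

no_attrs = supported_environments(cpus, defaults)
extended = supported_environments(cpus, defaults,
                                  compatible_with=["//env:arm"])
restricted = supported_environments(cpus, defaults,
                                    restricted_to=["//env:ppc"])
print(sorted(no_attrs), sorted(extended), sorted(restricted))
```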
805
`environment()` is implemented as a regular rule, whereas `environment_group()`
is both a subclass of `Target` that is not a `Rule` (`EnvironmentGroup`) and a
function that is available by default from Starlark
(`StarlarkLibrary.environmentGroup()`), which eventually creates an eponymous
target. This is to avoid a cyclic dependency that would arise because each
environment needs to declare the environment group it belongs to and each
environment group needs to declare its default environments.
813
814A build can be restricted to a certain environment with the
815`--target_environment` command line option.
816
817The implementation of the constraint check is in
818`RuleContextConstraintSemantics` and `TopLevelConstraintSemantics`.
819
820#### Platform constraints
821
822The current "official" way to describe what platforms a target is compatible
823with is by using the same constraints used to describe toolchains and platforms.
824It's under review in pull request
825[#10945](https://github.com/bazelbuild/bazel/pull/10945).
826
827### Visibility
828
829If you work on a large codebase with a lot of developers (like at Google), you
830don't necessarily want everyone else to be able to depend on your code so that
831you retain the liberty to change things that you deem to be implementation
832details (otherwise, as per [Hyrum's law](https://www.hyrumslaw.com/), people
833_will_ come to depend on all parts of your code).
834
Bazel supports this by the mechanism called _visibility_: you can declare that a
particular rule can only be depended on by certain packages using the
`visibility` attribute (documentation
[here](https://docs.bazel.build/versions/master/be/common-definitions.html#common-attributes)).
This attribute is a little special because unlike every other attribute, the set
of dependencies it generates is not simply the set of labels listed (yes, this
is a design flaw).
842
843This is implemented in the following places:
844
845* The `RuleVisibility` interface represents a visibility declaration. It can
846 be either a constant (fully public or fully private) or a list of labels.
847* Labels can refer to either package groups (predefined list of packages), to
848 packages directly (`//pkg:__pkg__`) or subtrees of packages
849 (`//pkg:__subpackages__`). This is different from the command line syntax,
850 which uses `//pkg:*` or `//pkg/...`.
851* Package groups are implemented as their own target and configured target
852 types (`PackageGroup` and `PackageGroupConfiguredTarget`). We could probably
853 replace these with simple rules if we wanted to.
854* The conversion from visibility label lists to dependencies is done in
855 `DependencyResolver.visitTargetVisibility` and a few other miscellaneous
856 places.
857* The actual check is done in
858 `CommonPrerequisiteValidator.validateDirectPrerequisiteVisibility()`
859
860### Nested sets
861
862Oftentimes, a configured target aggregates a set of files from its dependencies,
863adds its own, and wraps the aggregate set into a transitive info provider so
864that configured targets that depend on it can do the same. Examples:
865
866* The C++ header files used for a build
867* The object files that represent the transitive closure of a `cc_library`
868* The set of .jar files that need to be on the classpath for a Java rule to
869 compile or run
870* The set of Python files in the transitive closure of a Python rule
871
872If we did this the naive way by using e.g. `List` or `Set`, we'd end up with
873quadratic memory usage: if there is a chain of N rules and each rule adds a
874file, we'd have 1+2+...+N collection members.
875
In order to get around this problem, we came up with the concept of a
`NestedSet`. It's a data structure that is composed of other `NestedSet`
instances and some members of its own, thereby forming a directed acyclic graph
of sets. They are immutable and their members can be iterated over. We define
multiple iteration orders (`NestedSet.Order`): preorder, postorder, topological
(a node always comes after its ancestors) and "don't care, but it should be the
same each time".
883
884The same data structure is called `depset` in Starlark.
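
The memory argument is easy to reproduce with a toy model. The following Python
sketch is not Bazel's `NestedSet`: `NestedSetModel` is a hypothetical class that
only supports a postorder traversal, and "stored" counts direct members plus
references to child sets.

```python
class NestedSetModel:
    """Toy nested set: direct members plus references to other sets."""

    def __init__(self, direct=(), transitive=()):
        self.direct = tuple(direct)
        self.transitive = tuple(transitive)  # references, not copies

    def to_list(self):
        """Postorder: transitive sets first, then direct members."""
        seen_sets, seen_items, out = set(), set(), []

        def visit(node):
            if id(node) in seen_sets:
                return
            seen_sets.add(id(node))
            for child in node.transitive:
                visit(child)
            for item in node.direct:
                if item not in seen_items:
                    seen_items.add(item)
                    out.append(item)

        visit(self)
        return out


N = 100
node = NestedSetModel(direct=["file0"])
stored = 1
for i in range(1, N):
    node = NestedSetModel(direct=[f"file{i}"], transitive=[node])
    stored += 2  # one direct member plus one reference to the previous set

naive = sum(range(1, N + 1))  # 1 + 2 + ... + N list entries
print(stored, naive, len(node.to_list()))
```

For a chain of 100 rules the nested representation stores 199 entries, while
the naive per-rule lists would hold 5050.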
885
886### Artifacts and Actions
887
888The actual build consists of a set of commands that need to be run to produce
889the output the user wants. The commands are represented as instances of the
890class `Action` and the files are represented as instances of the class
891`Artifact`. They are arranged in a bipartite, directed, acyclic graph called the
892"action graph".
893
894Artifacts come in two kinds: source artifacts (i.e. ones that are available
895before Bazel starts executing) and derived artifacts (ones that need to be
896built). Derived artifacts can themselves be multiple kinds:
897
1. **Regular artifacts.** These are checked for up-to-dateness by computing
   their checksum, with ctime as a shortcut: we don't checksum the file if its
   ctime hasn't changed.
2. **Unresolved symlink artifacts.** These are checked for up-to-dateness by
   calling `readlink()`. Unlike regular artifacts, these can be dangling
   symlinks. Usually used in cases where one then packs up some files into an
   archive of some sort.
3. **Tree artifacts.** These are not single files, but directory trees. They
   are checked for up-to-dateness by checking the set of files in them and
   their contents. They are represented as a `TreeArtifact`.
9084. **Constant metadata artifacts.** Changes to these artifacts don't trigger a
909 rebuild. This is used exclusively for build stamp information: we don't want
910 to do a rebuild just because the current time changed.
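
The checksum shortcut for regular artifacts amounts to a digest cache keyed on
file metadata. Below is a minimal Python sketch of the idea; `DigestCache` is
hypothetical, and comparing the file size in addition to ctime is our own
assumption to make the example robust against coarse timestamps.

```python
import hashlib
import os
import tempfile


class DigestCache:
    """Re-hashes a file only when its stat data (ctime, size) changed."""

    def __init__(self):
        self._cache = {}     # path -> ((ctime_ns, size), digest)
        self.hash_count = 0  # how many times file contents were read

    def digest(self, path):
        st = os.stat(path)
        stamp = (st.st_ctime_ns, st.st_size)
        cached = self._cache.get(path)
        if cached and cached[0] == stamp:
            return cached[1]
        with open(path, "rb") as f:
            value = hashlib.sha256(f.read()).hexdigest()
        self.hash_count += 1
        self._cache[path] = (stamp, value)
        return value


cache = DigestCache()
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")
    with open(path, "w") as f:
        f.write("a")
    first = cache.digest(path)
    again = cache.digest(path)    # served from the cache, no re-hash
    with open(path, "w") as f:
        f.write("abc")
    changed = cache.digest(path)  # stat data changed, so we re-hash
print(cache.hash_count)
```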
911
There is no fundamental reason why source artifacts cannot be tree artifacts or
unresolved symlink artifacts; it's just that we haven't implemented it yet (we
should, though -- referencing a source directory in a BUILD file is one of the
few known long-standing incorrectness issues with Bazel; we have an
implementation that kind of works which is enabled by the
`BAZEL_TRACK_SOURCE_DIRECTORIES=1` JVM property).
918
919A notable kind of `Artifact` are middlemen. They are indicated by `Artifact`
920instances that are the outputs of `MiddlemanAction`. They are used to
921special-case some things:
922
923* Aggregating middlemen are used to group artifacts together. This is so that
924 if a lot of actions use the same large set of inputs, we don't have N\*M
925 dependency edges, only N+M (they are being replaced with nested sets)
926* Scheduling dependency middlemen ensure that an action runs before another.
927 They are mostly used for linting but also for C++ compilation (see
928 `CcCompilationContext.createMiddleman()` for an explanation)
929* Runfiles middlemen are used to ensure the presence of a runfiles tree so
930 that one does not separately need to depend on the output manifest and every
931 single artifact referenced by the runfiles tree.
932
933Actions are best understood as a command that needs to be run, the environment
934it needs and the set of outputs it produces. The following things are the main
935components of the description of an action:
936
* The command line that needs to be run
* The input artifacts it needs
* The environment variables that need to be set
* Annotations that describe the environment (e.g. platform) it needs to run in
942
943There are also a few other special cases, like writing a file whose content is
944known to Bazel. They are a subclass of `AbstractAction`. Most of the actions are
945a `SpawnAction` or a `StarlarkAction` (the same, they should arguably not be
946separate classes), although Java and C++ have their own action types
947(`JavaCompileAction`, `CppCompileAction` and `CppLinkAction`).
948
949We eventually want to move everything to `SpawnAction`; `JavaCompileAction` is
950pretty close, but C++ is a bit of a special-case due to .d file parsing and
951include scanning.
952
953The action graph is mostly "embedded" into the Skyframe graph: conceptually, the
954execution of an action is represented as an invocation of
955`ActionExecutionFunction`. The mapping from an action graph dependency edge to a
956Skyframe dependency edge is described in
957`ActionExecutionFunction.getInputDeps()` and `Artifact.key()` and has a few
958optimizations in order to keep the number of Skyframe edges low:
959
960* Derived artifacts do not have their own `SkyValue`s. Instead,
961 `Artifact.getGeneratingActionKey()` is used to find out the key for the
962 action that generates it
963* Nested sets have their own Skyframe key.
964
965### Shared actions
966
Some actions are generated by multiple configured targets; Starlark rules are
more limited since they are only allowed to put their derived artifacts into a
directory determined by their configuration and their package (but even so,
rules in the same package can conflict), but rules implemented in Java can put
derived artifacts anywhere.
972
973This is considered to be a misfeature, but getting rid of it is really hard
974because it produces significant savings in execution time when e.g. a source
975file needs to be processed somehow and that file is referenced by multiple rules
976(handwave-handwave). This comes at the cost of some RAM: each instance of a
977shared action needs to be stored in memory separately.
978
979If two actions generate the same output file, they must be exactly the same:
980have the same inputs, the same outputs and run the same command line. This
981equivalence relation is implemented in `Actions.canBeShared()` and it is
982verified between the analysis and execution phases by looking at every Action.
983This is implemented in `SkyframeActionExecutor.findAndStoreArtifactConflicts()`
984and is one of the few places in Bazel that requires a "global" view of the
985build.
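
The conflict check can be pictured as a single pass over all actions, indexed by
output path. This Python sketch is a simplification under our own assumptions:
`ActionModel`, `can_be_shared` and `find_conflicts` are hypothetical stand-ins
and compare only inputs, outputs and the command line.

```python
from collections import namedtuple

ActionModel = namedtuple("ActionModel", "owner inputs outputs command")


def can_be_shared(a, b):
    """Two actions may share outputs only if they are exactly the same."""
    return (a.inputs, a.outputs, a.command) == (b.inputs, b.outputs, b.command)


def find_conflicts(actions):
    """Needs a global view: every action of the build is inspected."""
    by_output, conflicts = {}, []
    for action in actions:
        for output in action.outputs:
            previous = by_output.setdefault(output, action)
            if previous is not action and not can_be_shared(previous, action):
                conflicts.append((output, previous.owner, action.owner))
    return conflicts


gen_a = ActionModel("//pkg:a", ("x.proto",), ("x.pb.h",), "protoc x.proto")
gen_b = ActionModel("//pkg:b", ("x.proto",), ("x.pb.h",), "protoc x.proto")
gen_c = ActionModel("//pkg:c", ("x.proto",), ("x.pb.h",), "protoc -O x.proto")

print(find_conflicts([gen_a, gen_b]))
print(find_conflicts([gen_a, gen_b, gen_c]))
```

The first two actions are shared without complaint; the third one writes the
same output with a different command line and is reported as a conflict.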
986
987## The execution phase
988
989This is when Bazel actually starts running build actions, i.e. commands that
990produce outputs.
991
992The first thing Bazel does after the analysis phase is to determine what
993Artifacts need to be built. The logic for this is encoded in
994`TopLevelArtifactHelper`; roughly speaking, it's the `filesToBuild` of the
995configured targets on the command line and the contents of a special output
996group for the explicit purpose of expressing "if this target is on the command
997line, build these artifacts".
998
999The next step is creating the execution root. Since Bazel has the option to read
1000source packages from different locations in the file system (`--package_path`),
1001it needs to provide locally executed actions with a full source tree. This is
1002handled by the class `SymlinkForest` and works by taking note of every target
1003used in the analysis phase and building up a single directory tree that symlinks
1004every package with a used target from its actual location. An alternative would
1005be to pass the correct paths to commands (taking `--package_path` into account).
1006This is undesirable because:
1007
1008* It changes action command lines when a package is moved from a package path
1009 entry to another (used to be a common occurrence)
1010* It results in different command lines if an action is run remotely than if
1011 it's run locally
1012* It requires a command line transformation specific to the tool in use
1013 (consider the difference between e.g. Java classpaths and C++ include paths)
1014* Changing the command line of an action invalidates its action cache entry
1015* `--package_path` is slowly and steadily being deprecated
1016
1017Then, Bazel starts traversing the action graph (the bipartite, directed graph
1018composed of actions and their input and output artifacts) and running actions.
1019The execution of each action is represented by an instance of the `SkyValue`
1020class `ActionExecutionValue`.
1021
1022Since running an action is expensive, we have a few layers of caching that can
1023be hit behind Skyframe:
1024
1025* `ActionExecutionFunction.stateMap` contains data to make Skyframe restarts
1026 of `ActionExecutionFunction` cheap
1027* The local action cache contains data about the state of the file system
1028* Remote execution systems usually also contain their own cache
1029
1030### The local action cache
1031
1032This cache is another layer that sits behind Skyframe; even if an action is
1033re-executed in Skyframe, it can still be a hit in the local action cache. It
1034represents the state of the local file system and it's serialized to disk which
1035means that when one starts up a new Bazel server, one can get local action cache
1036hits even though the Skyframe graph is empty.
1037
1038This cache is checked for hits using the method
1039`ActionCacheChecker.getTokenIfNeedToExecute()` .
1040
1041Contrary to its name, it's a map from the path of a derived artifact to the
1042action that emitted it. The action is described as:
1043
10441. The set of its input and output files and their checksum
10452. Its "action key", which is usually the command line that was executed, but
1046 in general, represents everything that's not captured by the checksum of the
1047 input files (e.g. for `FileWriteAction`, it's the checksum of the data
1048 that's written)
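
A toy version of this cache makes the two components of an entry visible. The
Python sketch below is not `ActionCacheChecker`: `ActionCacheModel` and its
methods are hypothetical, and a dict from path to bytes stands in for the file
system.

```python
import hashlib


def _snapshot(paths, fs):
    """Digest of every file; None for files that do not exist."""
    return {p: hashlib.sha256(fs[p]).hexdigest() if p in fs else None
            for p in paths}


class ActionCacheModel:
    def __init__(self):
        self._entries = {}  # output path -> (action key, file digests)

    def needs_execution(self, action_key, inputs, outputs, fs):
        expected = (action_key, _snapshot(list(inputs) + list(outputs), fs))
        return any(self._entries.get(out) != expected for out in outputs)

    def record(self, action_key, inputs, outputs, fs):
        entry = (action_key, _snapshot(list(inputs) + list(outputs), fs))
        for out in outputs:
            self._entries[out] = entry


fs = {"a.c": b"int x;"}
cache = ActionCacheModel()
before = cache.needs_execution("cc -c a.c", ["a.c"], ["a.o"], fs)

fs["a.o"] = b"object code"  # pretend the action ran
cache.record("cc -c a.c", ["a.c"], ["a.o"], fs)
after = cache.needs_execution("cc -c a.c", ["a.c"], ["a.o"], fs)

new_flags = cache.needs_execution("cc -O2 -c a.c", ["a.c"], ["a.o"], fs)
fs["a.c"] = b"int y;"
new_input = cache.needs_execution("cc -c a.c", ["a.c"], ["a.o"], fs)
print(before, after, new_flags, new_input)
```

Serializing `_entries` to disk is what would allow hits after a server restart,
when the in-memory graph is empty.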
1049
1050There is also a highly experimental “top-down action cache” that is still under
1051development, which uses transitive hashes to avoid going to the cache as many
1052times.
1053
1054### Input discovery and input pruning
1055
1056Some actions are more complicated than just having a set of inputs. Changes to
1057the set of inputs of an action come in two forms:
1058
* An action may discover new inputs before its execution or decide that some
  of its inputs are not actually necessary. The canonical example is C++,
  where it's better to make an educated guess about what header files a C++
  file uses from its transitive closure so that we don't need to send every
  file to remote executors; therefore, we have an option not to register every
  header file as an "input", but scan the source file for transitively
  included headers and only mark those header files as inputs that are
  mentioned in `#include` statements (we overestimate so that we don't need to
  implement a full C preprocessor)
1068* An action may realize that some files were not used during its execution. In
1069 C++, this is called ".d files": the compiler tells which header files were
1070 used after the fact, and in order to avoid the embarrassment of having worse
1071 incrementality than Make, Bazel makes use of this fact. This offers a better
1072 estimate than the include scanner because it relies on the compiler.
1073
1074These are implemented using methods on Action:
1075
10761. `Action.discoverInputs()` is called. It should return a nested set of
1077 Artifacts that are determined to be required. These must be source artifacts
1078 so that there are no dependency edges in the action graph that don't have an
1079 equivalent in the configured target graph.
10802. The action is executed by calling `Action.execute()`.
10813. At the end of `Action.execute()`, the action can call
1082 `Action.updateInputs()` to tell Bazel that not all of its inputs were
1083 needed. This can result in incorrect incremental builds if a used input is
1084 reported as unused.
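
The discovery and pruning steps can be illustrated with a toy include scanner
and a `.d` file parser. This is a Python sketch under simplifying assumptions,
not Bazel's include scanner: `discover_inputs` and `parse_dotd` are
hypothetical, only quoted includes are handled, and paths are not resolved
against include directories.

```python
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def discover_inputs(source, files):
    """Transitively collects quoted includes; `files` maps path -> text."""
    found, stack, seen = [], [source], {source}
    while stack:
        current = stack.pop()
        for header in INCLUDE_RE.findall(files.get(current, "")):
            if header not in seen and header in files:
                seen.add(header)
                found.append(header)
                stack.append(header)
    return sorted(found)


def parse_dotd(text):
    """Parses a Make-style dependency file: 'target: dep dep ...'."""
    _, _, deps = text.partition(":")
    return sorted(deps.replace("\\\n", " ").split())


files = {
    "main.cc": '#include "a.h"\n#if 0\n#include "unused.h"\n#endif\n',
    "a.h": '#include "b.h"\n',
    "b.h": "",
    "unused.h": "",
    "other.h": "",
}
discovered = discover_inputs("main.cc", files)  # overestimates: no #if logic
dotd = "main.o: main.cc a.h \\\n b.h\n"         # what the compiler reported
used = sorted(set(discovered) & set(parse_dotd(dotd)))
print(discovered, used)
```

The scanner keeps `unused.h` because it does not evaluate `#if`, while the
compiler-produced `.d` file lets the action prune it afterwards.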
1085
1086When an action cache returns a hit on a fresh Action instance (e.g. created
1087after a server restart), Bazel calls `updateInputs()` itself so that the set of
1088inputs reflects the result of input discovery and pruning done before.
1089
Starlark actions can make use of the facility to declare some inputs as unused
using the `unused_inputs_list=` argument of
[`ctx.actions.run()`](https://docs.bazel.build/versions/master/skylark/lib/actions.html#run).
1093
1094### Various ways to run actions: Strategies/ActionContexts
1095
1096Some actions can be run in different ways. For example, a command line can be
1097executed locally, locally but in various kinds of sandboxes, or remotely. The
1098concept that embodies this is called an `ActionContext` (or `Strategy`, since we
1099successfully went only halfway with a rename...)
1100
1101The life cycle of an action context is as follows:
1102
1. When the execution phase is started, `BlazeModule` instances are asked what
   action contexts they have. This happens in the constructor of
   `ExecutionTool`. Action context types are identified by a Java `Class`
   instance that refers to a sub-interface of `ActionContext`; this is the
   interface the action context must implement.
2. The appropriate action context is selected from the available ones and is
   forwarded to `ActionExecutionContext` and `BlazeExecutor`.
3. Actions request contexts using `ActionExecutionContext.getContext()` and
   `BlazeExecutor.getStrategy()` (there should really be only one way to do
   it…)
1113
1114Strategies are free to call other strategies to do their jobs; this is used, for
1115example, in the dynamic strategy that starts actions both locally and remotely,
1116then uses whichever finishes first.
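
The idea of one strategy delegating to others can be sketched as a race between
two callables. This Python model is only an illustration: `run_dynamic`, `local`
and `remote` are hypothetical, the sleeps simulate execution time, and unlike a
real implementation the losing branch is not interrupted once it is running.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_dynamic(spawn, strategies):
    """Starts every strategy; returns (name, result) of the first to finish."""
    pool = ThreadPoolExecutor(max_workers=len(strategies))
    futures = {pool.submit(fn, spawn): name for name, fn in strategies.items()}
    done, pending = wait(futures, return_when=FIRST_COMPLETED)
    for future in pending:
        future.cancel()  # best effort only; a running thread keeps going
    pool.shutdown(wait=False)
    winner = next(iter(done))
    return futures[winner], winner.result()


def local(spawn):
    time.sleep(0.01)  # simulated fast local execution
    return f"local:{spawn}"


def remote(spawn):
    time.sleep(0.5)   # simulated slow remote execution
    return f"remote:{spawn}"


name, result = run_dynamic("compile foo.cc", {"local": local, "remote": remote})
print(name, result)
```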
1117
One notable strategy is the one that implements persistent worker processes
(`WorkerSpawnStrategy`). The idea is that some tools have a long startup time
and should therefore be reused between actions instead of starting one anew for
every action. (This does represent a potential correctness issue, since Bazel
relies on the promise of the worker process that it doesn't carry observable
state between individual requests.)
1124
1125If the tool changes, the worker process needs to be restarted. Whether a worker
1126can be reused is determined by computing a checksum for the tool used using
1127`WorkerFilesHash`. It relies on knowing which inputs of the action represent
1128part of the tool and which represent inputs; this is determined by the creator
1129of the Action: `Spawn.getToolFiles()` and the runfiles of the `Spawn` are
1130counted as parts of the tool.
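
The reuse decision boils down to comparing a hash over the tool files only. The
Python sketch below models that; `tool_files_hash` and `WorkerPoolModel` are
hypothetical and do not reflect the actual layout of `WorkerFilesHash`.

```python
import hashlib


def tool_files_hash(tool_files):
    """Combined hash over tool files: dict of path -> content bytes."""
    h = hashlib.sha256()
    for path in sorted(tool_files):
        h.update(path.encode())
        h.update(b"\0")
        h.update(hashlib.sha256(tool_files[path]).digest())
    return h.hexdigest()


class WorkerPoolModel:
    def __init__(self):
        self._workers = {}  # mnemonic -> (tool hash, worker id)
        self.started = 0

    def get_worker(self, mnemonic, tool_files):
        key = tool_files_hash(tool_files)
        current = self._workers.get(mnemonic)
        if current is None or current[0] != key:
            self.started += 1  # (re)start: no worker yet or the tool changed
            current = (key, self.started)
            self._workers[mnemonic] = current
        return current[1]


pool = WorkerPoolModel()
javac_v1 = {"tools/javac.jar": b"v1"}
first = pool.get_worker("Javac", javac_v1)
second = pool.get_worker("Javac", dict(javac_v1))  # same tool, other action
third = pool.get_worker("Javac", {"tools/javac.jar": b"v2"})
print(first, second, third)
```

Ordinary inputs of the action never enter the hash, so changing a source file
does not restart the worker.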
1131
1132More information about strategies (or action contexts!):
1133
1134* Information about various strategies for running actions is available
1135 [here](https://jmmv.dev/2019/12/bazel-strategies.html).
* Information about the dynamic strategy, one where we run an action both
  locally and remotely to see whichever finishes first, is available
  [here](https://jmmv.dev/series.html#Bazel%20dynamic%20execution).
1139* Information about the intricacies of executing actions locally is available
1140 [here](https://jmmv.dev/2019/11/bazel-process-wrapper.html).
1141
1142### The local resource manager
1143
Bazel _can_ run many actions in parallel. The number of local actions that
_should_ be run in parallel differs from action to action: the more resources an
action requires, the fewer instances should be running at the same time to avoid
overloading the local machine.
1148
1149This is implemented in the class `ResourceManager`: each action has to be
1150annotated with an estimate of the local resources it requires in the form of a
1151`ResourceSet` instance (CPU and RAM). Then when action contexts do something
1152that requires local resources, they call `ResourceManager.acquireResources()`
1153and are blocked until the required resources are available.
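
The blocking behavior can be modeled with a condition variable. This is a
minimal Python sketch, not Bazel's `ResourceManager`: `ResourceManagerModel`
and its `acquire` context manager are hypothetical, and a request larger than
the total capacity would block forever in this toy version.

```python
import threading
import time
from contextlib import contextmanager


class ResourceManagerModel:
    def __init__(self, cpu, ram_mb):
        self._free = {"cpu": cpu, "ram_mb": ram_mb}
        self._cond = threading.Condition()

    @contextmanager
    def acquire(self, cpu, ram_mb):
        need = {"cpu": cpu, "ram_mb": ram_mb}
        with self._cond:
            # Block until every requested resource is available.
            self._cond.wait_for(
                lambda: all(self._free[k] >= v for k, v in need.items()))
            for k, v in need.items():
                self._free[k] -= v
        try:
            yield
        finally:
            with self._cond:
                for k, v in need.items():
                    self._free[k] += v
                self._cond.notify_all()


manager = ResourceManagerModel(cpu=4, ram_mb=1024)
stats = {"running": 0, "peak": 0}
stats_lock = threading.Lock()


def action():
    with manager.acquire(cpu=2, ram_mb=100):
        with stats_lock:
            stats["running"] += 1
            stats["peak"] = max(stats["peak"], stats["running"])
        time.sleep(0.01)  # simulated work
        with stats_lock:
            stats["running"] -= 1


threads = [threading.Thread(target=action) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(stats["peak"])
```

With four CPUs and actions that each claim two, at most two actions run at the
same time, no matter how many threads are started.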

A more detailed description of local resource management is available
[here](https://jmmv.dev/2019/12/bazel-local-resources.html).

### The structure of the output directory

Each action requires a separate place in the output directory where it places
its outputs. The location of derived artifacts is usually as follows:

```
$EXECROOT/bazel-out/<configuration>/bin/<package>/<artifact name>
```

How is the name of the directory that is associated with a particular
configuration determined? There are two conflicting desirable properties:

1. If two configurations can occur in the same build, they should have
   different directories so that both can have their own version of the same
   action; otherwise, if the two configurations disagree about e.g. the command
   line of an action producing the same output file, Bazel doesn't know which
   action to choose (an "action conflict").
2. If two configurations represent "roughly" the same thing, they should have
   the same name so that actions executed in one can be reused for the other if
   the command lines match: for example, changes to the command line options to
   the Java compiler should not result in C++ compile actions being re-run.

So far, we have not come up with a principled way of solving this problem, which
has similarities to the problem of configuration trimming. A longer discussion
of options is available
[here](https://docs.google.com/document/d/1fZI7wHoaS-vJvZy9SBxaHPitIzXE_nL9v4sS4mErrG4/edit).
The main problematic areas are Starlark rules (whose authors usually aren't
intimately familiar with Bazel) and aspects, which add another dimension to the
space of things that can produce the "same" output file.

The current approach is that the path segment for the configuration is
`<CPU>-<compilation mode>` with various suffixes added so that configuration
transitions implemented in Java don't result in action conflicts. In addition, a
checksum of the set of Starlark configuration transitions is added so that users
can't cause action conflicts. It is far from perfect. This is implemented in
`OutputDirectories.buildMnemonic()` and relies on each configuration fragment
adding its own part to the name of the output directory.
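
The naming scheme described above can be sketched roughly as follows. This is an illustrative Python model, not `OutputDirectories.buildMnemonic()`: the function name, the `ST` marker and the 12-character truncation of the checksum are assumptions chosen for the sketch.

```python
import hashlib


def output_directory_name(cpu, compilation_mode, fragment_suffixes=(),
                          starlark_transitions=()):
    """Build the configuration path segment from its contributing parts."""
    parts = [cpu, compilation_mode]
    # Each configuration fragment may contribute its own suffix.
    parts.extend(suffix for suffix in fragment_suffixes if suffix)
    name = "-".join(parts)
    if starlark_transitions:
        # Checksum over the *set* of transitions, so order does not matter.
        canonical = "\n".join(sorted(set(starlark_transitions)))
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        name += "-ST-" + digest[:12]
    return name


plain = output_directory_name("k8", "opt")
with_suffix = output_directory_name("k8", "fastbuild", ["py3"])
transitioned = output_directory_name(
    "k8", "opt", starlark_transitions=["//t:a", "//t:b"])
```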

## Tests

Bazel has rich support for running tests. It supports:

* Running tests remotely (if a remote execution backend is available)
* Running tests multiple times in parallel (for deflaking or gathering timing
  data)
* Sharding tests (splitting test cases in the same test over multiple processes
  for speed)
* Re-running flaky tests
* Grouping tests into test suites

Tests are regular configured targets that have a `TestProvider`, which describes
how the test should be run:

* The artifacts whose building results in the test being run. This is a "cache
  status" file that contains a serialized `TestResultData` message
* The number of times the test should be run
* The number of shards the test should be split into
* Some parameters about how the test should be run (e.g. the test timeout)

### Determining which tests to run

Determining which tests are run is an elaborate process.

First, during target pattern parsing, test suites are recursively expanded. The
expansion is implemented in `TestsForTargetPatternFunction`. A somewhat
surprising wrinkle is that if a test suite declares no tests, it refers to
_every_ test in its package. This is implemented in `Package.beforeBuild()` by
adding an implicit attribute called `$implicit_tests` to test suite rules.

Then, tests are filtered for size, tags, timeout and language according to the
command line options. This is implemented in `TestFilter` and is called from
`TargetPatternPhaseFunction.determineTests()` during target parsing; the
result is put into `TargetPatternPhaseValue.getTestsToRunLabels()`. The rule
attributes that can be filtered on are not configurable because this filtering
happens before the analysis phase, when the configuration is not yet
available.
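
As an illustration of this filtering step, here is a small Python sketch. It is not Bazel's `TestFilter`: the `TargetInfo` type and the `filter_tests` function are hypothetical, only size and tags are modelled, and the tag semantics (at least one required tag, no excluded tag) are an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TargetInfo:
    """The statically known attributes a filter can look at before analysis."""
    label: str
    size: str = "medium"
    tags: frozenset = field(default_factory=frozenset)


def filter_tests(tests, sizes=None, required_tags=(), excluded_tags=()):
    """Return the labels of the tests that pass every requested filter."""
    selected = []
    for test in tests:
        if sizes and test.size not in sizes:
            continue
        if required_tags and not (test.tags & set(required_tags)):
            continue
        if test.tags & set(excluded_tags):
            continue
        selected.append(test.label)
    return selected


tests = [
    TargetInfo("//a:small_test", "small", frozenset({"fast"})),
    TargetInfo("//a:big_test", "large", frozenset({"slow"})),
    TargetInfo("//a:plain_test"),
]
```

Because the filter only reads plain attribute values, it can run without a configuration, which mirrors why these attributes cannot be configurable.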

This is then processed further in `BuildView.createResult()`: targets whose
analysis failed are filtered out and tests are split into exclusive and
non-exclusive tests. The result is then put into `AnalysisResult`, which is how
`ExecutionTool` knows which tests to run.

In order to lend some transparency to this elaborate process, the `tests()`
query operator (implemented in `TestsFunction`) is available to tell which tests
are run when a particular target is specified on the command line. It's
unfortunately a reimplementation, so it probably deviates from the above in
multiple subtle ways.

### Running tests

Tests are run by requesting cache status artifacts. This then
results in the execution of a `TestRunnerAction`, which eventually calls the
`TestActionContext` chosen by the `--test_strategy` command line option that
runs the test in the requested way.

Tests are run according to an elaborate protocol that uses environment variables
to tell tests what's expected from them. A detailed description of what Bazel
expects from tests and what tests can expect from Bazel is available
[here](https://docs.bazel.build/versions/master/test-encyclopedia.html). At its
simplest, an exit code of 0 means success and anything else means failure.
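
To give a flavour of the protocol from the test's side, the sketch below shows a minimal Python test runner that honours the sharding variables `TEST_TOTAL_SHARDS`, `TEST_SHARD_INDEX` and `TEST_SHARD_STATUS_FILE` and reports its result through the exit code. The helper names and the round-robin assignment of cases to shards are assumptions of this sketch; the test encyclopedia linked above is the authoritative description.

```python
import os


def cases_for_this_shard(all_cases, environ=os.environ):
    """Pick the test cases this process is responsible for."""
    total = int(environ.get("TEST_TOTAL_SHARDS", "1"))
    index = int(environ.get("TEST_SHARD_INDEX", "0"))
    status_file = environ.get("TEST_SHARD_STATUS_FILE")
    if status_file:
        # Touching this file signals that the runner supports sharding.
        with open(status_file, "a"):
            pass
    # Round-robin split (an assumption; any disjoint, complete split works).
    return [case for i, case in enumerate(all_cases) if i % total == index]


def run(cases):
    """Return the process exit code: 0 on success, non-zero on any failure."""
    failures = [name for name, check in cases if not check()]
    return 0 if not failures else 1


case_names = ["a", "b", "c", "d", "e"]
second_of_two = cases_for_this_shard(
    case_names, {"TEST_TOTAL_SHARDS": "2", "TEST_SHARD_INDEX": "1"})
```

A real test binary would end with something like `sys.exit(run(selected_cases))`.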

In addition to the cache status file, each test process emits a number of other
files. They are put in the "test log directory" which is the subdirectory called
`testlogs` of the output directory of the target configuration:

* `test.xml`, a JUnit-style XML file detailing the individual test cases in
  the test shard
* `test.log`, the console output of the test. stdout and stderr are not
  separated.
* `test.outputs`, the "undeclared outputs directory"; this is used by tests
  that want to output files in addition to what they print to the terminal.

There are two things that can happen during test execution that cannot happen
while building regular targets: exclusive test execution and output streaming.

Some tests need to be executed in exclusive mode, i.e. not in parallel with
other tests. This can be elicited either by adding `tags=["exclusive"]` to the
test rule or running the test with `--test_strategy=exclusive`. Each exclusive
test is run by a separate Skyframe invocation requesting the execution of the
test after the "main" build. This is implemented in
`SkyframeExecutor.runExclusiveTest()`.

Unlike regular actions, whose terminal output is dumped when the action
finishes, the user can request the output of tests to be streamed so that they
get informed about the progress of a long-running test. This is specified by the
`--test_output=streamed` command line option and implies exclusive test
execution so that outputs of different tests are not interspersed.

This is implemented in the aptly-named `StreamedTestOutput` class and works by
polling for changes to the `test.log` file of the test in question and dumping
new bytes to the terminal where Bazel runs.
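
The polling idea is simple enough to sketch in a few lines of Python. This is not the `StreamedTestOutput` class; `LogTail` is a hypothetical helper that remembers how far it has read and returns only the bytes appended since the last poll.

```python
import os
import tempfile


class LogTail:
    """Returns only the bytes added to a log file since the previous poll."""

    def __init__(self, path):
        self._path = path
        self._offset = 0

    def poll(self):
        try:
            with open(self._path, "rb") as log:
                log.seek(self._offset)
                data = log.read()
        except FileNotFoundError:
            return b""  # the test has not produced any output yet
        self._offset += len(data)
        return data


log_path = os.path.join(tempfile.mkdtemp(), "test.log")
tail = LogTail(log_path)

first = tail.poll()                  # nothing written yet
with open(log_path, "ab") as log:
    log.write(b"starting\n")
second = tail.poll()                 # only the new line
with open(log_path, "ab") as log:
    log.write(b"done\n")
third = tail.poll()                  # only what was appended afterwards
```

A streaming loop would call `poll()` periodically and write the returned bytes to the terminal until the test finishes.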

Results of the executed tests are available on the event bus by observing
various events (e.g. `TestAttempt`, `TestResult` or `TestingCompleteEvent`).
They are dumped to the Build Event Protocol and they are emitted to the console
by `AggregatingTestListener`.

### Coverage collection

Coverage is reported by the tests in LCOV format in the files
`bazel-testlogs/$PACKAGE/$TARGET/coverage.dat`.

To collect coverage, each test execution is wrapped in a script called
`collect_coverage.sh`.

This script sets up the environment of the test to enable coverage collection
and determines where the coverage files are written by the coverage runtime(s).
It then runs the test. A test may itself run multiple subprocesses and consist
of parts written in multiple different programming languages (with separate
coverage collection runtimes). The wrapper script is responsible for converting
the resulting files to LCOV format if necessary, and for merging them into a
single file.
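
To illustrate what "merging into a single file" means for LCOV data, here is a minimal Python sketch. It is not what `collect_coverage.sh` does internally; the function names are hypothetical and only the `SF:`, `DA:` and `end_of_record` lines of the LCOV tracefile format are handled.

```python
def parse_lcov(text):
    """Return {source file: {line number: execution count}}."""
    coverage = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("SF:"):
            current = coverage.setdefault(line[3:], {})
        elif line.startswith("DA:") and current is not None:
            number, count = line[3:].split(",")[:2]
            current[int(number)] = current.get(int(number), 0) + int(count)
        elif line == "end_of_record":
            current = None
    return coverage


def merge_lcov(reports):
    """Sum the execution counts of several LCOV reports."""
    merged = {}
    for report in reports:
        for source, lines in parse_lcov(report).items():
            target = merged.setdefault(source, {})
            for number, count in lines.items():
                target[number] = target.get(number, 0) + count
    return merged


def to_lcov(coverage):
    """Serialize merged coverage back into LCOV text."""
    out = []
    for source in sorted(coverage):
        out.append("SF:" + source)
        for number in sorted(coverage[source]):
            out.append("DA:%d,%d" % (number, coverage[source][number]))
        out.append("end_of_record")
    return "\n".join(out) + "\n"


# Two hypothetical reports, e.g. from two coverage runtimes in one test.
report_a = "SF:a/Foo.java\nDA:1,1\nDA:2,0\nend_of_record\n"
report_b = ("SF:a/Foo.java\nDA:2,3\nend_of_record\n"
            "SF:b/bar.cc\nDA:7,0\nend_of_record\n")
merged = merge_lcov([report_a, report_b])
```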

The interposition of `collect_coverage.sh` is done by the test strategies and
requires `collect_coverage.sh` to be among the inputs of the test. This is
accomplished by the implicit attribute `:coverage_support` which is resolved to
the value of the configuration flag `--coverage_support` (see
`TestConfiguration.TestOptions.coverageSupport`).

Some languages do offline instrumentation, meaning that the coverage
instrumentation is added at compile time (e.g. C++) and others do online
instrumentation, meaning that coverage instrumentation is added at execution
time.

Another core concept is _baseline coverage_. This is the coverage of a library,
binary, or test if no code in it was run. The problem it solves is that if you
want to compute the test coverage for a binary, it is not enough to merge the
coverage of all of the tests because there may be code in the binary that is not
linked into any test. Therefore, we emit a coverage file for every
binary which contains only the files we collect coverage for with no covered
lines. The baseline coverage file for a target is at
`bazel-testlogs/$PACKAGE/$TARGET/baseline_coverage.dat`. It is also generated
for binaries and libraries in addition to tests if you pass the
`--nobuild_tests_only` flag to Bazel.

Baseline coverage is currently broken.

We track two groups of files for coverage collection for each rule: the set of
instrumented files and the set of instrumentation metadata files.

The set of instrumented files is just that, a set of files to instrument. For
online coverage runtimes, this can be used at runtime to decide which files to
instrument. It is also used to implement baseline coverage.

The set of instrumentation metadata files is the set of extra files a test needs
to generate the LCOV files Bazel requires from it. In practice, this consists of
runtime-specific files; for example, gcc emits `.gcno` files during compilation.
These are added to the set of inputs of test actions if coverage mode is
enabled.

Whether or not coverage is being collected is stored in the
`BuildConfiguration`. This is handy because it is an easy way to change the test
action and the action graph depending on this bit, but it also means that if
this bit is flipped, all targets need to be re-analyzed (some languages, e.g.
C++, require different compiler options to emit code that can collect coverage,
which mitigates this issue somewhat, since then a re-analysis is needed anyway).

The coverage support files are depended on through labels in an implicit
dependency so that they can be overridden by the invocation policy, which allows
them to differ between the different versions of Bazel. Ideally, these
differences would be removed and we would standardize on one of them.

We also generate a "coverage report" which merges the coverage collected for
every test in a Bazel invocation. This is handled by
`CoverageReportActionFactory` and is called from `BuildView.createResult()`. It
gets access to the tools it needs by looking at the `:coverage_report_generator`
attribute of the first test that is executed.

## The query engine

Bazel has a
[little language](https://docs.bazel.build/versions/master/query-how-to.html)
used to ask it various things about various graphs. The following query kinds
are provided:

* `bazel query` is used to investigate the target graph
* `bazel cquery` is used to investigate the configured target graph
* `bazel aquery` is used to investigate the action graph

Each of these is implemented by subclassing `AbstractBlazeQueryEnvironment`.
Additional query functions can be implemented by subclassing `QueryFunction`.
In order to allow streaming query results, instead of collecting them into some
data structure, a `query2.engine.Callback` is passed to `QueryFunction`, which
calls it for results it wants to return.
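
The callback style can be illustrated with a toy dependency walk in Python. This is only a sketch of the idea, not the `QueryFunction` or `Callback` interfaces: the graph, the `deps` function and its signature are made up.

```python
def deps(graph, start, callback):
    """Walk the transitive dependencies of start, reporting each as it is found."""
    seen = set()
    stack = [start]
    while stack:
        target = stack.pop()
        if target in seen:
            continue
        seen.add(target)
        callback(target)  # stream the result instead of accumulating it
        stack.extend(graph.get(target, ()))


# Hypothetical target graph: //a depends on //b and //c, //b depends on //c.
graph = {"//a": ["//b", "//c"], "//b": ["//c"], "//c": []}
streamed = []
deps(graph, "//a", streamed.append)
```

Here the callback simply appends to a list, but it could just as well write each result to an output stream, so the function itself never has to hold the full result set.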

The result of a query can be emitted in various ways: labels, labels and rule
classes, XML, protobuf and so on. These are implemented as subclasses of
`OutputFormatter`.

A subtle requirement of some query output formats (proto, definitely) is that
Bazel needs to emit _all_ the information that package loading provides so that
one can diff the output and determine whether a particular target has changed.
As a consequence, attribute values need to be serializable, which is why there
are so few attribute types and why no attribute can have a complex Starlark
value. The usual workaround is to use a label, and attach the complex
information to the rule with that label. It's not a very satisfying workaround
and it would be very nice to lift this requirement.

## The module system

Bazel can be extended by adding modules to it. Each module must subclass
`BlazeModule` (the name is a relic of the history of Bazel when it used to be
called Blaze) and gets information about various events during the execution of
a command.

They are mostly used to implement various pieces of "non-core" functionality
that only some versions of Bazel (e.g. the one we use at Google) need:

* Interfaces to remote execution systems
* New commands

The set of extension points `BlazeModule` offers is somewhat haphazard. Don't
use it as an example of good design principles.

## The event bus

The main way `BlazeModule`s communicate with the rest of Bazel is by an event bus
(`EventBus`): a new instance is created for every build, various parts of Bazel
can post events to it and modules can register listeners for the events they are
interested in. For example, the following things are represented as events:

* The list of build targets to be built has been determined
  (`TargetParsingCompleteEvent`)
* The top-level configurations have been determined
  (`BuildConfigurationEvent`)
* A target was built, successfully or not (`TargetCompleteEvent`)
* A test was run (`TestAttempt`, `TestSummary`)
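
The post/listen pattern can be modelled in a few lines of Python. This is a simplified sketch, not the `EventBus` class Bazel uses: `ToyEventBus`, its methods, and the fields of the event class below are assumptions made for illustration.

```python
from collections import defaultdict


class ToyEventBus:
    """Delivers each posted event to the listeners registered for its type."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def register(self, event_type, listener):
        self._listeners[event_type].append(listener)

    def post(self, event):
        for event_type, listeners in list(self._listeners.items()):
            if isinstance(event, event_type):
                for listener in listeners:
                    listener(event)


class TargetCompleteEvent:
    """Stand-in event; the label and success fields are hypothetical."""

    def __init__(self, label, success):
        self.label = label
        self.success = success


bus = ToyEventBus()  # in this model, one bus per build
results = []
bus.register(TargetCompleteEvent,
             lambda event: results.append((event.label, event.success)))

bus.post(TargetCompleteEvent("//foo:bar", True))
bus.post("an event nobody listens for")
```

The poster does not know who is listening, which is what lets modules observe the build without the core depending on them.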

Some of these events are represented outside of Bazel in the
[Build Event Protocol](https://docs.bazel.build/versions/master/build-event-protocol.html)
(they are `BuildEvent`s). This allows not only `BlazeModule`s, but also things
outside the Bazel process to observe the build. They are accessible either as a
file that contains protocol messages, or Bazel can connect to a server (called
the Build Event Service) to stream events.

This is implemented in the `build.lib.buildeventservice` and
`build.lib.buildeventstream` Java packages.

## External repositories

Whereas Bazel was originally designed to be used in a monorepo (a single source
tree containing everything one needs to build), Bazel lives in a world where
this is not necessarily true. "External repositories" are an abstraction used to
bridge these two worlds: they represent code that is necessary for the build but
is not in the main source tree.

### The WORKSPACE file

The set of external repositories is determined by parsing the WORKSPACE file.
For example, a declaration like this:

```
local_repository(name="foo", path="/foo/bar")
```

results in the repository called `@foo` being available. Where this gets
complicated is that one can define new repository rules in Starlark files, which
can then be used to load new Starlark code, which can be used to define new
repository rules and so on…

To handle this case, the parsing of the WORKSPACE file (in
`WorkspaceFileFunction`) is split up into chunks delineated by `load()`
statements. The chunk index is indicated by `WorkspaceFileKey.getIndex()` and
computing `WorkspaceFileFunction` until index X means evaluating it until the
Xth `load()` statement.
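
The chunking can be pictured with the following Python sketch. It is not how `WorkspaceFileFunction` is written: statements are modelled as plain strings, the function name is hypothetical, and starting a new chunk at every `load()` statement is this sketch's reading of the description above.

```python
def split_into_chunks(statements):
    """Group WORKSPACE statements so that every load() begins a new chunk."""
    chunks = [[]]
    for statement in statements:
        # A load() needs everything before it to have been evaluated already,
        # so it closes the current chunk (unless that chunk is still empty).
        if statement.lstrip().startswith("load(") and chunks[-1]:
            chunks.append([])
        chunks[-1].append(statement)
    return chunks


# Hypothetical WORKSPACE content; the .bzl files and rule names are made up.
statements = [
    'local_repository(name="foo", path="/foo/bar")',
    'load("@foo//:defs.bzl", "my_repo")',
    'my_repo(name="bar")',
    'load("@bar//:more.bzl", "other_repo")',
    'other_repo(name="baz")',
]
chunks = split_into_chunks(statements)
```

In this example the second `load()` can only be resolved after the chunk that defines `@bar` has been evaluated, which is why evaluation proceeds chunk by chunk.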

### Fetching repositories

Before the code of the repository is available to Bazel, it needs to be
_fetched_. This results in Bazel creating a directory under
`$OUTPUT_BASE/external/<repository name>`.

Fetching the repository happens in the following steps:

1. `PackageLookupFunction` realizes that it needs a repository and creates a
   `RepositoryName` as a `SkyKey`, which invokes `RepositoryLoaderFunction`.
2. `RepositoryLoaderFunction` forwards the request to
   `RepositoryDelegatorFunction` for unclear reasons (the code says it's to
   avoid re-downloading things in case of Skyframe restarts, but that is not
   very solid reasoning).
3. `RepositoryDelegatorFunction` finds out the repository rule it's asked to
   fetch by iterating over the chunks of the WORKSPACE file until the requested
   repository is found.
4. The appropriate `RepositoryFunction` is found that implements the repository
   fetching; it's either the Starlark implementation of the repository or a
   hard-coded map for repositories that are implemented in Java.

There are various layers of caching since fetching a repository can be very
expensive:

1. There is a cache for downloaded files that is keyed by their checksum
   (`RepositoryCache`). This requires the checksum to be available in the
   WORKSPACE file, but that's good for hermeticity anyway. This is shared by
   every Bazel server instance on the same workstation, regardless of which
   workspace or output base they are running in.
2. A "marker file" is written for each repository under `$OUTPUT_BASE/external`
   that contains a checksum of the rule that was used to fetch it. If the Bazel
   server restarts but the checksum does not change, it's not re-fetched. This
   is implemented in `RepositoryDelegatorFunction.DigestWriter`.
3. The `--distdir` command line option designates another cache that is used to
   look up artifacts to be downloaded. This is useful in enterprise settings
   where Bazel should not fetch random things from the Internet. This is
   implemented by `DownloadManager`.
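
The marker file mechanism from the second item can be sketched as follows. This is a simplified Python model, not `RepositoryDelegatorFunction.DigestWriter`: the function names, the marker file name and the way the rule is turned into a checksum are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def rule_digest(rule_attributes):
    """Checksum of the repository rule's definition (attribute names and values)."""
    canonical = repr(sorted(rule_attributes.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_fetch(marker_path, rule_attributes):
    """Re-fetch unless a marker exists whose checksum matches the current rule."""
    try:
        with open(marker_path) as marker:
            return marker.read().strip() != rule_digest(rule_attributes)
    except FileNotFoundError:
        return True


def record_fetch(marker_path, rule_attributes):
    """Write the marker after a successful fetch."""
    with open(marker_path, "w") as marker:
        marker.write(rule_digest(rule_attributes))


external_dir = tempfile.mkdtemp()  # stands in for $OUTPUT_BASE/external
marker_path = os.path.join(external_dir, "foo.marker")  # hypothetical name
rule = {"name": "foo", "path": "/foo/bar"}
```

Because the marker lives on disk, the check survives a server restart, which an in-memory cache would not.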

Once a repository is downloaded, the artifacts in it are treated as source
artifacts. This poses a problem because Bazel usually checks for up-to-dateness
of source artifacts by calling `stat()` on them, and these artifacts are also
invalidated when the definition of the repository they are in changes. Thus,
`FileStateValue`s for an artifact in an external repository need to depend on
their external repository. This is handled by `ExternalFilesHelper`.

### Managed directories

Sometimes, external repositories need to modify files under the workspace root
(e.g. a package manager that houses the downloaded packages in a subdirectory of
the source tree). This is at odds with the assumption Bazel makes that source
files are only modified by the user and not by Bazel itself, and with the fact
that it allows packages to refer to every directory under the workspace root. In
order to make this kind of external repository work, Bazel does two things:

1. It allows the user to specify subdirectories of the workspace Bazel is not
   allowed to reach into. They are listed in a file called `.bazelignore` and
   the functionality is implemented in `BlacklistedPackagePrefixesFunction`.
2. It encodes the mapping from the subdirectory of the workspace to the external
   repository it is handled by into `ManagedDirectoriesKnowledge` and handles
   `FileStateValue`s referring to them in the same way as those for regular
   external repositories.

### Repository mappings

It can happen that multiple repositories want to depend on the same repository,
but in different versions (this is an instance of the "diamond dependency
problem"). For example, if two binaries in separate repositories in the build
want to depend on Guava, they will presumably both refer to Guava with labels
starting `@guava//` and expect that to mean different versions of it.

Therefore, Bazel allows one to re-map external repository labels so that the
string `@guava//` can refer to one Guava repository (e.g. `@guava1//`) in the
repository of one binary and another Guava repository (e.g. `@guava2//`) in the
repository of the other.

Alternatively, this can also be used to **join** diamonds. If a repository
depends on `@guava1//`, and another depends on `@guava2//`, repository mapping
allows one to re-map both repositories to use a canonical `@guava//` repository.
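
Both uses of the mapping come down to a per-repository lookup table applied to the repository part of a label. The Python sketch below shows the idea; the `remap_label` function and the label strings are illustrative and do not correspond to Bazel's label classes.

```python
def remap_label(label, repo_mapping):
    """Rewrite the repository part of a label according to repo_mapping."""
    if not label.startswith("@"):
        return label  # label within the current repository; nothing to remap
    repo, separator, rest = label.partition("//")
    return repo_mapping.get(repo, repo) + separator + rest


# Splitting a diamond: each depending repository gets its own Guava.
mapping_in_repo_a = {"@guava": "@guava1"}
mapping_in_repo_b = {"@guava": "@guava2"}
seen_from_a = remap_label("@guava//jar:jar", mapping_in_repo_a)
seen_from_b = remap_label("@guava//jar:jar", mapping_in_repo_b)

# Joining a diamond: two differently named repositories become one.
joined = remap_label("@guava1//jar:jar", {"@guava1": "@guava"})
```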

The mapping is specified in the WORKSPACE file as the `repo_mapping` attribute
of individual repository definitions. It then appears in Skyframe as a member of
`WorkspaceFileValue`, where it is plumbed to:

* `Package.Builder.repositoryMapping` which is used to transform label-valued
  attributes of rules in the package by
  `RuleClass.populateRuleAttributeValues()`
* `Package.repositoryMapping` which is used in the analysis phase (for
  resolving things like `$(location)` which are not parsed in the loading
  phase)
* `SkylarkImportLookupFunction` for resolving labels in `load()` statements

## JNI bits

The server of Bazel is _mostly_ written in Java. The exception is the parts that
Java cannot do by itself or couldn't do by itself when we implemented it. This
is mostly limited to interaction with the file system, process control and
various other low-level things.

The C++ code lives under `src/main/native` and the Java classes with native
methods are:

* `NativePosixFiles` and `NativePosixFileSystem`
* `ProcessUtils`
* `WindowsFileOperations` and `WindowsFileProcesses`
* `com.google.devtools.build.lib.platform`

## Console output

Emitting console output seems like a simple thing, but the confluence of running
multiple processes (sometimes remotely), fine-grained caching, the desire to
have a nice and colorful terminal output and having a long-running server makes
it non-trivial.

Right after the RPC call comes in from the client, two `RpcOutputStream`
instances are created (for stdout and stderr) that forward the data printed into
them to the client. These are then wrapped in an `OutErr` (an (stdout, stderr)
pair). Anything that needs to be printed on the console goes through these
streams. Then these streams are handed over to
`BlazeCommandDispatcher.execExclusively()`.

Output is by default printed with ANSI escape sequences. When these are not
desired (`--color=no`), they are stripped by an `AnsiStrippingOutputStream`. In
addition, `System.out` and `System.err` are redirected to these output streams.
This is so that debugging information can be printed using
`System.err.println()` and still end up in the terminal output of the client
(which is different from that of the server). Care is taken that if a process
produces binary output (e.g. `bazel query --output=proto`), no munging of stdout
takes place.
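
As a rough idea of what stripping escape sequences involves, here is a small Python sketch. It is not `AnsiStrippingOutputStream`: it works on complete strings rather than on a byte stream, and it only removes CSI sequences (the ones starting with `ESC [`, used for colors and cursor movement).

```python
import re

# ESC [ followed by parameter bytes, intermediate bytes and one final byte.
CSI_SEQUENCE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Remove CSI escape sequences, leaving the visible text untouched."""
    return CSI_SEQUENCE.sub("", text)


colored = "\x1b[31mERROR:\x1b[0m build failed"
plain = strip_ansi(colored)
```

A stream-based implementation additionally has to cope with an escape sequence being split across two writes, which this sketch ignores.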

Short messages (errors, warnings and the like) are expressed through the
`EventHandler` interface. Notably, these are different from what one posts to
the `EventBus` (this is confusing). Each `Event` has an `EventKind` (error,
warning, info, and a few others) and they may have a `Location` (the place in
the source code that caused the event to happen).

Some `EventHandler` implementations store the events they received. This is used
to replay information to the UI caused by various kinds of cached processing,
for example, the warnings emitted by a cached configured target.

Some `EventHandler`s also allow posting events that eventually find their way to
the event bus (regular `Event`s do _not_ appear there). These are
implementations of `ExtendedEventHandler` and their main use is to replay cached
`EventBus` events. These `EventBus` events all implement `Postable`, but not
everything that is posted to `EventBus` necessarily implements this interface;
only those that are cached by an `ExtendedEventHandler` (it would be nice and
most of the things do; it's not enforced, though).

Terminal output is _mostly_ emitted through `UiEventHandler`, which is
responsible for all the fancy output formatting and progress reporting Bazel
does. It has two inputs:

* The event bus
* The event stream piped into it through `Reporter`

The only direct connection the command execution machinery (i.e. the rest of
Bazel) has to the RPC stream to the client is through `Reporter.getOutErr()`,
which allows direct access to these streams. It's only used when a command needs
to dump large amounts of possibly binary data (e.g. `bazel query`).

## Profiling Bazel

Bazel is fast. Bazel is also slow, because builds tend to grow until just the
edge of what's bearable. For this reason, Bazel includes a profiler which can be
used to profile builds and Bazel itself. It's implemented in a class that's
aptly named `Profiler`. It's turned on by default, although it records only
abridged data so that its overhead is tolerable; the command line option
`--record_full_profiler_data` makes it record everything it can.

It emits a profile in the Chrome profiler format; it's best viewed in Chrome.
Its data model is that of task stacks: one can start tasks and end tasks and
they are supposed to be neatly nested within each other. Each Java thread gets
its own task stack. **TODO:** How does this work with actions and
continuation-passing style?

The profiler is started and stopped in `BlazeRuntime.initProfiler()` and
`BlazeRuntime.afterCommand()` respectively and attempts to be live for as long
as possible so that we can profile everything. To add something to the profile,
call `Profiler.instance().profile()`. It returns a `Closeable`, whose closure
represents the end of the task. It's best used with try-with-resources
statements.
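
The task stack model and the "closing ends the task" idiom can be illustrated with a Python analogue, where a context manager plays the role that try-with-resources plays in Java. This is a toy model, not Bazel's `Profiler`: the class, its fields and the record layout are assumptions of the sketch.

```python
import threading
import time
from contextlib import contextmanager


class ToyProfiler:
    """Records nested tasks; every thread gets its own task stack."""

    def __init__(self):
        self._local = threading.local()
        self._lock = threading.Lock()
        self.records = []  # (thread id, nesting depth, description, seconds)

    @contextmanager
    def profile(self, description):
        stack = getattr(self._local, "stack", None)
        if stack is None:
            stack = self._local.stack = []
        stack.append(description)
        depth = len(stack) - 1
        start = time.monotonic()
        try:
            yield
        finally:
            # Leaving the block ends the task, like closing the Closeable.
            elapsed = time.monotonic() - start
            stack.pop()
            with self._lock:
                self.records.append(
                    (threading.get_ident(), depth, description, elapsed))


profiler = ToyProfiler()
with profiler.profile("build"):
    with profiler.profile("analysis"):
        pass
```

Inner tasks are recorded before the tasks that enclose them, because a task is only written out when it ends.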

We also do rudimentary memory profiling in `MemoryProfiler`. It's also always on
and it mostly records maximum heap sizes and GC behavior.

## Testing Bazel

Bazel has two main kinds of tests: ones that observe Bazel as a "black box" and
ones that only run the analysis phase. We call the former "integration tests"
and the latter "unit tests", although they are more like integration tests that
are, well, less integrated. We also have some actual unit tests, where they are
necessary.

Of integration tests, we have two kinds:

1. Ones implemented using a very elaborate bash test framework under
   `src/test/shell`
2. Ones implemented in Java. These are implemented as subclasses of
   `AbstractBlackBoxTest`.

`AbstractBlackBoxTest` has the virtue that it works on Windows, too, but most of
our integration tests are written in bash.

Analysis tests are implemented as subclasses of `BuildViewTestCase`. There is a
scratch file system you can use to write BUILD files, then various helper
methods can request configured targets, change the configuration and assert
various things about the result of the analysis.