---
layout: documentation
title: Creating persistent workers
---

# Creating Persistent Workers

[Persistent workers](persistent-workers.html) can make your build faster.
If you have repeated actions in your build that have a high startup cost or
would benefit from cross-action caching, you may want to implement your own
persistent worker to perform these actions.

The Bazel server communicates with the worker using `stdin`/`stdout`. It
supports the use of protocol buffers or JSON strings. Support for JSON is
experimental and thus subject to change. It is guarded behind the
`--experimental_worker_allow_json_protocol` flag.

The worker implementation has two parts:

* The [worker](#making-the-worker).
* The [rule that uses the worker](#making-the-rule-that-uses-the-worker).

## Making the worker

A persistent worker upholds a few requirements:

* It reads [WorkRequests](https://github.com/bazelbuild/bazel/blob/6d1b9725b1e201ca3f25d8ec2a730a20aab62c6e/src/main/protobuf/worker_protocol.proto#L35)
  from its `stdin`.
* It writes [WorkResponses](https://github.com/bazelbuild/bazel/blob/6d1b9725b1e201ca3f25d8ec2a730a20aab62c6e/src/main/protobuf/worker_protocol.proto#L49)
  (and only `WorkResponse`s) to its `stdout`.
* It accepts the `--persistent_worker` flag. The wrapper must recognize the
  `--persistent_worker` command-line flag and only make itself persistent if
  that flag is passed; otherwise it must do a one-shot compilation and exit.

If your program upholds these requirements, it can be used as a persistent worker!
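
As a rough illustration of the three requirements above, here is a minimal JSON worker sketch in Python. It is not Bazel's reference implementation. It assumes the experimental JSON protocol, the field names `arguments`, `exit_code`, `output`, and `request_id` from `worker_protocol.proto` (plus the `requestId` spelling that protobuf's JSON format may emit), and a hypothetical `do_work` function standing in for the real tool.

```python
import codecs
import json
import sys


def do_work(arguments):
    """Hypothetical stand-in for the real tool. Returns (exit_code, output)."""
    return 0, "processed %d argument(s)" % len(arguments)


def iter_requests(byte_stream):
    """Yield JSON objects from a byte stream that carries no length prefix."""
    decoder = json.JSONDecoder()
    utf8 = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    while True:
        chunk = byte_stream.read1(4096)  # returns as soon as some data arrives
        if not chunk:
            return  # EOF: stdin was closed
        buffer += utf8.decode(chunk)
        while True:
            buffer = buffer.lstrip()
            try:
                request, end = decoder.raw_decode(buffer)
            except ValueError:
                break  # incomplete object, wait for more bytes
            buffer = buffer[end:]
            yield request


def serve(byte_stream, out):
    """Persistent mode: answer each WorkRequest with exactly one WorkResponse."""
    for request in iter_requests(byte_stream):
        exit_code, output = do_work(request.get("arguments", []))
        # Assumption: accept either spelling of the request id field.
        request_id = request.get("requestId", request.get("request_id", 0))
        response = {
            "exit_code": exit_code,
            "output": output,
            "request_id": request_id,
        }
        out.write(json.dumps(response) + "\n")
        out.flush()  # the response must not sit in a buffer


def main(argv):
    if "--persistent_worker" in argv:
        serve(sys.stdin.buffer, sys.stdout)
        return 0
    # One-shot mode: do the work once and report through the exit code.
    exit_code, output = do_work(argv)
    print(output, file=sys.stderr)
    return exit_code
```

A real script would end with `sys.exit(main(sys.argv[1:]))`. Note that nothing except responses is ever written to `stdout`; the one-shot summary goes to `stderr`.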

### Work requests

A `WorkRequest` contains a list of arguments to the worker, a list of
path-digest pairs representing the inputs the worker can access (this isn't
enforced, but you can use this info for caching), and a request id, which is 0
for singleplex workers.

```json
{
  "arguments" : ["--some_argument"],
  "inputs" : [
    { "path" : "/path/to/my/file/1", "digest" : "fdk3e2ml23d" },
    { "path" : "/path/to/my/file/2", "digest" : "1fwqd4qdd" }
  ],
  "request_id" : 12
}
```
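
One way to use the path-digest pairs for caching is to derive a key from them. The sketch below assumes the `path` and `digest` field names of the `Input` message in `worker_protocol.proto` and assumes that the order of inputs does not affect the result. `cache_key` and `run_cached` are hypothetical helpers, not part of any Bazel API.

```python
import hashlib
import json


def cache_key(request):
    """Derive a stable key from the arguments and the (path, digest) pairs."""
    # Sort the inputs so the key does not depend on their order (assumption).
    inputs = sorted((i["path"], i["digest"]) for i in request.get("inputs", []))
    payload = json.dumps([request.get("arguments", []), inputs])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(request, run, cache):
    """Call run(request) only if no result is cached for an identical request."""
    key = cache_key(request)
    if key not in cache:
        cache[key] = run(request)
    return cache[key]
```

Because the digests change whenever a file's content changes, a stale entry is never returned for modified inputs; the cache only needs eviction to bound memory use.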

### Work responses

A `WorkResponse` contains a request id, a zero or nonzero exit
code, and an output string that describes any errors encountered in processing
or executing the request. The `output` field contains a short
description; complete logs may be written to the worker's `stderr`. Because
workers may only write `WorkResponse`s to `stdout`, it's common for the worker
to redirect the `stdout` of any tools it uses to `stderr`.

```json
{
  "exit_code" : 1,
  "output" : "Action failed with the following message:\nCould not find input file \"/path/to/my/file/1\"",
  "request_id" : 12
}
```
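
The redirection mentioned above can be sketched for a tool that runs inside the worker's own Python process. `noisy_tool` and `handle` are hypothetical names, and the response keys follow the example above.

```python
import contextlib
import sys


def noisy_tool(arguments):
    """Hypothetical in-process tool that prints progress to stdout."""
    print("compiling", " ".join(arguments))
    return 0 if arguments else 1


def handle(request, log=None):
    """Run the tool with stdout redirected, then build the WorkResponse dict."""
    log = sys.stderr if log is None else log
    # Anything the tool prints goes to the log, so stdout stays reserved
    # for WorkResponses.
    with contextlib.redirect_stdout(log):
        exit_code = noisy_tool(request.get("arguments", []))
    return {
        "exit_code": exit_code,
        "output": "" if exit_code == 0 else "noisy_tool failed; see the worker log",
        "request_id": request.get("request_id", 0),
    }
```

`contextlib.redirect_stdout` only affects Python-level writes. For a tool started as a child process, passing `stdout=sys.stderr` to `subprocess.run` has the same effect.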

As per the norm for protobufs, the fields are optional. However, Bazel requires
the `WorkRequest` and the corresponding `WorkResponse` to have the same request
id, so the request id must be specified if it is nonzero. This is a valid
`WorkResponse`:

```json
{
  "request_id" : 12
}
```

**Notes**

* Each protocol buffer is preceded by its length in `varint` format (see
  [`MessageLite.writeDelimitedTo()`](https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/MessageLite.html#writeDelimitedTo-java.io.OutputStream-)).
* JSON requests and responses are not preceded by a size indicator.
* JSON requests uphold the same structure as the protobuf, but use standard
  JSON.
* Bazel stores requests as protobufs and converts them to JSON using
  [protobuf's JSON format](https://cs.opensource.google/protobuf/protobuf/+/master:java/util/src/main/java/com/google/protobuf/util/JsonFormat.java).
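
For workers written without a protobuf library helper such as `writeDelimitedTo()`, the varint length prefix from the first note can be handled by hand. The following is a minimal sketch; the payload is assumed to be an already serialized `WorkRequest` or `WorkResponse`, and the function names are illustrative only.

```python
import io


def encode_varint(value):
    """Encode a non-negative int as a base-128 varint, low-order group first."""
    if value < 0:
        raise ValueError("lengths must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more groups follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream):
    """Read one varint from a binary stream. Returns None on a clean EOF."""
    result = 0
    shift = 0
    while True:
        raw = stream.read(1)
        if not raw:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a varint")
        result |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            return result
        shift += 7


def write_delimited(stream, payload):
    """Write the varint length followed by the serialized message bytes."""
    stream.write(encode_varint(len(payload)) + payload)
    stream.flush()


def read_delimited(stream):
    """Read one length-prefixed message. Returns None on a clean EOF."""
    size = read_varint(stream)
    if size is None:
        return None
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("stream ended inside a message")
    return payload
```

A proto worker would call `read_delimited(sys.stdin.buffer)`, parse the bytes as a `WorkRequest`, and pass the serialized `WorkResponse` to `write_delimited(sys.stdout.buffer, ...)`.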

## Making the rule that uses the worker

You'll also need to create a rule that generates actions to be performed by the
worker. Making a Starlark rule that uses a worker is just like [creating any other rule](https://github.com/bazelbuild/examples/tree/master/rules).

In addition, the rule needs to contain a reference to the worker itself, and
there are some requirements for the actions it produces.

### Referring to the worker

The rule that uses the worker needs to contain a field that refers to the worker
itself, so you'll need to create an instance of a `*_binary` rule to define
your worker. If your worker is called `MyWorker.java`, this might be the
associated rule:

```python
java_binary(
    name = "worker",
    srcs = ["MyWorker.java"],
)
```

This creates the "worker" label, which refers to the worker binary. You'll then
define a rule that *uses* the worker. This rule should define an attribute that
refers to the worker binary.

If the worker binary you built is in a package named "work", which is at the top
level of the build, this might be the attribute definition:

```python
"worker": attr.label(
    default = Label("//work:worker"),
    executable = True,
    cfg = "host",
)
```

`cfg = "host"` indicates that the worker should be built to run on your host
platform.

### Work action requirements

The rule that uses the worker creates actions for the worker to perform. These
actions have a couple of requirements.

* The _arguments_ field. This takes a list of strings, all but the last
  of which are arguments passed to the worker upon startup. The last element in
  the arguments list is a `flagfile` (@-preceded) argument. Workers read
  the arguments from the specified flagfile on a per-WorkRequest basis. Your
  rule can write non-startup arguments for the worker to this flagfile.

* The _execution-requirements_ field, which takes a dictionary containing
  `"supports-workers" : "1"`, `"supports-multiplex-workers" : "1"`, or both.

  The "arguments" and "execution-requirements" fields are required for all
  actions sent to workers. Additionally, actions that should be executed by
  JSON workers need to include `"requires-worker-protocol" : "json"` in the
  execution requirements field. `"requires-worker-protocol" : "proto"` is also
  a valid execution requirement, though it's not required for proto workers,
  since they are the default.

  You can also set a `worker-key-mnemonic` in the execution requirements. This
  may be useful if you're reusing the executable for multiple action types and
  want to distinguish actions by this worker.

* Temporary files generated in the course of the action should be saved to the
  worker's directory. This enables sandboxing.

**Note**: To pass an argument starting with a literal `@`, start the argument
with `@@` instead. If an argument is also an external repository label, it will
not be considered a flagfile argument.
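
When the tool runs in one-shot mode (without `--persistent_worker`), this sketch assumes that it receives the `@`-prefixed flagfile argument unchanged on its command line and has to expand it itself. In Python, `argparse` can do that through `fromfile_prefix_chars`. The `--output` flag and the `sources` positional are hypothetical, and the sketch does not implement the `@@` escape or the external repository label exception described in the note.

```python
import argparse


def parse_one_shot_args(argv):
    """Parse a one-shot command line, expanding any @flagfile argument."""
    # fromfile_prefix_chars makes argparse replace "@path" with the arguments
    # stored in that file, one argument per line.
    parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
    parser.add_argument("--output")  # hypothetical per-request flag
    parser.add_argument("sources", nargs="*")  # hypothetical input files
    return parser.parse_args(argv)
```

With a flagfile containing `--output=out.txt`, `a.src`, and `b.src` on separate lines, `parse_one_shot_args(["@flagfile"])` yields the same values as passing those three arguments directly.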
168
steinman38835eb2020-11-11 14:19:56 -0800169Assuming a rule definition with "worker" attribute described above, in addition
170to a "srcs" attribute representing the inputs, an "output" attribute
171representing the outputs, and an "args" attribute representing the worker
172startup args, the call to `ctx.actions.run` might be:
173
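```python
ctx.actions.run(
    inputs = ctx.files.srcs,
    # Outputs must be File objects, so use ctx.outputs rather than ctx.attr.
    outputs = [ctx.outputs.output],
    # ctx.executable gives the runnable File behind the "worker" label.
    executable = ctx.executable.worker,
    mnemonic = "someMnemonic",
    execution_requirements = {
        "supports-workers": "1",
        "requires-worker-protocol": "json",
    },
    # Startup arguments first, the flagfile argument last.
    arguments = ctx.attr.args + ["@flagfile"],
)
```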

For another example, see [Implementing persistent workers](persistent-workers.html#implementation).

## Examples

The Bazel code base uses [Java compiler workers](https://github.com/bazelbuild/bazel/blob/a4251eab6988d6cf4f5e35681fbe2c1b0abe48ef/src/java_tools/buildjar/java/com/google/devtools/build/buildjar/BazelJavaBuilder.java),
in addition to an [example JSON worker](https://github.com/bazelbuild/bazel/blob/c65f768fec9889bbf1ee934c61d0dc061ea54ca2/src/test/java/com/google/devtools/build/lib/worker/ExampleWorker.java) that is used in our integration tests.

You can use their [scaffolding](https://github.com/bazelbuild/bazel/blob/a4251eab6988d6cf4f5e35681fbe2c1b0abe48ef/src/main/java/com/google/devtools/build/lib/worker/WorkRequestHandler.java) to make any Java-based tool into a worker by passing in the correct
callback.

For an example of a rule that uses a worker, take a look at Bazel's
[worker integration test](https://github.com/bazelbuild/bazel/blob/22b4dbcaf05756d506de346728db3846da56b775/src/test/shell/integration/bazel_worker_test.sh#L106).

External contributors have implemented workers in a variety of languages; take a
look at [Polyglot implementations of Bazel persistent workers](https://github.com/Ubehebe/bazel-worker-examples).
You can [find many more examples on GitHub](https://github.com/search?q=bazel+workrequest&type=Code)!