---
layout: documentation
title: Creating persistent workers
---

# Creating Persistent Workers

[Persistent workers](persistent-workers.html) can make your build faster.
If you have repeated actions in your build that have a high startup cost or
would benefit from cross-action caching, you may want to implement your own
persistent worker to perform these actions.

The Bazel server communicates with the worker using `stdin`/`stdout`. It
supports the use of protocol buffers or JSON strings. Support for JSON is
experimental and thus subject to change. It is guarded behind the
`--experimental_worker_allow_json_protocol` flag.

The worker implementation has two parts:

* The [worker](#making-the-worker).
* The [rule that uses the worker](#making-the-rule-that-uses-the-worker).

## Making the worker

A persistent worker upholds a few requirements:

* It reads [WorkRequests](https://github.com/bazelbuild/bazel/blob/6d1b9725b1e201ca3f25d8ec2a730a20aab62c6e/src/main/protobuf/worker_protocol.proto#L35)
  from its `stdin`.
* It writes [WorkResponses](https://github.com/bazelbuild/bazel/blob/6d1b9725b1e201ca3f25d8ec2a730a20aab62c6e/src/main/protobuf/worker_protocol.proto#L49)
  (and only `WorkResponse`s) to its `stdout`.
* It accepts the `--persistent_worker` flag. The wrapper must recognize the
  `--persistent_worker` command-line flag and make itself persistent only if
  that flag is passed; otherwise, it must do a one-shot compilation and exit.

If your program upholds these requirements, it can be used as a persistent worker!
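
The three requirements above can be sketched as a small JSON-protocol worker in Python. This is a minimal illustration and not Bazel's own scaffolding: `do_work`, `iter_requests`, and `run_persistent` are hypothetical names, the key names (`arguments`, `requestId`, `exitCode`) assume protobuf's lowerCamelCase JSON mapping, and the reader assumes each request is eventually followed by a newline.

```python
import json
import sys


def do_work(arguments):
    """Hypothetical placeholder for the real tool; returns (exit_code, output)."""
    return 0, ""


def iter_requests(stream):
    """Yield JSON objects from a text stream of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    buffer = ""
    for line in stream:
        buffer += line
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                request, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # Object is incomplete so far; wait for more input.
            buffer = buffer[end:]
            yield request


def run_persistent(stdin, stdout, work=do_work):
    """Answer every request with exactly one response that echoes its id."""
    for request in iter_requests(stdin):
        exit_code, output = work(request.get("arguments", []))
        response = {
            "exitCode": exit_code,
            "output": output,
            "requestId": request.get("requestId", 0),
        }
        # Only responses go to stdout; anything else belongs on stderr.
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()


def main(argv):
    if "--persistent_worker" in argv:
        run_persistent(sys.stdin, sys.stdout)
        return 0
    # Without the flag, do a single one-shot run and exit.
    exit_code, output = do_work(argv)
    sys.stderr.write(output)
    return exit_code
```

A real entry point would call `sys.exit(main(sys.argv[1:]))`. Keeping the loop separate from `main` makes it possible to exercise the worker with in-memory streams.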

### Work requests

A `WorkRequest` contains a list of arguments to the worker, a list of
path-digest pairs representing the inputs the worker can access (this isn't
enforced, but you can use this info for caching), and a request id, which is 0
for singleplex workers. The JSON example uses the lowerCamelCase field names of
protobuf's JSON format, so the proto field `request_id` appears as `requestId`.

```json
{
  "arguments" : ["--some_argument"],
  "inputs" : [
    { "path": "/path/to/my/file/1", "digest": "fdk3e2ml23d" },
    { "path": "/path/to/my/file/2", "digest": "1fwqd4qdd" }
  ],
  "requestId" : 12
}
```
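
One way to use the path-digest pairs for caching is to fold them into a cache key, so that a request with unchanged arguments and inputs can reuse an earlier result. The sketch below is only an illustration: `inputs_fingerprint` and `cached_work` are hypothetical helpers, and it assumes each input is an object with `path` and `digest` keys, matching the `Input` message in the linked proto.

```python
import hashlib


def inputs_fingerprint(request):
    """Hash the (path, digest) pairs of a request, independent of their order."""
    entries = sorted(
        (item["path"], item["digest"]) for item in request.get("inputs", [])
    )
    hasher = hashlib.sha256()
    for path, digest in entries:
        # NUL separators keep adjacent fields from running together.
        hasher.update(path.encode("utf-8") + b"\0")
        hasher.update(digest.encode("utf-8") + b"\0")
    return hasher.hexdigest()


_results = {}  # In-memory cache that lives as long as the worker process.


def cached_work(request, work):
    """Run work(request) once per distinct (arguments, inputs) combination."""
    key = (tuple(request.get("arguments", [])), inputs_fingerprint(request))
    if key not in _results:
        _results[key] = work(request)
    return _results[key]
```

Because Bazel does not enforce the input list, a cache like this is only as reliable as the inputs the rule declares.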

### Work responses

A `WorkResponse` contains a request id, a zero or nonzero exit
code, and an output string that describes any errors encountered in processing
or executing the request. The `output` field contains a short
description; complete logs may be written to the worker's `stderr`. Because
workers may only write `WorkResponse`s to `stdout`, it's common for the worker
to redirect the `stdout` of any tools it uses to `stderr`.

```json
{
  "exitCode" : 1,
  "output" : "Action failed with the following message:\nCould not find input file \"/path/to/my/file/1\"",
  "requestId" : 12
}
```

As per the norm for protobufs, all fields are optional. However, Bazel requires
the `WorkRequest` and the corresponding `WorkResponse` to have the same request
id, so the request id must be specified if it is nonzero. This is a valid
`WorkResponse`:

```json
{
  "requestId" : 12
}
```

Notes

* Each protocol buffer is preceded by its length in `varint` format (see
  [`MessageLite.writeDelimitedTo()`](https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/MessageLite.html#writeDelimitedTo-java.io.OutputStream-)).
* JSON requests and responses are not preceded by a size indicator.
* JSON requests uphold the same structure as the protobuf, but use standard
  JSON.
* Bazel stores requests as protobufs and converts them to JSON using
  [protobuf's JSON format](https://cs.opensource.google/protobuf/protobuf/+/master:java/util/src/main/java/com/google/protobuf/util/JsonFormat.java),
  which renders field names in lowerCamelCase (`requestId` rather than
  `request_id`).
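
For workers that speak the protobuf protocol outside Java, the length prefix can be handled by hand. The following sketch shows varint length-delimited framing over a binary stream. The helper names are hypothetical, and the payload is assumed to be an already serialized `WorkRequest` or `WorkResponse` (for example, produced by generated protobuf code, which is not shown here).

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # More bytes follow.
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream):
    """Read one varint; return None on a clean end of stream."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None
            raise EOFError("stream ended inside a varint")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_delimited(stream, payload):
    """Write the payload preceded by its length as a varint."""
    stream.write(encode_varint(len(payload)) + payload)


def read_delimited(stream):
    """Read one length-prefixed payload; return None at end of stream."""
    size = read_varint(stream)
    if size is None:
        return None
    payload = stream.read(size)
    if len(payload) != size:
        raise EOFError("stream ended inside a message")
    return payload
```

In a worker, `stream` would typically be `sys.stdin.buffer` for reading and `sys.stdout.buffer` for writing, with a flush after each response.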

## Making the rule that uses the worker

You'll also need to create a rule that generates actions to be performed by the
worker. Making a Starlark rule that uses a worker is just like
[creating any other rule](https://github.com/bazelbuild/examples/tree/master/rules).

In addition, the rule needs to contain a reference to the worker itself, and
there are some requirements for the actions it produces.

### Referring to the worker

The rule that uses the worker needs to contain a field that refers to the worker
itself, so you'll need to create an instance of a `*_binary` rule to define
your worker. If your worker is called `MyWorker.java`, this might be the
associated rule:

```python
java_binary(
    name = "worker",
    srcs = ["MyWorker.java"],
)
```

This creates the "worker" label, which refers to the worker binary. You'll then
define a rule that uses the worker. This rule should define an attribute that
refers to the worker binary.

If the worker binary you built is in a package named "work", which is at the top
level of the build, this might be the attribute definition:

```python
"worker": attr.label(
    default = Label("//work:worker"),
    executable = True,
    cfg = "host",
)
```

`cfg = "host"` indicates that the worker should be built to run on your host
platform.

### Work action requirements

The rule that uses the worker creates actions for the worker to perform. These
actions have a few requirements:

* The `arguments` field. This takes a list of strings, all but the last
  of which are arguments passed to the worker upon startup. The last element in
  the `arguments` list is a flagfile (`@`-preceded) argument. Workers read
  the arguments from the specified flagfile on a per-`WorkRequest` basis. Your
  rule can write non-startup arguments for the worker to this flagfile.

* The `execution_requirements` field, which takes a dictionary containing
  `"supports-workers" : "1"`, `"supports-multiplex-workers" : "1"`, or both.

  The `arguments` and `execution_requirements` fields are required for all
  actions sent to workers. Additionally, actions that should be executed by
  JSON workers need to include `"requires-worker-protocol" : "json"` in the
  execution requirements field. `"requires-worker-protocol" : "proto"` is also
  a valid execution requirement, though it's not required for proto workers,
  since they are the default.

  You can also set a `worker-key-mnemonic` in the execution requirements. This
  may be useful if you're reusing the executable for multiple action types and
  want to distinguish the actions handled by this worker.

* Temporary files generated in the course of the action should be saved to the
  worker's directory. This enables sandboxing.

Note: To pass an argument starting with a literal `@`, start the argument
with `@@` instead. If an argument is also an external repository label, it will
not be considered a flagfile argument.
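
The rules in the note above can be read as a small classifier. The Python sketch below is illustrative only and is not Bazel's implementation: `is_flagfile_arg` is a hypothetical helper, and the pattern used to recognize an external repository label (`@repo//...`) is an assumption.

```python
import re

# Assumption: an external repository label looks like "@repo//pkg:target".
_EXTERNAL_LABEL = re.compile(r"^@[^/]*//")


def is_flagfile_arg(arg):
    """Return True if arg would be treated as an @-preceded flagfile."""
    if not arg.startswith("@"):
        return False
    if arg.startswith("@@"):
        return False  # "@@" marks an argument that starts with a literal "@".
    if _EXTERNAL_LABEL.match(arg):
        return False  # External repository labels are not flagfiles.
    return True
```

For example, `is_flagfile_arg("@flagfile")` is true, while `"@@literal"`, `"@repo//pkg:target"`, and `"--some_argument"` are all classified as ordinary arguments.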

Assuming a rule definition with the "worker" attribute described above, in
addition to a "srcs" attribute representing the inputs, an "output" attribute
representing the outputs, and an "args" attribute representing the worker
startup args, the call to `ctx.actions.run` might be:

```python
ctx.actions.run(
    inputs = ctx.files.srcs,
    outputs = [ctx.outputs.output],
    executable = ctx.executable.worker,
    mnemonic = "someMnemonic",
    execution_requirements = {
        "supports-workers": "1",
        "requires-worker-protocol": "json",
    },
    arguments = ctx.attr.args + ["@flagfile"],
)
```

For another example, see [Implementing persistent workers](persistent-workers.html#implementation).

## Examples

The Bazel code base uses [Java compiler workers](https://github.com/bazelbuild/bazel/blob/a4251eab6988d6cf4f5e35681fbe2c1b0abe48ef/src/java_tools/buildjar/java/com/google/devtools/build/buildjar/BazelJavaBuilder.java),
in addition to an [example JSON worker](https://github.com/bazelbuild/bazel/blob/c65f768fec9889bbf1ee934c61d0dc061ea54ca2/src/test/java/com/google/devtools/build/lib/worker/ExampleWorker.java)
that is used in our integration tests.

You can use their [scaffolding](https://github.com/bazelbuild/bazel/blob/a4251eab6988d6cf4f5e35681fbe2c1b0abe48ef/src/main/java/com/google/devtools/build/lib/worker/WorkRequestHandler.java)
to make any Java-based tool into a worker by passing in the correct
callback.

For an example of a rule that uses a worker, take a look at Bazel's
[worker integration test](https://github.com/bazelbuild/bazel/blob/22b4dbcaf05756d506de346728db3846da56b775/src/test/shell/integration/bazel_worker_test.sh#L106).

External contributors have implemented workers in a variety of languages; take a
look at [Polyglot implementations of Bazel persistent workers](https://github.com/Ubehebe/bazel-worker-examples).
You can [find many more examples on GitHub](https://github.com/search?q=bazel+workrequest&type=Code)!