This design document has been replaced by Skylark Remote Repositories and is maintained here just for reference
A configuration mechanism for Bazel
Status: deprecated, replaced by Skylark Remote Repositories
Author: dmarting@google.com
Bazel tooling needs special setup to work. For example, C++ crosstool configuration requires path to GCC or Java configuration requires the path to the JDK. Autodetecting those paths from Bazel would be broken because each ruleset requires its own configuration (C++ CROSSTOOL information is totally different from JDK detection or from go root detection). Therefore, providing a general mechanism to configure Bazel tooling seems natural. To have Bazel self-contained, we will ship this mechanism as an additional command of Bazel. Because this command deals with non-hermetic parts of Bazel, this command should also group all non-hermetic steps (i.e. it should fetch the dependencies from the remote repositories) so a user can run it and get on a plane with everything needed.
We consider the 3 following use-cases:
tools
package) should be checked into the version control system.tools/jdk
should be checked into the version control system). However, the user does not want to have C++ information (i.e. tools/cpp
) in the VCS.tools
directory (Google use-case).This document addresses the special case of the configuration of the tools
package, mechanisms presented here could be extended to any dependency that needs to be configured (e.g., detecting the installed libncurse) but that is out of the scope of this document.
Anywhere in this document we refer to the tools
package as the package that will receive the current tools
package content, it does not commit to keep that package name.
bazel init
should:tools
package contains the detected tool paths (gcc
, the JDKs, ...) as well as their configuration (basically the content of the current //tools
package). This package should be constructed as much as possibly automatically with a way for the user to overwrite detected settings.tools
package is used only for linking the actual tool paths but the configuration would be checked-in into the workspace (in a similar way that what is currently done in Bazel). The “hidden” tools
package could contains several versions of the same tools (e.g., jdk-8, jdk-7, ...) and the workspace link to a specific one.bazel init
could:To be efficient, when the tools
directory is missing, bazel build
should display an informative error message to actually run bazel init
.
Configuration is basically just setting a list of build constants like the path to the JDK, the list of C++ flags, etc...
When the user type bazel init
, the configuration process starts with the default configuration (e.g., configure for “gold features” such as C++, Java, sh_, ...). It should try to autodetect as much as possible. If a language configuration needs something it cannot autodetect, then it can prompt the user for the missing information and the configuration can fail if something is really wrong.
On default installation, bazel init
should not prompt the user at all. When the process finishes, the command should output a summary of the configuration. The configuration is then stored in a “hidden” directory which is similar to our current tools
package. By default, the labels in the configuration would direct to that package (always mapped as a top-level package). The “hidden:” directory would live in $(output_base)/init/tools
and be mapped using the package path mechanism. The --overwrite
option would be needed to rerun the automatic detection and overwrite everything including the eventual user-set options.
For the hermetic mode, the user has to recreate the default tools package inside the workspace. If the user has a package with the same name in the workspace, then the “hidden” directory should be ignored (--package_path).
To set a configuration option, the user would type bazel init java:jdk=/path/to/jdk
or to use the autodetection on a specific option bazel init java:jdk
. The list of settings group could be obtained by bazel init list
and the list of option with their value for a specific language by bazel init list group
. bazel init list all
could give the full configuration of all activated groups.
Prospective idea: Bazel init should explore the BUILD file to find the Skylark load
statements, determine if there is an associated init script and use it.
This section presents the support for developer that wants to add autoconfiguration for a ruleset. The developer adding a configuration would provide with a configuration script for it. This script will be in charge of creating the package in the tools directory during bazel init
(i.e., the script for Java support will construct the //tools/jdk package in the “hidden” package path).
Because of skylark rules and the fact that the configuration script should run before having access to the C++ and Java tooling, this seems unreasonable to use a compiled language (Java or C++) for this script. We could use the Skylark support to make it a subset of python or we could use a bash script. Python support would be portable since provided by Bazel itself and consistent with skylark. It also gives immediate support for manipulating BUILD files. So keeping a “skylark-like” syntax, the interface would look like:
configuration( name, # name of the tools package to configure autodetect_method, # the auto detection method generate_method, # the actual package generation load_method, # A method to load the attributes presented # to the user from the package attrs = { # List of attributes this script propose "jdk_path": String, "__some_other_path": String, # not user-settable "jdk_version": Integer, })
Given that interface, an initial run of bazel init
would do:
load_method
for each scriptautodetect_method
for each script. Replace non loaded attribute (attribute still undefined after load_method
) if and only if --rerun
option is providedgenerate_method
for each scriptSee Appendix B for examples of such methods.
bazel init
that support the configuration for native packages in Java, that is: Java, C++, genrule and test. This would create the necessary mechanisms for supporting the developer and the basic user interface. This commands will be totally in Java for now and should trigger the fetch part of the remote repository.tools/defaults/BUILD
file./etc/bazel.bazelrc
being loaded first to ~/.bazelrc
). We could use a ~/.bazel
directory to put an updatable tools cache also. This is needed because user probably want to initialize a workspace tooling on a planetools
, tools/defaults
, visibility
, external
, condition
...). We are going to make some user unhappy if they cannot have an external
directory at the top of their workspace (I would just not use a build system that goes against my workspace structure). While it is still time to do it, we should rename them with a nice naming convention. A good way to do it is to make top-package name constants, possibly settable in the WORKSPACE file (so we can actually keep the name we like but user that are bothered by that can change it).WORKSPACE
file controls the workspace, it makes sense to have the prelude_bazel
logic in it (and the load statement should support labels so that a user can actually specify remote repository labels).This is just a quick draft, please feel free to propose improvements:
# env is the environment, attrs are the values set either from the command-line # or from loading the package def autodetect_method(env, attrs): if not attrs.java_version: # If not given in the command line nor loaded attrs.java_version = 8 if not attrs.jdk_path: if env.has("JDK_HOME"): attrs.jdk_path = env.get("JDK_HOME") elif env.os = "darwin": attrs.jdk_path = system("/usr/libexec/java_home -v 1." + attrs.java_version + "+") else: attrs.jdk_path = basename(basename(readlink(env.path.find(java)))) if not attrs.jdk_path: fail("Could not find JDK home, please set it with `bazel init java:jdk_path=/path/to/jdk`") attrs.__some_other_path = first(glob(["/usr/bin/java", "/usr/local/bin/java"])) # attrs is the list of attributes. It basically contains the list of rules # we should generate in the corresponding package. Please note # That all labels are replaced by relative ones as it should not be able # to write out of the package. def generate_method(attrs): scratch_file("BUILD.jdk", """ Content of the jdk BUILD file. """) # Create binding using local_repository. This should not lie in # the WORKSPACE file but in a separate WORKSPACE file in the hidden # directory. local_repository(name = "jdk", path = attrs.jdk_path, build_file = "BUILD.jdk") bind("@jdk//jdk", "jdk") # also add a filegroup("jdk", "//external:jdk") java_toolchain(name = "toolchain", source = attrs.java_version, target = attrs.java_version) # The magic __BAZEL_*__ variable could be set so we don’t # redownload the repository if possible. This install_target # should leverage the work already done on remote repositories. # This should build and copy the result into the tools directory with # The corresponding exports_files now. install_target(__BAZEL_REPOSITORY__, __BAZEL_VERSION__, "//src/java_tools/buildjar:JavaBuilder_deploy.jar") install_target(__BAZEL_REPOSITORY__, __BAZEL_VERSION__, "//src/java_tools/buildjar:JavaBuilder_deploy.jar") copy("https://ijar_url", "ijar") # Load the package attributes. # - attrs should be written and value will be replaced by the user-provided # one if any # - query is a query object restricted to the target package and resolving label # relatively to the target package. This object should also be able to search # for repository binding # Note that the query will resolve in the actual tools directory, not the hidden # one if it exists whereas the generation only happens in the hidden one. def load_method(attrs, query): java_toolchain = query.getOne(kind("java_toolchain", "...")) if java_toolchain: attrs.jdk_version = max(java_toolchain.source, java_toolchain.target) jdk = query.getOne(attr("name", "jdk", kind("local_repository", "..."))) if jdk: attrs.jdk_path = jdk.path