Author: laurentlb@google.com
Status: unimplemented
Related: “Beautiful error messages”
This is a followup to the document “Beautiful error messages”.
The purpose of this document is to outline a design for some plumbing that will allow the sort of errors described in that document to be emitted by Blaze for loading time BUILD
file errors.
In the “Beautiful error messages” document, four characteristics of error messages are enumerated:
Context: The erroneous text in the BUILD
file in question is shown, with a caret pointing at the exact expression in question.
Colors: Everyone loves colors.
Suggestions: A guess of how the error should be fixed is shown.
Links: Documentation is referenced directly in error messages.
This document covers (1) and (3).
In the loading phase, Bazel parses BUILD
files into a tree of [ASTNode] (https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/syntax/ASTNode.java) instances, with a [BuildFileAST] (https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/syntax/BuildFileAST.java) at the root and Statement
instances at the leaves. Each statement implements doExec, which can throw an [EvalException] (https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/syntax/EvalException.java). That exception is translated to the error printed to the terminal.
An EvalException
encapsulates the information given in an error. Fleshing out the contents of this exception type is a good starting point for implementing new error features.
An EvalException
contains a reference to a Location
, which encapsulates the line/character information currently specified in a Bazel error.
Generally, an ASTNode
will have a location instance that is populated by the parser as it moves from token to token ([example] (https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/syntax/Parser.java#L607)). Right now, since the location only contains pure syntactic information, the parser [calls into the lexer] (https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/syntax/Parser.java#L413) to create the location instance. While the parser perhaps could also encode AST information into the Location
, that shouldn’t be necessary to provide “context” in the form of a printed line and a carat. It seems that the lexer maintains BUILD
file info as a buffer, and should be able to parameterize Location
instances with the actual contents of the line in question. If this is true, then implementing (1) above shouldn’t involve much more than fleshing out [LocationInfo] (https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/syntax/Lexer.java#L72).
While data tracked by the lexer should be sufficient to encode an offending line from a BUILD
file into an error, providing suggestions will probably require semantic information only retrievable from the parsed AST. Furthermore, we require a mechanism in EvalException
to perform computation on AST data in order to generate suggestions.
It seems an unlikely solution to encode AST information in Location
instances, since those instances are produced by the parser before the AST, or even the AST node in question, is necessarily complete.
Instead, here are a couple proposals:
An abstract subclass of EvalException
(e.g. ContextualEvalException
) that knows how to create an error message with suggestions given unimplemented suggestion generation logic.
A further group of exceptions that are parameterized with a particular sort of ASTNode
and know how to generate suggestions. As an example, this exception type could report typos in rule names.
public class IdentifierEvalException extends ContextualEvalException { public static final Map<String, String> TYPOS = Map.of( "cclibrary", "cc_library", ect... ) public IdentifierEvalException(Identifier identifier, Location loc) {...} @Override protected Suggestion generateSuggestion() { if (TYPOS.keySet().contains(identifier.getName()) { return TypoSuggestion(identifier, TYPOS.get(identifier.getName())); } } }
Once the plumbing around (1) is in place, we can add subclasses at our leisure to provide suggestions.
Furthermore, a ContextualEvalException
can be made to have enough information to not only provide suggestions, but also to return context-aware error messages. Consider this example, from the “Beautiful error messages” document:
my_obj = select({ ":something": [1], "other": [2], }) t = [x for x in my_obj]
We can imagine a NotIterableEvalException
that knows not only about the type select
, but is also parameterized with the erroneous expression my_objc
.
The above proposal hinges on the ability to store a file pointer in the Location
object, to be dereferenced at error-time to obtain the entire BUILD
file. This opens the door to some issues:
Location
instance is stored for every node in the parse tree of every BUILD
file. Even a file pointer in each Location
may have substantial memory/speed impact.Location
object is serialized, since the AST is part of a SkyValue
. This pointer, then, must be serializable.