# How to monitor for Bazel regressions
This is a guide for the Bazel build sheriff about monitoring the Bazel CI
(Continuous Integration) projects and jobs.
# The CI dashboard
URL: https://ci.bazel.build/view/Dashboard/
The dashboard gives a quick overview of the Bazel CI's health.
We monitor:
* Bazel's own jobs (owned by the core Bazel team):
- [bazel-tests](https://ci.bazel.build/job/bazel-tests)
- [bazel-slow-tests](https://ci.bazel.build/job/bazel-slow-tests)
- [bazel-remote-tests](https://ci.bazel.build/job/bazel-remote-tests)
- [Tutorial](https://ci.bazel.build/job/Tutorial)
- [nightly](https://ci.bazel.build/job/bazel/job/nightly)
- [release](https://ci.bazel.build/job/bazel/job/release)
* Projects built using Bazel:
- repositories on the bazelbuild GitHub organisation, e.g. rules\_web
- TensorFlow
- Gerrit
- protobuf
- re2
- ...
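
As a rough illustration of checking "green" programmatically rather than on the dashboard, the sketch below reads the `color` field that Jenkins's JSON remote API reports for a job. It assumes that ci.bazel.build exposes the standard `/api/json` endpoint and the default Jenkins colour names (where `blue` means success); the helper names `job_api_url` and `is_green` are hypothetical and not part of the Bazel CI.

```python
import json


def job_api_url(job_url: str) -> str:
    """Return the JSON API URL for a Jenkins job page URL.

    Assumption: the server exposes the standard Jenkins remote API.
    """
    return job_url.rstrip("/") + "/api/json"


def is_green(job_json: str) -> bool:
    """Return True if the job description reports a successful last build.

    Assumption: default Jenkins colour names, where "blue" means success.
    """
    color = json.loads(job_json).get("color", "")
    # A build in progress carries an "_anime" suffix; strip it so that the
    # result of the last completed build is what gets compared.
    if color.endswith("_anime"):
        color = color[: -len("_anime")]
    return color == "blue"


if __name__ == "__main__":
    # The payload would normally be fetched from job_api_url(...);
    # a literal is used here so the sketch runs offline.
    print(job_api_url("https://ci.bazel.build/job/bazel-tests"))
    print(is_green('{"color": "red"}'))
```

Fetching the payload (and any authentication it needs) is deliberately left out, since that depends on how the Jenkins instance is configured.
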
If Bazel's own jobs are not green, the Bazel team must:
1. investigate
2. fix as soon as possible
If the other projects are not green:
1. report it to the project owners
2. deactivate the project if it stays broken for more than a week
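
The two rules above can be sketched as a small decision function. This is only an illustration of the policy as written here; the function name `next_actions`, the `days_broken` parameter, and the action strings are hypothetical, and the job set simply mirrors the list of Bazel's own jobs earlier on this page.

```python
from typing import List

# Bazel's own jobs, as listed earlier on this page.
BAZEL_OWN_JOBS = {
    "bazel-tests",
    "bazel-slow-tests",
    "bazel-remote-tests",
    "Tutorial",
    "nightly",
    "release",
}

# "More than a week", expressed in days.
MAX_BROKEN_DAYS = 7


def next_actions(job: str, green: bool, days_broken: int = 0) -> List[str]:
    """Return the sheriff's next steps for one job (hypothetical helper)."""
    if green:
        return []
    if job in BAZEL_OWN_JOBS:
        # The Bazel team owns these jobs, so it fixes them itself.
        return ["investigate", "fix as soon as possible"]
    # Other projects: report first, deactivate only after a week of breakage.
    actions = ["report to project owners"]
    if days_broken > MAX_BROKEN_DAYS:
        actions.append("deactivate project")
    return actions


if __name__ == "__main__":
    print(next_actions("bazel-tests", green=False))
    print(next_actions("TensorFlow", green=False, days_broken=8))
```
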
# Triaging failures
The build sheriff should monitor the outputs of these types of jobs:
* [global tests](user.md#global-jobs)
* [postsubmits](user.md#postsubmit)
## Global tests
URLs:
* nightly: https://ci.bazel.build/job/bazel/job/nightly
* release: https://ci.bazel.build/job/bazel/job/release
When these jobs run:
* [nightly](https://ci.bazel.build/job/bazel/job/nightly): runs every night and
can be re-run on demand using the Run button in Jenkins (you need to log in
to the Jenkins UI)
* [release](https://ci.bazel.build/job/bazel/job/release): runs at every push and
is always green for non-release pushes
How to investigate: see the [user guide](user.md#global-jobs).
When global tests fail badly:
1. [file a bug against bazelbuild/bazel](https://github.com/bazelbuild/bazel/issues/new)
2. add the "breakage" label to the bug
3. add the "release blocker" label if the breakage is on the release job
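
The labelling steps above could be sketched as follows. The code picks the labels for a failing global job and builds a pre-filled "new issue" link. It assumes that GitHub's issue form accepts `title` and `labels` query parameters and that the labels are named exactly "breakage" and "release blocker"; the helper names `breakage_labels` and `new_issue_url` are hypothetical.

```python
from typing import List
from urllib.parse import urlencode

NEW_ISSUE_URL = "https://github.com/bazelbuild/bazel/issues/new"


def breakage_labels(job: str) -> List[str]:
    """Return the labels to put on a bug for a failing global job."""
    labels = ["breakage"]
    if job == "release":
        # Only breakages of the release job block a release.
        labels.append("release blocker")
    return labels


def new_issue_url(job: str, title: str) -> str:
    """Build a link that opens the issue form with title and labels set.

    Assumption: GitHub pre-fills the form from these query parameters.
    """
    query = urlencode({"title": title, "labels": ",".join(breakage_labels(job))})
    return f"{NEW_ISSUE_URL}?{query}"


if __name__ == "__main__":
    print(new_issue_url("release", "release job is failing"))
```

The link only pre-fills the form; the sheriff still reviews and submits the bug by hand.
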
## Postsubmits
These are all the other monitored jobs.
How to investigate: see the [user guide](user.md#presubmit).
When a postsubmit job fails:
1. report to the project owner (e.g. Bazel team for "bazel-tests")
2. deactivate the job partially or totally if the failure persists for too long