Handling broken master pipelines
We currently run nightly pipelines for building both CE and EE packages in our Release mirror. This mirror is configured to send pipeline failure notifications to the #g_distribution channel on Slack. A broken master pipeline gets priority over other scheduled work as per our development guidelines.
dependency_scanning job failed due to one of the dependencies being reported as vulnerable
- Check the job log and find out which component is marked Vulnerable.
- Open a confidential issue in the omnibus-gitlab issue tracker giving details about the vulnerability and a link to the failed job.
- Label the issue with the security and For Scheduling labels. The GitLab Security team will be made aware of the issue through the automation put in place by escalator.
- Once the issue has been filed, ask a Maintainer of the project to add the CVE to the CVEIGNORE environment variable defined in the project settings of the Release mirror. This ensures the master pipeline does not keep failing and flooding the Slack channel with notifications while the issue is triaged based on severity and priority.
- The Security team, with the help of Distribution, triages the issue and schedules it accordingly.
  - If the issue turns out to be a no-op for our use case, open an MR adding the CVE to the .cveignore file.
  - If the issue turns out to be actionable for us, it goes through the regular scheduling process based on its severity and priority, and gets the necessary MRs (targeting master and the relevant stable branches for backports).
- Ensure the entry is removed from the CVEIGNORE variable once the MRs have been merged. This covers the false-negative edge case where a vulnerability affects multiple components and an MR fixed only one of them. Removing an item from the .cveignore file can be done through a public MR to the Omnibus repository.
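The first step above can be done from the command line once the job log has been downloaded. The following is a minimal sketch, assuming the log was saved locally (the file name job.log and the function name find_vulnerable are hypothetical) and that affected components are flagged with the word "Vulnerable" as described above. The actual log layout may differ.

```shell
#!/bin/sh
# Print every line of a saved job log that mentions "vulnerable",
# prefixed with its line number. Matching is case-insensitive.
# Assumption: the job log has been downloaded to a local file.
find_vulnerable() {
  log_file="$1"
  grep -n -i 'vulnerable' "$log_file"
}

# Usage (job.log is a hypothetical file name):
#   find_vulnerable job.log
```

The matching lines can then be copied into the confidential issue together with the link to the failed job.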
Raspberry Pi jobs timed out in pending state waiting to be scheduled
From time to time, the Scaleway driver for docker-machine fails to properly provision and de-provision machines. This results in new machines not being spun up for builds, and the jobs end up timing out while waiting for a machine.
- Follow the maintenance documentation and delete all the machines that are not running.
  - If machines are present in the Off state (gray icon), you can manually batch delete them.
- Ensure new machines are being started up.
- Retry the failed jobs (only Maintainers can do this) and ensure they get picked up by a machine.
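If the docker-machine CLI is available on the runner manager, the machines that are not running can also be listed from the shell before anything is deleted. This is only a sketch under that assumption, and the maintenance documentation remains the reference procedure. The function names and the DOCKER_MACHINE override variable are hypothetical helpers, not part of any existing tooling.

```shell
#!/bin/sh
# List the names of all docker-machine hosts whose state is not "Running".
# DOCKER_MACHINE may be set to point at a different binary (hypothetical override).
list_not_running() {
  "${DOCKER_MACHINE:-docker-machine}" ls --format '{{.Name}} {{.State}}' |
    awk '$2 != "Running" { print $1 }'
}

# Print (but do not execute) a removal command for each such machine,
# so the list can be reviewed before anything is deleted.
print_cleanup_commands() {
  list_not_running | while read -r name; do
    echo "docker-machine rm -y $name"
  done
}
```

Printing the commands instead of running them keeps the deletion a deliberate, reviewed step.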
Jobs are stuck due to no runners being active
This is a transient error due to connection issues between the runner manager machine and dev.gitlab.org.
- Sign in to the runner manager machine.
- Run the following command to force a connection between the runner and GitLab:

    sudo gitlab-runner verify
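
When the error recurs, it can help to keep the command output for later inspection. A small sketch, assuming it is run on the runner manager machine. The function name run_verify, the default log file name verify.log, and the SUDO override variable are hypothetical.

```shell
#!/bin/sh
# Run the verification and save a copy of its output to a log file
# while still showing it on the terminal.
# SUDO may be overridden where a different privilege wrapper is used.
run_verify() {
  log_file="${1:-verify.log}"
  ${SUDO:-sudo} gitlab-runner verify 2>&1 | tee "$log_file"
}

# Usage:
#   run_verify              # writes verify.log in the current directory
#   run_verify /tmp/v.log   # writes to a chosen path
```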