- Problem statement
- Keeping the CI mirrored tables in sync with the source tables
- Consistency Checking
CI mirrored tables
Problem statement
As part of the database decomposition work,
which had the goal of splitting the single database GitLab is using, into two databases: main
and
ci
, came the big challenge of
removing all joins between the main
and the ci
tables.
That is because PostgreSQL doesn’t support joins between tables that belong to different databases.
However, some core application models in the main database are queried very often by the CI side.
For example:
-
Namespace
, in thenamespaces
table. -
Project
, in theprojects
table.
Not being able to do joins
on these tables brings a great challenge. The team chose to perform logical
replication of those tables from the main database to the CI database, in the new tables:
-
ci_namespace_mirrors
, as a mirror of thenamespaces
table -
ci_project_mirrors
, as a mirror of theprojects
table
This logical replication means two things:
- The
main
database tables can be queried and joined to thenamespaces
andprojects
tables. - The
ci
database tables can be joined with theci_namespace_mirrors
andci_project_mirrors
tables.
This replication was restricted only to a few attributes that are needed from each model:
- From
Namespace
we replicatetraversal_ids
. - From
Project
we replicate only thenamespace_id
, which represents the group which the project belongs to.
Keeping the CI mirrored tables in sync with the source tables
We must care about two type 3 events to keep the source and the target tables in sync:
- Creation of new namespaces or projects.
- Updating the namespaces or projects.
- Deleting namespaces/projects.
Create and update
Syncing the data of newly created or updated namespaces or projects happens in this order:
-
On the
main
database: AnyINSERT
orUPDATE
on thenamespaces
orprojects
tables adds an entry to the tablesnamespaces_sync_events
, andprojects_sync_events
. These tables also exist on themain
database. These entries are added by triggers on both of the tables. -
On the model level: After a commit happens on either of the source models
Namespace
orProject
, it schedules the corresponding Sidekiq jobsNamespaces::ProcessSyncEventsWorker
orProjects::ProcessSyncEventsWorker
to run. - These workers then:
- Read the entries from the tables
(namespaces/project)_sync_events
from themain
database, to check which namespaces or projects to sync. - Copy the data for any updated records into the target
tables
ci_namespace_mirrors
,ci_project_mirrors
.
- Read the entries from the tables
Delete
When any of namespaces
or projects
are deleted, the target records on the mirrored
CI tables are deleted using the loose foreign keys (LFK) mechanism.
By having these items in the config/gitlab_loose_foreign_keys.yml
, the LFK mechanism
was already working as expected. It deleted any records on the CI mirrored
tables that mapped to deleted namespaces
or projects
in the main
database.
ci_namespace_mirrors:
- table: namespaces
column: namespace_id
on_delete: async_delete
ci_project_mirrors:
- table: projects
column: project_id
on_delete: async_delete
Consistency Checking
To make sure that both syncing mechanisms work as expected, we deploy two extra worker jobs, triggered by cron jobs every few minutes:
-
Database::CiNamespaceMirrorsConsistencyCheckWorker
-
Database::CiProjectMirrorsConsistencyCheckWorker
These jobs:
- Scan both of the source tables on the
main
database, using a cursor. - Compare the items in the
namespaces
andprojects
with the target tables on theci
database. - Report the items that are not in sync to Kibana and Prometheus.
- Corrects any discrepancies.