mwcli CI in Wikimedia GitLab (docker in docker)

November 1, 2021 By addshore

mwcli is a golang CLI tool that I have been working on over the past year to replace the mediawiki-docker-dev development environment that I accidentally created a few years back (among other things). I didn’t start the CLI, but I did add this mediawiki-docker-dev-like functionality.

At some point during the development journey it became clear that one of the ways to set the new and old environments apart would be through some rigorous CI and testing.

This started with CI running on a Qemu node as part of the shared Wikimedia Jenkins CI infrastructure that is hooked up to Gerrit, where the code was being developed. This ended up being quite slow, and involved lots of manual steps.

The next iteration saw the majority of development take place in my own fork on GitHub, making use of GitHub Actions. Changes would then be copied over to Gerrit for final review once CI tests had run.

And finally the repository moved to the new Wikimedia GitLab instance (work in progress), where I could make use of GitLab Runners powered by a machine in Wikimedia Cloud VPS.

Screenshot of GitLab pipelines in action for the mwcli project

Overview

I have a dedicated Cloud VPS project for the machines used as runners for the mwcli project (T294283). Currently 2 runners are configured, each with 4 cores, 8GB memory and a 20GB disk, running Debian Buster.

The runners make use of Docker in docker, which is one of the documented ways to use the docker executor per the GitLab documentation. I haven’t done a full review of the possible security implications of this approach yet, and it should be noted that the virtual machines only run CI for this one project, and only members of the project have the ability to run the CI.

Installation

You need docker installed. You can follow the docker install guide, or do something like this…

sudo apt-get update
sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
  "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

You also need the GitLab Runner software installed. There is an install guide, and it looks something like this…

curl -LJO "https://gitlab-runner-downloads.s3.amazonaws.com/latest/deb/gitlab-runner_amd64.deb"
sudo dpkg -i gitlab-runner_amd64.deb
rm gitlab-runner_amd64.deb

Registration

Once everything is installed, you are ready to register the runner, and connect it to the GitLab instance and project.

Head to Settings >> CI/CD on your project. Under the “Runners” section you should find a “registration token” which you’ll need to use on the runner.

This token can be used with the gitlab-runner register command, along with a user-provided name and some other options such as --limit, which limits the number of jobs that the runner can run at once.

sudo gitlab-runner register -n \
    --url https://gitlab.wikimedia.org/ \
    --registration-token xxxxxxxxxxxxxxxxxxxxxxx \
    --executor docker \
    --limit 3 \
    --name "gitlab-runner-addshore-1012-docker-01" \
    --docker-image "docker:19.03.15" \
    --docker-privileged \
    --docker-volumes "/certs/client"

You should now see the runner appear in the GitLab UI.

Further Configuration

Concurrency

Although we specified a limit of 3 jobs for the runner when registering it, this is only per-runner configuration. A single node can have multiple runners of multiple types (or of the same type), so there is also a node-wide (global) concurrency setting that needs to be changed.

sudo sed -i 's/^concurrent =.*/concurrent = 3/' "/etc/gitlab-runner/config.toml"
sudo systemctl restart gitlab-runner
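To make the difference between the two settings concrete, here is a minimal sketch that applies the same substitution to a scratch file instead of the real config. The file contents are a made-up, trimmed-down config.toml used purely for illustration; only the `concurrent` and `limit` keys and the runner name come from this post.

```shell
set -eu

# Work on a scratch file so nothing under /etc is touched.
# The contents are an illustrative, cut-down config.toml, not a real one.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
concurrent = 1
check_interval = 0

[[runners]]
  name = "gitlab-runner-addshore-1012-docker-01"
  limit = 3
  executor = "docker"
EOF

# Same substitution as the command above, pointed at the scratch file.
sed -i 's/^concurrent =.*/concurrent = 3/' "$cfg"

# The global cap lives at the top level; the per-runner cap lives under [[runners]].
global_cap="$(sed -n 's/^concurrent = //p' "$cfg")"
runner_cap="$(sed -n 's/^ *limit = //p' "$cfg")"
echo "global concurrent=${global_cap}, runner limit=${runner_cap}"
```

With both values at 3, the single runner can actually use all of its slots. If `concurrent` were left at 1, the node would run one job at a time regardless of the runner's own limit.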

Docker mirror

If your CI will make use of images from Docker Hub or any other registry that imposes limits, or if you want to speed up CI, you may want to run and register a local docker mirror.

Again, you can follow a blog post for setup here, or do something like this…

Create the mirror in a container…

sudo docker run -d -p 6000:5000 \
    -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
    --restart always \
    --name registry registry:2

Get the IP address of the host…

hostname --ip-address

Add the mirror to the docker daemon config…

# sudo does not apply to a shell redirect, so write the file through tee instead
echo '{"registry-mirrors": ["http://<CUSTOM IP>:<PORT>"]}' | sudo tee /etc/docker/daemon.json > /dev/null
sudo service docker restart
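If you would rather not hand-edit the placeholders, something like the following could generate the file body from the IP and port. This is only a sketch: `mirror_daemon_json` is a hypothetical helper name, 10.0.0.5 is a placeholder address, and 6000 is the host port published by the registry container above.

```shell
set -eu

# Print a daemon.json body that configures a single registry mirror.
# Arguments: mirror host or IP, mirror port.
mirror_daemon_json() {
  printf '{"registry-mirrors": ["http://%s:%s"]}\n' "$1" "$2"
}

# Write to a scratch file first so the result can be inspected.
# 10.0.0.5 is a placeholder; substitute the address of your own host.
out="$(mktemp)"
mirror_daemon_json "10.0.0.5" "6000" > "$out"
cat "$out"
```

Once the output looks right it can be piped through `sudo tee /etc/docker/daemon.json`. Note that this overwrites the whole file, so any settings already in daemon.json would need merging by hand. After restarting docker, `docker system info` should list the mirror in its registry mirrors section.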

And also register it in the runner config, which you should find at /etc/gitlab-runner/config.toml (see these docs for why this is also needed)

[[runners.docker.services]]
  name = "docker:19.03.15-dind"
  command = ["--registry-mirror", "http://<CUSTOM IP>:<PORT>"]

Finally restart the runner one last time…

sudo systemctl restart gitlab-runner

Example CI

You could then configure some very basic jobs using the GitLab CI configuration file for the project.

image: docker:19.03.15

variables:
  DOCKER_TLS_CERTDIR: "/certs"

services:
  - name: docker:19.03.15-dind

docker_system_info:
  only:
    - web
  stage: check
  script:
    - docker system info

docker_hub_quota_check:
  only:
    - web
  stage: check
  image: alpine:latest
  before_script:
    - apk add curl jq
  script:
    - |
      TOKEN=$(curl "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq --raw-output .token) && curl --head --header "Authorization: Bearer $TOKEN" "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest" 2>&1
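The quota check job just dumps the raw response headers into the job log. If you want a one-line summary instead, a small filter along these lines could pull the numbers out. It is a sketch that assumes the response carries the `ratelimit-limit` and `ratelimit-remaining` headers that Docker Hub documents; `parse_ratelimit` is a hypothetical helper name and the sample values below are made up.

```shell
set -eu

# Read HTTP response headers on stdin and print "<remaining> of <limit>".
# Header values look like "100;w=21600": a count, then the window in seconds.
parse_ratelimit() {
  awk -F': *' '
    { sub(/\r$/, "") }   # strip the carriage return HTTP puts on each line
    tolower($1) == "ratelimit-limit"     { split($2, a, ";"); limit = a[1] }
    tolower($1) == "ratelimit-remaining" { split($2, a, ";"); remaining = a[1] }
    END { print remaining " of " limit }
  '
}

# Illustrative headers only, not captured from a real response.
result="$(printf 'HTTP/1.1 200 OK\r\nratelimit-limit: 100;w=21600\r\nratelimit-remaining: 76;w=21600\r\n\r\n' | parse_ratelimit)"
echo "pulls remaining: ${result}"
```

In the job itself the final `curl --head …` call would be piped into the filter in place of the sample headers.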

Gotchas & Reading

  • The Wikimedia GitLab instance is still currently a work in progress.
  • If using images from Docker Hub the limit can be annoying. As well as a mirror there is also documentation for providing a key for Docker Hub or another registry. (T288377)
  • Depending on your CI, 20GB of disk can fill up quite quickly. While running at a concurrency of 4 I would occasionally hit disk limitations.
  • When people open merge requests from forks, CI will not and cannot run using the project runners.
  • Default caching is done per project, per runner, per job / concurrency slot. This can lead to a lot of duplication unless a shared cache is used!
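
On the disk space point above, a periodic check like the following could help keep a runner from filling up. It is a rough sketch rather than something taken from my setup: the `disk_used_percent` helper, the `THRESHOLD` default of 80 and the `DO_PRUNE` switch are all hypothetical, and it only prints the prune command unless told otherwise.

```shell
set -eu

# Print the percentage of space used on the filesystem holding the given path.
disk_used_percent() {
  df -P "${1:-/}" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

THRESHOLD="${THRESHOLD:-80}"   # hypothetical cut-off, tune to taste
used="$(disk_used_percent /)"
echo "disk used: ${used}%"

if [ "$used" -ge "$THRESHOLD" ]; then
  if [ "${DO_PRUNE:-0}" = "1" ]; then
    # Removes stopped containers, unused networks, dangling images and build cache.
    docker system prune --force
  else
    echo "over threshold; would run: docker system prune --force"
  fi
fi
```

Pruning on a live runner throws away cached layers that jobs may want again, so there is a trade-off between disk headroom and CI speed.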