mwcli CI in Wikimedia GitLab (docker in docker)
mwcli is a Golang CLI tool that I have been working on over the past year to replace the mediawiki-docker-dev development environment that I accidentally created a few years back (among other things). I didn’t start the CLI, but I did add this mediawiki-docker-dev-like functionality.
At some point in the development journey it became clear that one way to set the new environment apart from the old one would be rigorous CI and testing.
This started with CI running on a QEMU node as part of the shared Wikimedia Jenkins CI infrastructure that is hooked up to Gerrit, where the code was being developed. This ended up being quite slow, and involved lots of manual steps.
The next iteration saw the majority of development take place in my own fork on GitHub, making use of GitHub Actions. Changes would then be copied over to Gerrit for final review once CI tests had run.
And finally the repository moved to the new Wikimedia GitLab instance (work in progress), where I could make use of GitLab Runners powered by a machine in Wikimedia Cloud VPS.
Overview
I have a dedicated Cloud VPS project for the machines used as runners for the mwcli project (T294283). Currently 2 runners are configured, each with 4 cores, 8GB memory and a 20GB disk, running Debian Buster.
The runners make use of Docker in docker, which is one of the documented ways to use the docker executor per the GitLab documentation. I haven’t done a full review of the possible security implications of this approach yet, and it should be noted that the virtual machines only run CI for this one project, and only members of the project are able to run the CI.
Installation
You need docker installed. You can follow the docker install guide, or do something like this…
sudo apt-get update
sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
"deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
You also need GitLab Runner installed. There is an install guide, and it looks something like this…
curl -LJO "https://gitlab-runner-downloads.s3.amazonaws.com/latest/deb/gitlab-runner_amd64.deb"
sudo dpkg -i gitlab-runner_amd64.deb
rm gitlab-runner_amd64.deb
Registration
Once everything is installed, you are ready to register the runner, and connect it to the GitLab instance and project.
Head to Settings >> CI/CD on your project. Under the “Runners” section you should find a “registration token” which you’ll need to use on the runner.
This token can be used with the gitlab-runner register command, along with a user-provided name and some other options such as --limit, which limits the number of jobs that the runner can run at once.
sudo gitlab-runner register -n \
    --url https://gitlab.wikimedia.org/ \
    --registration-token xxxxxxxxxxxxxxxxxxxxxxx \
    --executor docker \
    --limit 3 \
    --name "gitlab-runner-addshore-1012-docker-01" \
    --docker-image "docker:19.03.15" \
    --docker-privileged \
    --docker-volumes "/certs/client"
You should now see the runner appear in the GitLab UI.
Further Configuration
Concurrency
Although we specified a limit of 3 jobs for the runner when registering it, this is only per-runner configuration. A single node can have multiple runners of multiple types (or of the same type), so there is also a node-wide (global) concurrency setting that needs to be changed.
sudo sed -i 's/^concurrent =.*/concurrent = 3/' "/etc/gitlab-runner/config.toml"
sudo systemctl restart gitlab-runner
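If you want to see what that sed expression does before touching the real file, the same substitution can be tried on a throwaway copy. This is only a minimal sketch, assuming GNU sed (as shipped on Debian) and a config that still carries `concurrent = 1`; the `set_concurrent` function name and the temporary file are hypothetical and exist purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper: set the global "concurrent" value in a runner config file.
# Usage: set_concurrent <file> <number>
set_concurrent() {
    # Same substitution as above, parameterised on file and value (GNU sed -i).
    sed -i "s/^concurrent =.*/concurrent = $2/" "$1"
}

# Try it on a throwaway file rather than /etc/gitlab-runner/config.toml.
tmp_config=$(mktemp)
printf 'concurrent = 1\ncheck_interval = 0\n' > "$tmp_config"
set_concurrent "$tmp_config" 3

# Show the resulting line.
grep '^concurrent' "$tmp_config"
```

Only the line starting with `concurrent =` is rewritten; the `[[runners]]` sections, including the per-runner `limit`, are left alone.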
Docker mirror
If your CI will make use of images from Docker Hub or any other registry that imposes limits, or if you want to speed up CI, you may want to run and register a local docker mirror.
Again, you can follow a blog post for setup here, or do something like this…
Create the mirror in a container…
sudo docker run -d -p 6000:5000 \
    -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
    --restart always \
    --name registry registry:2
Get the IP address of the host…
hostname --ip-address
Add the mirror to the docker daemon config (the redirect has to run as root, hence tee rather than sudo echo)…
echo '{"registry-mirrors": ["http://<CUSTOM IP>:<PORT>"]}' | sudo tee /etc/docker/daemon.json > /dev/null
sudo service docker restart
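To avoid hand-editing the placeholders, the JSON can be generated from the address and port. A small sketch, not part of my actual setup: the `mirror_daemon_json` function name and the 192.0.2.10 address are made up for illustration, and 6000 is simply the port published by the registry container above.

```shell
#!/bin/sh
# Hypothetical helper: print a daemon.json body pointing at a registry mirror.
# Usage: mirror_daemon_json <ip> <port>
mirror_daemon_json() {
    printf '{"registry-mirrors": ["http://%s:%s"]}\n' "$1" "$2"
}

# Placeholder address for illustration; on the runner you would pass the
# address reported by `hostname --ip-address` instead.
mirror_daemon_json 192.0.2.10 6000
```

On the runner itself the output could be piped into `sudo tee /etc/docker/daemon.json`. Note that this overwrites any existing daemon.json, so it only suits a machine with no other daemon settings.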
And also register it in the runner config, which you should find at /etc/gitlab-runner/config.toml (see these docs for why this is also needed)
[[runners.docker.services]]
name = "docker:19.03.15-dind"
command = ["--registry-mirror", "http://<CUSTOM IP>:<PORT>"]
Finally restart the runner one last time…
sudo systemctl restart gitlab-runner
Example CI
You could then configure some very basic jobs using the GitLab CI configuration file for the project.
image: docker:19.03.15

# The jobs below use a custom "check" stage, which has to be declared.
stages:
  - check

variables:
  DOCKER_TLS_CERTDIR: "/certs"

services:
  - name: docker:19.03.15-dind

# Prints details of the docker daemon the job is talking to.
docker_system_info:
  only:
    - web
  stage: check
  script:
    - docker system info

# Reports the current Docker Hub pull rate limit for the runner.
docker_hub_quota_check:
  only:
    - web
  stage: check
  image: alpine:latest
  before_script:
    - apk add curl jq
  script:
    - |
      TOKEN=$(curl "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq --raw-output .token) && curl --head --header "Authorization: Bearer $TOKEN" "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest" 2>&1
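The quota check job prints raw response headers, and the interesting ones are `ratelimit-limit` and `ratelimit-remaining`. If you would rather see a one-line summary, something like the following could be piped onto the end of the curl command. It is a sketch only: the `ratelimit_summary` function name is hypothetical, and the sample headers are illustrative values rather than output from my runners.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers on stdin and print the
# pull limit and remaining pulls taken from the ratelimit-* headers.
ratelimit_summary() {
    # Strip carriage returns, then match header names case-insensitively.
    # Header values look like "100;w=21600"; keep only the part before ";".
    tr -d '\r' | awk -F': ' '
        tolower($1) == "ratelimit-limit"     { split($2, a, ";"); limit = a[1] }
        tolower($1) == "ratelimit-remaining" { split($2, a, ";"); remaining = a[1] }
        END { print "remaining=" remaining " limit=" limit }
    '
}

# Illustrative input only; real values depend on your account and IP address.
printf 'HTTP/1.1 200 OK\r\nratelimit-limit: 100;w=21600\r\nratelimit-remaining: 76;w=21600\r\n\r\n' | ratelimit_summary
```

The `w=21600` part of each header is the window in seconds (6 hours), which the sketch simply drops.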
Gotchas & Reading
- The Wikimedia GitLab instance is still currently a work in progress.
- If using images from Docker Hub the limit can be annoying. As well as a mirror there is also documentation for providing a key for Docker Hub or another registry. (T288377)
- Depending on your CI, 20GB of disk can fill up quite quickly. While running at a concurrency of 4 I would occasionally hit disk limitations.
- When people open merge requests from forks, CI will not (and cannot) run using the project runners.
- Default caching is done per project, per runner, per job / concurrency slot. This can lead to a lot of duplication unless a shared cache is used!
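For the disk space gotcha, a periodic check on the runner can at least tell you when it is time to clean up. This is a rough sketch and not something I currently run: the `used_percent` function name and the 80% threshold are hypothetical, and it checks `/` for simplicity, whereas docker data normally lives under /var/lib/docker.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P <path>` output on stdin and print the
# used percentage of the first listed filesystem as a bare number.
used_percent() {
    # Line 2 is the first filesystem; column 5 is the capacity, e.g. "90%".
    awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Hypothetical threshold; tune it to your jobs and disk size.
threshold=80
used=$(df -P / | used_percent)
if [ "$used" -ge "$threshold" ]; then
    echo "disk at ${used}%, consider: docker system prune --all --force"
fi
```

`docker system prune --all --force` removes all unused images, stopped containers and unused networks without prompting, so the next jobs will have to pull their images again (the mirror helps here).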
Hi Adam, take a look at Kaniko https://docs.gitlab.com/ee/ci/docker/using_kaniko.html instead of privileged mode. I helped on that official documentation back in the day.
Also GitLab has some common job definitions like “.use-kaniko” that might help and you can browse the global.gitlab-ci.yml file for further comments. https://docs.gitlab.com/ee/development/pipelines.html#common-job-definitions
I think the WMF GitLab install will slowly move toward runners being on a k8s cluster, but it’s not there yet.
Also for mwcli the issue is not building docker images, but rather being able to run tests against a binary that in turn creates docker containers in CI.
Lots of the docs in this space that pitch alternatives to docker in docker keep saying “Docker-in-Docker requires privileged mode”, but as far as I know that is no longer actually true either!
But, these are my baby months in the space of GitLab runners!