Infrastructure as Code for wbstack deployments

For most of its life wbstack was mostly a one-man operation. This certainly sped up decision making around features, requests, communication and prioritization, but I also had to maintain a complex and young project supporting hundreds of sites alongside my regular 8 hour day job.

To make sure that I’d feel comfortable with this extra context, could support the platform for multiple years, had a platform that could grow and scale from day one, and left the future of the platform as open as possible, I roughly followed a few principles throughout implementation and operation.

  • Scalability: Think about scale at multiple levels. Everything was either already horizontally scalable, or the path to get to horizontal scalability had been thought out
  • Automation: Automate actions, if you have 2 of something now, pretend you have 1000 of them instead and develop the solution to fit
  • Infrastructure as code: All infrastructure configuration was contained somehow in the deploy repository
  • Cloud agnostic: Things would be cloud-agnostic where possible, resulting in most things being in Kubernetes or using other external services
  • Own fewer things: Try to not create many new services or codebases, or take ownership of forks that should not exist, as this will become too much work

The one part of the above list that I want to dive into more in this post is infrastructure as code and how it worked for the multi-year lifespan of wbstack, before the move to wikibase.cloud.

What is infrastructure as code?

Infrastructure as code, or IaC, is the process of provisioning or managing infrastructure through code. This mostly means no manual processes or operations through the console or CLI; instead, things are defined in code and run from there.

Generally, infrastructure as code saves lots of time when managing a deployment and when redeploying it in another environment, as everything is documented one way or another in code and should be re-runnable in a new environment.

There are at least two recognized approaches to IaC, Imperative (Procedural) and Declarative (Functional).

  • Imperative: Define the steps to execute in order to reach the desired solution
  • Declarative: Define the desired state of the desired solution
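
The difference between the two styles can be sketched in a few lines of bash. This is only an illustration and is not taken from wbstack: a scratch directory stands in for real infrastructure, each subdirectory plays the role of one resource, and the names `imperative_setup`, `desired` and `reconcile` are made up for the example.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for real infrastructure: a scratch directory where each
# subdirectory plays the role of one resource (hypothetical names).
root="$(mktemp -d)"

# Imperative: spell out each step to run, in order.
imperative_setup() {
  mkdir "$root/sql"
  mkdir "$root/redis"
}

# Declarative: state what should exist, then let a reconcile step work
# out which actions are needed to get there.
desired=(sql redis queryservice)

reconcile() {
  local name path
  # Create anything that is desired but missing.
  for name in "${desired[@]}"; do
    if [ ! -d "$root/$name" ]; then
      mkdir "$root/$name"
      echo "created $name"
    fi
  done
  # Remove anything that exists but is no longer desired.
  for path in "$root"/*/; do
    [ -d "$path" ] || continue
    name="$(basename "$path")"
    if [[ " ${desired[*]} " != *" $name "* ]]; then
      rmdir "$path"
      echo "removed $name"
    fi
  done
}

imperative_setup   # creates sql and redis
reconcile          # only has to create queryservice
reconcile          # already in the desired state, so nothing happens
```

Running `imperative_setup` a second time would fail because the directories already exist, while `reconcile` can be run as often as you like. That is roughly the property the declarative tools give you for free.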

Different tools generally target different IaC approaches.

Chef and Ansible encourage a procedural style where you write code that specifies, step-by-step, how to achieve some desired end state. Terraform, CloudFormation, SaltStack, and Puppet all encourage a more declarative style where you write code that specifies your desired end state, and the IaC tool itself is responsible for figuring out how to achieve that state.

Why we use Terraform and not Chef, Puppet, Ansible, SaltStack, or CloudFormation

The main benefits of IaC are:

  • Integration with version control leads to trackable, auditable infrastructure changes and easy rollbacks
  • Automation in the form of CI/CD can easily be added as part of the delivery process
  • Ensure repeatability and eliminate configuration drift
  • Introduce standardizations across infrastructure conveniently (names, tags, settings)
  • Create modular infrastructure that can be used across projects
  • Increase deployment speed and reduce errors

And many of these benefits help individuals or small teams manage complex systems without getting into a mess.

The wbstack deploy repo

The deployment of wbstack has always been controlled in git, but the first commit came to the deploy repository when wbstack was fully open-sourced a year or so ago. Before that time it was controlled in a private git repository which remains private today.

The open-sourcing was partly to enable the creation of wikibase.cloud by Wikimedia Deutschland, taking advantage of the repeatability aspect of IaC. The whole infrastructure setup was and is possible to see in the git repository, including all changes that happened since the time of open sourcing.

Taking a quick look at the repository, we can see that it is separated by infrastructure or method of controlling infrastructure:

  • docs: General documentation for the repository and infrastructure
  • gce: Google Cloud specific infrastructure (legacy)
  • k8s: Kubernetes infrastructure (which was running on Google Kubernetes Engine)
  • tf: Terraform code for defining infrastructure (would replace the gce directory)

This repository ended up being a mixture of imperative and declarative infrastructure as code, entirely running with no automation around testing or deployments (no CI/CD) but using a variety of tools to do so.

  • The gce directory mainly contains subdirectories for parts of Google Cloud infrastructure, such as DNS or disk snapshots, each containing one or more mostly idempotent scripts for the creation / continued management of this part of the infrastructure.
  • The tf directory was slowly replacing these less ideal scripts that had existed for years with Terraform definitions for the resources being created, but at the time of writing only monitoring and checks are covered.
  • The k8s directory contains subdirectories for definitions which are groups of Kubernetes resources that can be applied to a cluster, alongside helmfile which contains charts and deployments of those charts for deployment on a cluster.
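
To make "mostly idempotent scripts" a little more concrete, here is a minimal sketch of the check-before-create pattern such scripts tend to follow. It is not one of the real gce scripts: a temporary file stands in for the cloud API, and `resource_exists`, `create_resource`, `ensure_resource` and the resource name `dns-zone-example` are all hypothetical. A real script would call the cloud provider's CLI inside the first two functions.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for a cloud API: one line per existing resource.
# A real script would call the provider's CLI in these two functions.
state_file="$(mktemp)"

resource_exists() {
  grep -qxF "$1" "$state_file"
}

create_resource() {
  echo "$1" >> "$state_file"
}

# Check before creating, so running the script twice is harmless.
ensure_resource() {
  local name="$1"
  if resource_exists "$name"; then
    echo "$name already exists, skipping"
  else
    create_resource "$name"
    echo "$name created"
  fi
}

ensure_resource "dns-zone-example"   # first run creates the resource
ensure_resource "dns-zone-example"   # second run changes nothing
```

The "mostly" matters: a check like this only covers existence, so a resource that exists with the wrong settings is left alone. That gap is one of the things a declarative tool such as Terraform is meant to close.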

Some of the scripts that didn’t make their way out of the private wbstack repository were the scripts used to tie all of this together. An example would be this script below which tied together multiple helmfile deployed services in a single script. At a higher level, all of these scripts used to be tied together and you could run one sync command to apply the repository to the infrastructure.

```bash
#!/usr/bin/env bash
cd ./sql && helmfile apply && cd ..
cd ./queryservice && helmfile apply && cd ..
cd ./redis && helmfile apply && cd ..
```
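
The higher-level sync command never left the private repository, so the following is only a guess at its general shape, not the original. It generalises the script above into a loop that stops at the first failure. The function name `sync_all`, the `APPLY_CMD` override and the `sync.sh` filename are assumptions made for this sketch.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Command run inside each component directory. Overridable for a dry run,
# e.g. APPLY_CMD="echo would apply" (APPLY_CMD is a made-up variable name).
apply_cmd="${APPLY_CMD:-helmfile apply}"

# Apply every component directory passed as an argument, in order.
sync_all() {
  local dir
  for dir in "$@"; do
    echo "==> $dir"
    # Subshell keeps the cd local; stop at the first failure.
    # $apply_cmd is left unquoted on purpose so it splits into words.
    ( cd "./$dir" && $apply_cmd ) || return 1
  done
}

# Run only when directories are given, e.g. ./sync.sh sql queryservice redis
if [ "$#" -gt 0 ]; then
  sync_all "$@"
fi
```

Saved as `sync.sh`, it could be tried without touching a cluster via `APPLY_CMD="echo would apply" ./sync.sh sql queryservice redis`. Running each step in a subshell also avoids the `cd ..` bookkeeping of the original.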

Key takeaways

The key takeaways from a couple of years working in this repository:

Having everything in git gave me the confidence to single-handedly run the platform, and leave it for multiple months before coming back and easily being able to remember the state of things. Since open sourcing, the longest the deploy repo remained untouched was 2 months, but before open-sourcing this was 6 months.

Having the infrastructure as code made onboarding people into the system ahead of the creation of wikibase.cloud much easier. I didn’t need to try to remember how things were set up, or talk through operations that had been run a year before. Instead, I could just point at the commit.

I could have invested some time back at the start in more CI and CD for wbstack, and moved more towards a GitOps workflow. However, at the time I thought that this would have detracted from me actually working on the platform itself, as I would have been focusing on the infrastructure instead. Even looking back years later I think I made the right decision. Managing IaC without CI/CD as an individual was no real extra effort, and I would certainly have done less platform development, or just have more things to maintain (CI/CD) now, if I’d chosen otherwise.

If I were to continue maintaining wbstack.com I would certainly move toward a GitOps approach, starting with something on the edge of the wbstack deployment, quickstatements perhaps? But at any hint that the infrastructure code or CI/CD code was starting to overwhelm my context or time, I’d probably rethink things again.

So this GitOps thing?

“GitOps” concept born in 2017 to treat “Infrastructure as code” the same way as “Application as Code”. GitOps eliminates the flaws of “IaC”.

GitOps vs Infrastructure as Code on UnixArena

GitLab describes GitOps as IaC + MRs + CI/CD. As I don’t have CI/CD, and I don’t bother with PRs/MRs because I am working by myself, I decided that I am not doing GitOps, only IaC.

But in a world where you only work as an individual and thus don’t need PRs/MRs for code review, and also don’t have CI/CD but do have a single command to deploy the entire repository state to the infrastructure, are you perhaps doing some form of individual, manual GitOps? At least in principle.

One part that is perhaps missing is the CI on PRs/MRs?

I wanted to dive into this topic more in this post, but the summary of the IaC and the wbstack deploy repo took quite some time and words…
