Wikibase docker images

December 3, 2017 By addshore

This is a belated post about the Wikibase docker images that I recently created for Wikidata's 5th birthday. You can find the images on Docker Hub and the matching Dockerfiles on GitHub. Combined, these images let you quickly create docker containers for Wikibase backed by MySQL, with a SPARQL query service running alongside that updates live from the Wikibase install.

A setup was demoed at the first WikidataCon event in Berlin on the 29th of October 2017 and can be seen at roughly 41:10 in the "demo of presents" video.

The images

The ‘wikibase’ image is based on the new official mediawiki image hosted on the docker store. The only current version, which is also the version demoed, is for MediaWiki 1.29. This image contains MediaWiki running on PHP 7.1 served by Apache. Right now the image does some sneaky auto installation of the MediaWiki database tables, which might disappear in the future to make the image more generic.

The ‘wdqs’ image is based on the official openjdk image hosted on the docker store. This image also has only one version, the current latest version of the Wikidata Query Service, which is downloaded from Maven. The image can be used to run the Blazegraph service, and also to run an updater that reads the recent changes feed of a Wikibase install and adds the new data to Blazegraph.
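To make the updater's job a little more concrete, here is a minimal Python sketch of polling a recent changes feed through the standard MediaWiki API (`action=query&list=recentchanges`). This is not the real updater, which ships inside the wdqs image; the function names are hypothetical and the API URL is an assumption for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def recent_changes_url(api_url, since=None, limit=50):
    """Build a MediaWiki API URL that lists recent changes, oldest first."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "title|ids|timestamp",
        "rcdir": "newer",  # oldest first, so a poller can resume from the last timestamp
        "rclimit": str(limit),
        "format": "json",
    }
    if since is not None:
        params["rcstart"] = since  # ISO 8601 timestamp to start listing from
    return api_url + "?" + urlencode(params)


def changed_titles(api_response):
    """Return the distinct page titles in an API response, in first-seen order."""
    titles = []
    for change in api_response.get("query", {}).get("recentchanges", []):
        title = change["title"]
        if title not in titles:
            titles.append(title)
    return titles


def fetch_changed_titles(api_url, since=None):
    """Fetch one batch of changes; this needs a reachable wiki."""
    with urlopen(recent_changes_url(api_url, since)) as response:
        return changed_titles(json.load(response))
```

Against the compose example later in this post, a call might look like `fetch_changed_titles("http://localhost:8181/api.php")`, assuming the API is served at that path on your install.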

The ‘wdqs-frontend’ image hosts the pretty UI for the query service, served by nginx. This includes auto completion and pretty visualizations. There is currently an issue which means the image always serves the example queries for Wikidata, which will likely not work on your custom install.

The ‘wdqs-proxy’ image hosts an nginx proxy that restricts external access to the wdqs service, making it READONLY, and also enforces a time limit for queries (not currently configurable). This is very important: if the wdqs image is exposed directly to the world, people can also write to your Blazegraph store.

You’ll also need a MySQL server set up for Wikibase to use. You can use the default mysql or mariadb images for this, as covered in the example below.

All of the wdqs images should probably be renamed, as they are not specific to Wikidata (which is where the wd comes from). Right now, though, the underlying repos and packages have the wd prefix rather than a wb prefix (for Wikibase), so we will stick with it.

Compose example

The example below configures volumes for all locations with data that should or could persist. Wikibase is exposed on port 8181, the query service UI on 8282 and the query service itself (behind the proxy) on 8989.

Each service has a network alias defined. That probably isn’t needed in most setups, but while running on WMCS it was required to get around some bad name resolution.

version: '3'

services:
  # MediaWiki with Wikibase, published on host port 8181
  wikibase:
    image: wikibase/wikibase
    restart: always
    links:
      - mysql
    ports:
      - "8181:80"
    volumes:
      - mediawiki-images-data:/var/www/html/images
    depends_on:
      - mysql
    networks:
      default:
        aliases:
          - wikibase.svc
  # Database for MediaWiki
  mysql:
    image: mariadb
    restart: always
    volumes:
      - mediawiki-mysql-data:/var/lib/mysql
    environment:
      MYSQL_DATABASE: 'my_wiki'
      MYSQL_USER: 'wikiuser'
      MYSQL_PASSWORD: 'sqlpass'
      MYSQL_RANDOM_ROOT_PASSWORD: 'yes'
    networks:
      default:
        aliases:
          - mysql.svc
  # Query service UI, published on host port 8282
  wdqs-frontend:
    image: wikibase/wdqs-frontend
    restart: always
    ports:
      - "8282:80"
    depends_on:
      - wdqs-proxy
    networks:
      default:
        aliases:
          - wdqs-frontend.svc
  # Blazegraph; no host port is published, access goes through wdqs-proxy
  wdqs:
    image: wikibase/wdqs
    restart: always
    build:
      context: ./wdqs/0.2.5
      dockerfile: Dockerfile
    volumes:
      - query-service-data:/wdqs/data
    command: /runBlazegraph.sh
    networks:
      default:
        aliases:
          - wdqs.svc
  # Proxy in front of the query service, published on host port 8989
  wdqs-proxy:
    image: wikibase/wdqs-proxy
    restart: always
    environment:
      - PROXY_PASS_HOST=wdqs.svc:9999
    ports:
      - "8989:80"
    depends_on:
      - wdqs
    networks:
      default:
        aliases:
          - wdqs-proxy.svc
  # Same image as wdqs, but running the updater instead of Blazegraph
  wdqs-updater:
    image: wikibase/wdqs
    restart: always
    command: /runUpdate.sh
    depends_on:
      - wdqs
      - wikibase
    networks:
      default:
        aliases:
          - wdqs-updater.svc

volumes:
  mediawiki-mysql-data:
  mediawiki-images-data:
  query-service-data:
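
Once the containers are running, the proxied query service can be queried over HTTP like any other SPARQL endpoint. The following is a minimal Python sketch, assuming the proxy on port 8989 forwards requests to Blazegraph's `/bigdata/namespace/wdq/sparql` path. That path and the helper names are assumptions, so adjust them to match your setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint: host port 8989 from the compose file plus an assumed Blazegraph path.
ENDPOINT = "http://localhost:8989/bigdata/namespace/wdq/sparql"


def build_sparql_request(query, endpoint=ENDPOINT):
    """Build a GET request for a SPARQL query, asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the result bindings; this needs the stack to be running."""
    with urlopen(build_sparql_request(query, endpoint)) as response:
        return json.load(response)["results"]["bindings"]
```

For example, `run_query("SELECT * WHERE { ?s ?p ?o } LIMIT 5")` should return up to five triples from your own store. Going through the proxy rather than straight to the wdqs container keeps the request read-only.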

Questions

I’ll vaguely keep this section up to date with Qs & As, but if you don’t find your answer here, leave a comment, send an email or file a Phabricator ticket.

Can I use these images in production?

I wouldn’t really recommend running any of these in ‘production’ yet, as they are new and not well tested. Various things, such as upgrades for the query service and for MediaWiki / Wikibase, are also not yet documented very well.

Can I import data into these images from an existing wikibase / wikidata? (T180216)

In theory, although this is not documented. You’ll have to import everything using an XML dump of the existing MediaWiki install, and the configuration will have to match on both installs. When importing from an XML dump the query service will not be updated automatically, and you will likely have to read the manual.

Where was the script that you ran in the demo video?

There is a copy in the GitHub repo called setup.sh, but I can’t guarantee it works in all situations! It was made specifically for a WMCS Debian Jessie VM.

Links