It’s been roughly 1 month since WBStack appeared online, and it’s time for a quick review of what has been happening in the first month. If you don’t already know what WBStack is, then head to my introduction post.
The number of users and wikis has slowly been increasing. In my last post I stated ” 20 users on the project with 30 Wikibase installs”. 3 weeks after that post WBStack now sits at roughly 38 users with roughly 65 wikibases. Many of these wikibases are primarily users test wikis, but that’s great, the barrier to trying out Wikibase is definitely lowered.
If you would like an invite code to try WBStack, or have any related thoughts of ideas, then please get in touch.
As WBStack is a shared platform, all changes mentioned in this blog post are immediately visible on all hosted Wikibases. In the future there will be various options to turn things on and off, but at this early stage things are being kept simple.
WBStack is a project that I have been working on for a couple of years that finally saw the light of day at Wikidatacon 2019. It has gone through a couple of different names along the way, MWaas, WBaas, WikWiki, OpenCura and finally WBStack.
The idea behind the project is to provide Wikibase and surrounding services, such as a blazegraph query service, query service ui, quick statements, and others on a shared platform where installs, upgrades and maintenance are handeled centrally.
Many users of Wikibase find themselves in a position where they need to change the concept URI of an existing Wikibase for one or more reasons, such as a domain name update or desire to have https concept URIs instead of HTTP.
Below I walk through a minimal example of how this can be done using a small amount of data and the Wikibase Docker images. If you are not using the Docker images the steps should still work, but you do not need to worry about copying files into and out of containers or running commands inside containers.
It’s been another 9 months since my last blog post covering the Wikidata generated geo location maps that I have been tending to for a few years now. Writing this from a hammock, lets see what has noticeably changed in the last 9 months using a visual diff and my pretty reasonable eyes.
In 2016 I wrote a blog post with this exact title when moving all of my pictures from Facebook to Google photos. I wrote a hacky little script which met my needs and added exif data from a HTML Facebook data dump back to the images that came along with it.
I recently updated updated the Wikibase registry from Mediawiki version 1.30 to 1.31 and described the process in a recent post, so if you want to see what the current setup and docker-compose file looks like, head there.
As a summary the Wikibase Registry uses:
The wikibase/wikibase:1.31-bundle image from docker hub
The installation creation process is documented in this blog post, and some customization regarding LocalSettings and extensions was covered here. The current state of the docker-compose file can be seen below with private details redacted.
This docker-compose files is found in /root/wikibase-registry on the server hosting the installation. (Yes I know that’s a dumb place, but that’s not the point of this post)
I recently encountered this error while trying to run one of my docker setups.
ERROR:formediawiki-docker-dev_db-slave_1 Cannot start service db-slave:b'OCI runtime create failed: container_linux.go:348: starting container process caused "exec: \"/tmp/mwdd/entrypoint.sh\": stat /tmp/mwdd/entrypoint.sh: no such file or directory": unknown'
I have encountered errors like this before and it has always ended up being related to docker and sharing my drives to the linux VM that actually runs my containers.
Checking the shared drives menu of the docker UI everything seemed to be fine.
However when removing the drive share and re sharing the drive I got an error message saying that there was a “Firewall detected” and that “A firewall is blocking file Sharing between Windows and the containers. See documentation for more info”.
Over the years diagrams have appeared in a variety of forms covering various areas of the architecture of Wikidata. Now, as the current tech lead for Wikidata it is my turn.
Wikidata has slowly become a more and more complex system, including multiple extensions, services and storage backends. Those of us that work with it on a day to day basis have a pretty good idea of the full system, but it can be challenging for others to get up to speed. Hence, diagrams!
All diagrams can currently be found on Wikimedia Commons using this search, and are released under CC-BY-SA 4.0. The layout of the diagrams with extra whitespace is intended to allow easy comparison of diagrams that feature the same elements.
Wikidata is accessed through a Varnish caching and load balancing layer provided by the WMF. Users, tools and any 3rd parties interact with Wikidata through this layer.
Off to the right are various other external services provided by the WMF. Hadoop, Hive, Ooozie and Spark make up part of the WMF analytics cluster for creating pageview datasets. Graphite and Grafana provide live monitoring. There are many other general WMF services that are not listed in the diagram.
Finally we have our semi persistent and persistent storages which are used directly by Mediawiki and Wikibase. These include Memcached and Redis for caching, SQL(mariadb) for primary meta data, Blazegraph for triples, Swift for files and ElasticSearch for search indexing.