Toward the end of 2020 I spent some time blackbox testing data load times for WDQS and Blazegraph to try and find out which possible setting tweaks might make things faster.
I didn’t come to any major conclusions as part of this effort but will write up the approach and data nonetheless incase it is useful for others.
I expect the next step toward trying to make this go faster would be via some whitebox testing, consulting with some of the original developers or with people that have taken a deep dive into the code (which I started but didn’t complete).
WBStack is now in its 7th month with 76 user accounts who have created 226 MediaWiki sites running Wikibase, of which 145 are currently online (81 deleted sites). 295,000 edits have now been made in total, which is an increase of 95,000 in the last month, which roughly equates to 2 edits a minute for the month.
WBStack currently runs on a Google Cloud Kubernetes cluster made up of 2 virtual machines, one e2-medium and one e2-standard-2. This adds up to a current total of 4 vCPUs and 12GB of memory. No Google specific services make up any part of the core platform at this stage meaning WBStack can run wherever there is a Kubernetes cluster with little to no modification.
A simplified overview of the internals can be seen in the diagram below where blue represents the Google provided services, with green representing everything running within the kubernetes cluster.
It’s been roughly 1 month since WBStack appeared online, and it’s time for a quick review of what has been happening in the first month. If you don’t already know what WBStack is, then head to my introduction post.
The number of users and wikis has slowly been increasing. In my last post I stated ” 20 users on the project with 30 Wikibase installs”. 3 weeks after that post WBStack now sits at roughly 38 users with roughly 65 wikibases. Many of these wikibases are primarily users test wikis, but that’s great, the barrier to trying out Wikibase is definitely lowered.
If you would like an invite code to try WBStack, or have any related thoughts of ideas, then please get in touch.
As WBStack is a shared platform, all changes mentioned in this blog post are immediately visible on all hosted Wikibases. In the future there will be various options to turn things on and off, but at this early stage things are being kept simple.
WBStack is a project that I have been working on for a couple of years that finally saw the light of day at Wikidatacon 2019. It has gone through a couple of different names along the way, MWaas, WBaas, WikWiki, OpenCura and finally WBStack.
The idea behind the project is to provide Wikibase and surrounding services, such as a blazegraph query service, query service ui, quick statements, and others on a shared platform where installs, upgrades and maintenance are handeled centrally.
Many users of Wikibase find themselves in a position where they need to change the concept URI of an existing Wikibase for one or more reasons, such as a domain name update or desire to have https concept URIs instead of HTTP.
Below I walk through a minimal example of how this can be done using a small amount of data and the Wikibase Docker images. If you are not using the Docker images the steps should still work, but you do not need to worry about copying files into and out of containers or running commands inside containers.
I recently updated updated the Wikibase registry from Mediawiki version 1.30 to 1.31 and described the process in a recent post, so if you want to see what the current setup and docker-compose file looks like, head there.
As a summary the Wikibase Registry uses:
The wikibase/wikibase:1.31-bundle image from docker hub
The installation creation process is documented in this blog post, and some customization regarding LocalSettings and extensions was covered here. The current state of the docker-compose file can be seen below with private details redacted.
This docker-compose files is found in /root/wikibase-registry on the server hosting the installation. (Yes I know that’s a dumb place, but that’s not the point of this post)
Wikidata.org runs on MediaWiki with the Wikibase extension. But there is more to it than just that. The Wikibase extension itself is split into 3 different sections, being Lib, Repo and Client. There are also 6 other extensions all providing extra functionality to the site and it’s sisters. The extensions are also loaded on a different combination of Clients (such a Wikipedia) and the Repo itself (wikidata.org).