During the first 3 months of 2021, some Wikimedia Deutschland engineers, from the Wikidata / Wikibase team, spent some time working on WBStack as part of an effort to explore the WBaaS (Wikibase as a service) topic during the year, as outlined by the development plan.
We want to make it easier for non-Wikimedia projects to set up Wikibase for the first time and to evaluate the viability of Wikibase as a Service.
This has lead to a few new Wikibase features being exposed through the WBStack dashboard for sites that run on the platform. These features are primarily features developed by the Wikibase team in 2020 and 2021. The work also brought some other quality of life improvements for the settings pages.
Here is a quick rundown of what’s new and improved.
WBStack is a platform allowing shared scalable hosting of Wikibase and surrounding services.
A year ago I made an initial post covering the state of WBStack infrastructure. Since then some things have changed, and I have also had more time to create a clear diagram. So it is time for the 2021 edition.
I have been thinking about writing a Twitter bot for some time and decided to copy this pattern running a cron based Twitter bot on Github Actions, with an added bit of free persistence using jsonstorage.net.
This is the question that I asked a group of around 50 people this evening as part of a food for thought exercise. 37 replied before I decided to write this blog post, and here is a quick summary…
The question was “If you bake 2 lasagnes, and then put one on top of the other, do you have 1 or 2 lasagnes?”
The answers ranged dramatically, and the group was fairly divided, but slightly leaning toward an answer of 1, with ~24 saying that you have 1 lasagne, and ~10 saying 2 lasagnes. Some of the answers were a bit vague and hard to stick into one of these categories.
VSCode seems to be one of the up and coming IDEs over the last year. I personally switched from Jetbrains IDEs to VSCode fo most of my development work at some point in 2020.
Apparently up until now I have avoided running the PHP debugger Xdebug. I had to do a little search around to figure out the correct settings for my setup decided to write them down in this handy blog post.
dockerit is a small CLI tool that I have been working on during the start of 2021. It’s intended to make running one off commands and CLI tooling easier in docker. Rather than having to write a long string of parameters for docker run, instead you can just use dockerit. This applies to both CLI usage, but also via bash aliases.
Github provided some advice for renaming branches focused around how this can be done in the UI. But with the Github CLI tool that was also release at the end of 2020 you can programatically make this change too.
The Github CLI tool is called gh. The README for the tool has installation instructions for a variety of platforms. One of the commands that it enables is called api, which can be used to “Make an authenticated GitHub API request”.
The Github API obviously allows you to perform most actions that can be performed by the UI, this we can use the gh cli tool to rename the branches of repositories.
For example, for the addwiki/addwiki repository, we can rename the master to main branch using the following command:
gh api --method POST repos/addwiki/addwiki/branches/master/rename --field new_name='main'
And you can use such a command in whatever loops etc you want to, to quickly rename a whole bunch of branches in a whole bunch of repositories.
Toward the end of 2020 I spent some time blackbox testing data load times for WDQS and Blazegraph to try and find out which possible setting tweaks might make things faster.
I didn’t come to any major conclusions as part of this effort but will write up the approach and data nonetheless incase it is useful for others.
I expect the next step toward trying to make this go faster would be via some whitebox testing, consulting with some of the original developers or with people that have taken a deep dive into the code (which I started but didn’t complete).
One of the biggest sticking points for me, being fairly new with the Golang world, was trying to pass stdin stdout and stderr between the container and host terminal correctly, while also having good performance and doing the expected things (like Ctrl+C to cancel).
The full code for setting up and, interacting with and removing my container can be found here. The main steps are broken down below.