2022 – Page 2 – addshore

Modifying default width of WordPress pages using new themes like Blockbase

June 1, 2022 by addshore

I switched to the Blockbase WordPress theme a few weeks ago as it supports full-site editing, which brings blocks to all parts of your site rather than just posts and pages.

I found the content to be quite narrow out of the box at 620px.

Screenshot of the default content width (620px) out of the box with the Blockbase theme, from my 4K display

I really wanted the default view for folks with wider screens such as myself to be a little wider, so I went on an adventure to change this.

Wikidata query service updater evolution

April 12, 2022 by addshore

The Wikidata Query Service (WDQS) sits in front of Wikidata and provides access to query its data via a SPARQL API. The query service itself is built on top of Blazegraph, but in many regards is very similar to any other triple store that provides a SPARQL API.
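To make "a SPARQL API" a little more concrete, here is a minimal sketch of building a query request against the public Wikidata endpoint using only the Python standard library. The function names and the User-Agent string are my own placeholders for illustration, not part of any official client.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_sparql_request(query: str, endpoint: str = WDQS_ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) a GET request for a SPARQL query."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    headers = {
        "Accept": "application/sparql-results+json",
        # Placeholder User-Agent; replace with something that identifies you.
        "User-Agent": "example-sparql-sketch/0.1",
    }
    return urllib.request.Request(url, headers=headers)


def run_sparql(query: str, endpoint: str = WDQS_ENDPOINT) -> list:
    """Send the query and return the result bindings (needs network access)."""
    with urllib.request.urlopen(build_sparql_request(query, endpoint)) as response:
        return json.load(response)["results"]["bindings"]
```

Passing a different `endpoint` should work the same way for a query service running alongside your own Wikibase, assuming it exposes the same SPARQL interface.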

In the early days of the query service (circa 2015), the service was only run by Wikidata, hence the name. However, as interest in and usage of Wikibase continued to grow, more people started running a query service of their own, for data in their own Wikibase. But you’ll notice most people still refer to it as WDQS today.

Whereas most core Wikibase functionality is developed by Wikimedia Deutschland, the query service is developed by the search platform team at the Wikimedia Foundation, with a focus on wikidata.org, but also a goal of keeping it useable outside of Wikimedia infrastructure.

The query service itself currently works as a whole application rather than just a database. Under the surface, this can roughly be split into 2 key parts:

Backend Blazegraph database that stores and indexes data
Updater process that takes data from a Wikibase and puts it in the database

This actually means that you can run your own query service, without running a Wikibase at all. For example, you can load the whole of Wikidata into a query service that you operate, and have it stay up to date with current events. In practice this is quite some work, with significant expense on storage and indexing, so I expect not many folks do this.

Over time the updater element of the query service has iterated through some changes. The updater packaged with Wikibase, as used by most folks outside of the Wikimedia infrastructure, is now 2 steps behind the updater used for Wikidata itself.

The updater generations look something like this:

HTTP API Recent Changes polling updater (used by most Wikibases)
Kafka based Recent Changes polling updater
Streaming updater (used on Wikidata)
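As a rough illustration of what "Recent Changes polling" means for the first generation, the sketch below builds the parameters for MediaWiki's `list=recentchanges` API and pulls the changed page titles out of a response. This is not the real updater's code; the function names are hypothetical, and a real updater also has to handle continuation, deletes and fetching the RDF for each entity.

```python
def recent_changes_params(since: str, limit: int = 50) -> dict:
    """Parameters for MediaWiki's list=recentchanges API, oldest change first."""
    return {
        "action": "query",
        "list": "recentchanges",
        "rcdir": "newer",      # walk forwards in time from `since`
        "rcstart": since,      # ISO 8601 timestamp of the last change seen
        "rcprop": "title|timestamp",
        "rclimit": str(limit),
        "format": "json",
    }


def changed_titles(response: dict) -> tuple:
    """Return (unique titles in first-seen order, timestamp of the newest change or None)."""
    changes = response.get("query", {}).get("recentchanges", [])
    # dict.fromkeys de-duplicates while keeping the first-seen order.
    titles = list(dict.fromkeys(change["title"] for change in changes))
    latest = changes[-1]["timestamp"] if changes else None
    return titles, latest
```

A polling loop would call the API with these parameters, update the database for each returned title, then use the returned timestamp as `since` for the next poll.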

Let’s take a look at a high-level overview of these updaters, what has changed and why. I’ll also be applying some pretty arbitrary / gut feeling scores to 4 categories for each updater.

Infrastructure as Code for wbstack deployments

April 5, 2022 (updated August 28, 2023) by addshore

This entry is part 12 of 12 in the series WBStack

For most of its life wbstack was a mostly one-man operation. While this certainly sped up the decision-making process around features, requests, communication and prioritization, I also had to maintain a complex and young project supporting hundreds of sites on the side of my regular 8-hour day job.

In order to ensure that I’d feel comfortable with this extra context, be able to support the platform for multiple years, have a platform that could grow and scale from day one, and also leave the future of the platform with as many possibilities as possible, I roughly followed a few principles throughout implementation and operation.

Scalability: Think about scale at multiple levels. Everything was either already horizontally scalable, or the path to get to horizontal scalability had been thought out
Automation: Automate actions. If you have 2 of something now, pretend you have 1000 of them instead and develop the solution to fit
Infrastructure as code: All infrastructure configuration was contained somehow in the deploy repository
Cloud agnostic: Things would be cloud-agnostic where possible, resulting in most things being in Kubernetes or using other external services
Own fewer things: Try to not create many new services or codebases, or take ownership of forks that should not exist, as this will become too much work

The one part of the above list that I want to dive into more in this post is infrastructure as code and how it worked for the multi-year lifespan of wbstack, before the move to wikibase.cloud.

WikiCrowd at 50k answers

April 2, 2022 (updated April 4, 2022) by addshore

In January 2022 I published a new Wikimedia tool called WikiCrowd.

This tool allows people to answer simple questions to contribute edits to Wikimedia projects such as Wikimedia Commons and Wikidata.

It’s designed to be able to deal with a wide variety of questions, but due to time constraints, the extent of the current questions covers Aliases for Wikidata, and Depict statements for Wikimedia Commons.

Wikidata maxlag, via the ApiMaxLagInfo hook

March 4, 2022 by addshore

Wikidata tinkers with the concept of maxlag that has existed in MediaWiki for some years in order to slow automated editing at times of lag in various systems.

Here you will find a little introduction to MediaWiki maxlag, and the ways that Wikidata hooks into the value, altering it for its needs.

Screenshot of the “Wikidata Edits” grafana dashboard showing increased maxlag and decreased edits

As you can see above, a high maxlag can cause automated editing to reduce or stop on wikidata.org.
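From the client side, this is roughly how a well-behaved bot might react. The sketch assumes the bot sends a `maxlag` parameter with its API requests and has already parsed the JSON response and headers; when lag is too high, MediaWiki answers with an error whose code is `maxlag` and a `Retry-After` header. The function name and the default wait are my own choices for illustration.

```python
def maxlag_wait_seconds(response: dict, headers: dict, default_wait: float = 5.0):
    """Return how long to wait before retrying, or None if this was not a maxlag rejection."""
    error = response.get("error", {})
    if error.get("code") != "maxlag":
        return None
    try:
        # Prefer the server's hint when it is present and numeric.
        return float(headers.get("Retry-After", default_wait))
    except ValueError:
        return default_wait
```

A bot would sleep for the returned number of seconds and then retry the same edit, which is why a sustained high maxlag shows up as a drop in automated edits.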

Altering a Gerrit change (git-review workflow)

February 25, 2022 by addshore

I don’t use git-review for Gerrit interactions. This is primarily because back in 2012/2013 I couldn’t get git-review installed, and someone presented me with an alternative that worked. Years later I realized that this was actually the documented way of pushing changes to Gerrit.

As a little introduction to what this workflow looks like, and as a comparison with git-review, I have created 2 overview posts on altering a Gerrit change on the Wikimedia Gerrit install. I’m not trying to convince you that either way is better, merely to show the similarities and differences, and what is happening behind the scenes.

Be sure to take a look at the other post “Altering a Gerrit change (git workflow)”

One prerequisite of this workflow is that you have git-review installed and a .gitreview file in your repository!

I’ll be taking a change from the middle of last year, rebasing it, making a change, and pushing it back for review. Fundamentally the 2 approaches do the same thing, just one (git-review) requires an external tool.
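To give a feel for the plain git side of that comparison, here is a small sketch that spells out the Gerrit refs and git commands involved in fetching a patchset, rebasing it, amending it and pushing it back for review. The change number, patchset, branch and remote name are hypothetical inputs, and the helper only builds the command lists rather than running them.

```python
def change_ref(change_number: int, patchset: int) -> str:
    """Gerrit ref for a patchset: refs/changes/<last two digits>/<change>/<patchset>."""
    return f"refs/changes/{change_number % 100:02d}/{change_number}/{patchset}"


def alter_change_commands(change_number: int, patchset: int,
                          branch: str = "master", remote: str = "origin") -> list:
    """Commands for fetching, rebasing, amending and re-pushing a Gerrit change."""
    return [
        ["git", "fetch", remote, change_ref(change_number, patchset)],
        ["git", "checkout", "FETCH_HEAD"],
        ["git", "rebase", f"{remote}/{branch}"],
        # Edit files here, then fold the edits into the existing commit
        # so that the Change-Id in the commit message is kept.
        ["git", "commit", "--all", "--amend", "--no-edit"],
        # Pushing to refs/for/<branch> creates a new patchset for review.
        ["git", "push", remote, f"HEAD:refs/for/{branch}"],
    ]
```

Each inner list could be passed to `subprocess.run` in order; the rebase step assumes the remote branch has been fetched recently.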

Altering a Gerrit change (git workflow)

February 25, 2022 (updated April 2, 2022) by addshore

Be sure to take a look at the other post “Altering a Gerrit change (git-review workflow)”

Small commits

February 24, 2022 by addshore

There are many blog posts and articles out there about making small git commits. I’m sure most people (including me) bring up the same few topics around why small commits are good and why we should all probably be making smaller commits.

In this post, I’ll look at some of the key topics from my perspective, and try to tie these topics to concrete examples from repositories that I have worked on. The topics are in no particular order, so be sure to give them all a read.

One thing to note is that “small” doesn’t necessarily mean small in terms of lines of code. Small here is also relative. Also, small commits can benefit you in many different places, but to stand the test of time they must end up in your main branch.

Git features during development

Git only takes full responsibility for your data when you commit
Commit Often, Perfect Later, Publish Once: Git Best Practices

Wikibase a history

February 15, 2022 by addshore

I have had the pleasure of being part of the Wikibase journey one way or another since 2013 when I first joined Wikimedia Germany to work on Wikidata. That long-running relation to the project should put me in a fairly good position to give a high-level overview of the history, from both a technical and higher-level perspective. So here it goes.

For those that don’t know, Wikibase is the code that powers wikidata.org and a growing number of other sites. If you want to know more, read about it on Wikipedia or the Wikibase website.

For this reason, a lot of the early timeline is quite heavy on the Wikidata side. There are certainly some key points missing, if you think they are worthy of mentioning then leave a comment or reach out!

Profiling a Wikibase item creation on test.wikidata.org

February 3, 2022 (updated April 2, 2022) by addshore

Today I was in a Wikibase Stakeholder group call, and one of the discussions was around Wikibase importing speed, data loading, and the APIs. My previous blog post covering what happens when you make a new Wikibase item was raised, and we also got onto the topic of profiling.

So here comes another post looking at some of the internals of Wikibase, through the lens of profiling on test.wikidata.org.

The tools used to write this blog post for Wikimedia infrastructure are both open source and public. You can do similar profiling either on your own Wikibase, or for requests that you suspect are slow on Wikimedia sites such as Wikidata.

Wikimedia Profiling

Profiling of Wikimedia sites is managed and maintained by the Wikimedia performance team. They have a blog, and one of the most recent posts was actually covering profiling PHP at scale in production, so if you want to know the details of how this is achieved give it a read.

Throughout this post I will be looking at data collected from a production Wikimedia request, by setting the X-Wikimedia-Debug header in my request. This header has a few options, and you can find the docs on wikitech.wikimedia.org. There are also browser extensions available to easily set this header on your requests.
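If you would rather script this than use a browser extension, setting the header is a one-liner in most HTTP clients. The sketch below uses the Python standard library; the function name is hypothetical, and the header value is deliberately left as an argument because the available options are described in the wikitech docs rather than here.

```python
import urllib.request


def debug_request(url: str, debug_options: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the X-Wikimedia-Debug header."""
    return urllib.request.Request(url, headers={"X-Wikimedia-Debug": debug_options})
```

Sending the request with `urllib.request.urlopen` would then ask for that single request to be handled with the debug options you chose.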

I will be using the Wikimedia hosted XHGui to visualize the profile data. Wikimedia specific documentation for this interface also exists on wikitech.wikimedia.org. This interface contains a random set of profiled requests, as well as any requests that were specifically requested to be profiled.

Profiling PHP & MediaWiki

If you want to profile your own MediaWiki or Wikibase install, or PHP in general, then you should take a look at the mediawiki.org documentation page for this. You’ll likely want to use either Tideways or Xdebug, but probably want to avoid having to set up any extra UI to visualize the data.

This profiling only covered the main PHP application (MediaWiki & Wikibase extension). Other services such as the query service would require separate profiling.