Wikibase ecosystem in Q1 2025, according to wikibase.world

I wrote a post in February 2025 looking at what the Wikibase ecosystem (might) look like, according to the data that had at that point been collected on wikibase.world. Now that data has had some time to evolve and expand, we can take a little look at how it has changed throughout the last 2 months.

In the future, I’ll try to remember to write something up every quarter or so (for now), until someone else feels like taking this over ;)

The latest notebooks for generating this are in git, and the latest XML file dumped from wikibase.world is on archive.org.

Site count and status

We have gone from tracking 777 sites, up to 873, so an increase of nearly 100 in 2 months.

However, we need to look at the tracked status to determine how big the current ecosystem actually might be. So I added a little extra counting to the notebook previously used to count the wikis based on P13 (availability status).

Back in Feb, there were 774 online, and only 3 marked as offline. This was primarily because Addbot was not often marking sites as offline. Since then, I have added automatic detection of deleted sites for wikibase.cloud and gone through and checked a bunch of sites that the scripts were failing to look up.

Looking at the April data, we have ~847 online and ~26 offline, so the share of sites marked offline has risen to around 3%.
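As a quick sanity check on these numbers, the shares can be worked out directly. This is only a minimal sketch using the counts quoted above; the function name `status_share` is made up for illustration and is not taken from the notebook.

```python
def status_share(online: int, offline: int) -> tuple[float, float]:
    """Return (online %, offline %) of all tracked sites."""
    total = online + offline
    return 100 * online / total, 100 * offline / total


# Counts quoted in the text for February and April 2025
feb = status_share(774, 3)
apr = status_share(847, 26)

# Absolute growth in the number of sites marked online
online_growth = 847 - 774

print(f"Feb offline share: {feb[1]:.1f}%")
print(f"Apr offline share: {apr[1]:.1f}%")
print(f"Extra online sites: {online_growth}")
```

The online and offline counts for April add up to the 873 tracked sites, which is a useful cross-check that no site is missing a status.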

Graph

Most of the growth in sites seems to come from wikibase.cloud. However, many sites on wikibase.cloud are test sites and may not have much content.

So when displaying the graph this time, I’ll filter out everything that doesn’t have a highest Item ID of at least 25. This roughly cuts the size of the graph in half.
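The filter described here could look something like the following. It is a sketch under assumptions: the record layout and the `highest_item_id` key are hypothetical stand-ins for however the notebook actually stores this value, and sites with no known value are treated as below the threshold.

```python
MIN_HIGHEST_ITEM_ID = 25  # threshold mentioned in the text


def filter_sites(sites, minimum=MIN_HIGHEST_ITEM_ID):
    """Keep only sites whose highest Item ID is known and >= minimum."""
    return [s for s in sites if (s.get("highest_item_id") or 0) >= minimum]


# Hypothetical sample records, not real wikibase.world data
sites = [
    {"name": "example-a", "highest_item_id": 3},
    {"name": "example-b", "highest_item_id": 25},
    {"name": "example-c", "highest_item_id": None},
    {"name": "example-d", "highest_item_id": 120000},
]

kept = filter_sites(sites)
print([s["name"] for s in kept])
```

Treating a missing value as zero is a design choice: it drops sites that could not be inspected, which matches the goal of hiding likely test sites from the graph.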


Smart Home: A fleet of Temperature and Humidity Sensors

One of the easiest ways to get myself into the Zigbee life without needing to worry too much about exactly what I was doing, buying or what my goals were was to buy a set of Temperature and Humidity Sensors for every room of the house.

After a tiny amount of research and some discussion among friends, I settled on a fleet of Aqara sensors. These work well with Home Assistant, use batteries that I already have many of and want to use up, are visually appealing, and can be bought in bulk on AliExpress.

For a hub, I went for the SMLIGHT SLZB-06 that has some good reviews in terms of flexibility and openness, as well as allowing use via wired network, Wi-Fi or even USB.

Everything was extremely easy to set up: I wired the hub into the network and it appeared in Home Assistant running on my Raspberry Pi 4, then I paired each of the sensors with the hub and they appeared within Home Assistant.


Wikidata user and project talk page connection graph

Talk pages are a pretty key part of how wikis have worked over the years. Real-time chat apps and services are probably changing this dynamic somewhat, but talk pages are still used, and most of their history is still recorded.

I started up an IPython Notebook to try and take a look at some of the connections between different users on Wikidata over the years. Below you’ll find a few representations of these connections, as well as notable things I spotted along the way, the generating code, SQL query and more!

The data

MediaWiki maintains links tables for all pages, so getting all of the current links out of Wikidata is very easy. I made use of the Wikimedia Cloud Quarry service to run this query and host a CSV of the results.

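-- One row per distinct pair of linked talk pages, with subpages folded into their root page
SELECT
  SUBSTRING_INDEX(page_title, '/', 1) AS t1,  -- source page, subpage suffix dropped
  pl_from_namespace AS t1ns,
  SUBSTRING_INDEX(pl_title, '/', 1) AS t2,    -- target page, subpage suffix dropped
  pl_namespace AS t2ns
FROM pagelinks, page
-- Namespace 3 is User talk, namespace 5 is Project talk
WHERE pl_namespace IN (3,5) AND pl_from_namespace IN (3,5)
-- Join each link to its source page and skip links to a page with the same title
AND page_id = pl_from AND page_title != pl_title
GROUP BY t1, t2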

I then loaded this data directly into an IPython Notebook and did some cleaning, such as removing all IP addresses. I then spent quite some time applying more filtering and twiddling knobs to try and get some graphics out that are easy to look at. The first attempts looked like solid blobs as you can see in this tweet.
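The IP address cleaning step could be sketched roughly as below. This is not the code from the notebook: it assumes the CSV has the column names from the query above (`t1`, `t1ns`, `t2`, `t2ns`), the helper names `is_ip` and `load_edges` are made up for illustration, and the sample rows are invented.

```python
import csv
import io
import ipaddress


def is_ip(title: str) -> bool:
    """Return True if a page title parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(title)
        return True
    except ValueError:
        return False


def load_edges(csv_file):
    """Read (t1, t2) link pairs, dropping any pair that involves an IP address."""
    reader = csv.DictReader(csv_file)
    return [
        (row["t1"], row["t2"])
        for row in reader
        if not is_ip(row["t1"]) and not is_ip(row["t2"])
    ]


# Invented sample data in the shape of the query result
sample = io.StringIO(
    "t1,t1ns,t2,t2ns\n"
    "Alice,3,Bob,3\n"
    "192.0.2.1,3,Bob,3\n"
    "Alice,3,2001:DB8:0:0:0:0:0:1,3\n"
    "Some_project_page,5,Alice,3\n"
)

edges = load_edges(sample)
print(edges)
```

Using the standard library `ipaddress` module avoids hand-written regular expressions and covers both IPv4 and IPv6 titles.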

You can find a copy of the Notebook on notebooksharing.space.


Language usage on Wikidata

Wikidata is a multilingual project, but due to the size of the project it is hard to get a view on the usage of languages.

For some time now, the Wikidata dashboards have existed on the Wikimedia Grafana install. These dashboards contain data about the language content of the data model by looking at terms (labels, descriptions and aliases), as well as data about the language distribution of the active community.

For reference, the dashboards used are:

All data below was retrieved on 1 February 2016.


The break in Wikidata edits on 28 Jan 2016

On 28 January 2016, all Wikimedia MediaWiki APIs had two short outages. The outage is documented on Wikitech. The outage didn’t have much of an impact on most projects hosted by Wikimedia. However, due to most Wikidata editing happening through the API, even when using the UI, the project basically stopped for roughly …