Wikidata Map in 2025

This entry is part 17 of 17 in the series Wikidata Map

Another year, another map, and another birthday for Wikidata. The map was last generated in 2024 by @tarrow and @outdooracorn; this year I have put the work in just ahead of the 13th Wikidata birthday to look at what has changed over the past year in terms of items with coordinates on Wikidata.

And here it is!

But really you need to look at the diff between previous years to see what has changed!

Read more

Wikidata, instance of and subclass of through time (P31 & P279)

Last month I looked at all Wikimedia Commons revisions and managed to generate some data and graphs for the usage of depicts statements since they were introduced on the project.

This month, I have applied the same analysis to Wikidata, but looking at instance of and subclass of statements. It is a slightly bigger data set, but essentially the same process.

This will enable easy updating of the various pie charts that have been published over the years.

In future, this could be easily adapted to show per-Wikipedia-project graphs, such as those that are currently at Wikidata:Statistics/Wikipedia.

Method

The details of the method can be seen in code in my previous post about depicts statements, and this mostly stays the same.

In words:

  • Look at every revision of Wikidata ever
  • Parse the JSON to determine what values there are for P31 and P279 for each revision
  • Find the latest revision of each item in each given month, and thus find the state of all items in that month
  • Plot the data by number of items that are P31 or P279 of each value item
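
The middle two steps can be sketched roughly as below. This is a minimal illustration rather than the code from the earlier post: the record shape (item ID, ISO timestamp, claims dictionary), the sample revisions, and the function names are all assumptions, and it assumes an item with no edit in a given month keeps the state of its most recent earlier revision.

```python
from collections import Counter

# Hypothetical, simplified revision records: (item_id, ISO timestamp, claims),
# where claims maps a property ID to a list of value item IDs.
revisions = [
    ("Q1", "2025-06-03T10:00:00Z", {"P31": ["Q5"]}),
    ("Q1", "2025-06-20T09:00:00Z", {"P31": ["Q5"], "P279": ["Q35120"]}),
    ("Q2", "2025-06-11T12:00:00Z", {}),
    ("Q2", "2025-07-02T08:00:00Z", {"P31": ["Q5"]}),
]


def state_for_month(revisions, month):
    """Return {item_id: claims} from each item's latest revision up to `month` (YYYY-MM)."""
    latest = {}
    # ISO timestamps in one consistent format sort correctly as strings.
    for item_id, timestamp, claims in sorted(revisions, key=lambda r: r[1]):
        if timestamp[:7] <= month:
            latest[item_id] = claims  # later revisions overwrite earlier ones
    return latest


def count_values(state, properties=("P31", "P279")):
    """Count how many items point at each value item via the given properties."""
    counts = Counter()
    for claims in state.values():
        for prop in properties:
            counts.update(claims.get(prop, []))
    return counts


june = count_values(state_for_month(revisions, "2025-06"))
july = count_values(state_for_month(revisions, "2025-07"))
print(june, july)
```

Note that an item with several P31 or P279 values contributes to several counts here, which matches the double counting described in the list of defects.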

There are some minor defects to this logic currently that could be cleaned up with future iterations:

  • Deleted items will continue being counted, as I don’t take into account the point at which items are deleted
  • Things will be double counted in this data, as one item may have multiple P31 and P279 values, and I don’t try to join these into higher-level concepts at all

We make an OTHER and UNALLOCATED count as part of the final data summarization. OTHER accounts for things that have not made it into the top 20 items by count, and UNALLOCATED means that we didn’t have a P31 or P279 value in the latest revision.
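
As a rough sketch of that summarization step (not the notebook's actual code), the function below keeps the top N values by count, folds the rest into OTHER, and counts items with no value as UNALLOCATED. The function name, the input shape, and the example item IDs are assumptions for illustration.

```python
from collections import Counter


def summarize(item_values, top_n=20):
    """Summarize one month of data.

    item_values is a hypothetical mapping {item_id: [value item IDs]} holding
    the combined P31 and P279 values from each item's latest revision.
    """
    counts = Counter()
    unallocated = 0
    for values in item_values.values():
        if values:
            counts.update(values)
        else:
            unallocated += 1  # no P31 or P279 value in the latest revision

    top = counts.most_common(top_n)
    summary = dict(top)
    # Everything that did not make the top N is grouped together.
    summary["OTHER"] = sum(counts.values()) - sum(n for _, n in top)
    summary["UNALLOCATED"] = unallocated
    return summary


example = {
    "Q1": ["Q5"],
    "Q2": ["Q5", "Q215627"],
    "Q3": ["Q13442814"],
    "Q4": [],
}
print(summarize(example, top_n=1))
```

With the real data `top_n` would stay at 20; it is set to 1 here only so the tiny example produces a non-zero OTHER slice.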

2025

For August 2025 (or at least part way through it), this is the current state of Wikidata per the above method.

You can now find a PNG of this pie chart on Wikimedia Commons https://commons.wikimedia.org/wiki/File:Wikidata_P31_%26_P279_analysis_August_2025.png

Read more

Wikibase ecosystem in Q1 2025, according to wikibase.world

I wrote a post in February 2025 looking at what the Wikibase ecosystem (might) look like, according to the data that had at that point been collected on wikibase.world. Now that data has had some time to evolve and expand, we can take a little look at how it has changed throughout the last 2 months.

In the future, I’ll try to remember to write something up every quarter or so (for now), until someone else feels like taking this over ;)

The latest notebooks for generating this are in git, and the latest XML file dumped from wikibase.world is on archive.org.

Site count and status

We have gone from tracking 777 sites, up to 873, so an increase of nearly 100 in 2 months.

However, we need to look at the tracked status to determine how big the current ecosystem actually might be. So I added a little extra counting to the notebook previously used to count the wikis based on P13 (availability status).

Back in Feb, there were 774 online, and only 3 marked as offline. This was primarily because Addbot was not often marking sites as offline. Since then, I have added automatic detection of deleted sites for wikibase.cloud, and went through and checked a bunch of sites that the scripts were failing to look up.

Looking at the April data, we have ~847 online and ~26 offline, so around 3% of tracked sites are now marked as offline (up from under 1% in Feb).

Graph

Most of the growth in sites seems to come from wikibase.cloud, however many sites on wikibase.cloud are test sites and may not have much content.

So when displaying the graph this time, I’ll filter out everything that doesn’t have a highest Item ID of at least 25. This roughly cuts the size of the graph in half.
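
A minimal sketch of that filter is below. The site records, the `highest_item_id` field name, and the host names are made up for illustration; the real notebook works from the wikibase.world dump and may treat missing values differently.

```python
# Hypothetical site records; highest_item_id is None when it is unknown.
sites = [
    {"host": "test-one.example", "highest_item_id": 3},
    {"host": "archive.example", "highest_item_id": 25},
    {"host": "museum.example", "highest_item_id": 18234},
    {"host": "unknown.example", "highest_item_id": None},
]


def has_enough_items(site, minimum=25):
    """Keep a site only if its highest Item ID is known and at least `minimum`."""
    highest = site.get("highest_item_id")
    return highest is not None and highest >= minimum


graph_sites = [site["host"] for site in sites if has_enough_items(site)]
print(graph_sites)
```

Sites with an unknown highest Item ID are dropped here, which is an assumption on my part about how they should be handled.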

Read more

Visualizing Wikibase ecosystem, using wikibase.world

This entry is part 2 of 3 in the series Wikibase ecosystem

In October last year, I wrote a post starting to visualize the connections between Wikibases in the ecosystem that had been found and collected on wikibase.world thanks to my bot that I occasionally run. That post made use of the query service visualizations, and in this post I’ll take the visualizations a step further, making use of IPython notebooks and plotly.

Previously I reported the total number of Wikibases tracked in wikibase.world being around 784, with around 755 being active (however I didn’t write down exactly how I determined this). So I’m going to take another stab at that with some code backing up the determinations, rather than just my late night data ramblings.

All of the data shown in this post was generated on 16 Feb 2025 from the IPython notebook available on GitHub, based on the data on wikibase.world, which is maintained as a best-effort system.

General numbers

Metric | Value
Wikibases with properties | 777
Wikibases with properties, and more than 10 pages | 600
Wikibases with properties, and more than 10 pages, and 1 or more active users | 264
Wikibases with properties, and more than 10 pages, and 2 or more active users | 129
Wikibases that link to other wikibases | 194
Wikibases that only link to non Wikimedia Foundation wikibases | 5
Wikibases that link to other wikibases, excluding Wikimedia Foundation | 35

A few things of note:

  • “with properties” is used as a clear indicator that Wikibase is not only installed, but also used in at least a very basic way (i.e. it has a created Wikibase property). Ideally I would use the number of items as a measure here, however as far as I can tell, this is hard to figure out.
  • “with more than 10 pages” is my baseline measure of the site having some content, however this applies across all namespaces, so can also be wikitext pages…
  • “active users” are taken from MediaWiki statistics, and apply across all namespaces. These numbers also rely on MediaWiki being correctly maintained and these numbers actually being updated. (Users who have performed an action in the last 30 days)
  • “link to other wikibases” are links extracted from sites by Addbot either via external links or specific properties that state they are links to other wikibases. (The code is not pretty, but gives us an initial view)

And summarized in words:

  • 264 Wikibases with some content that have been edited in the past 30 days
  • 194 Wikibases link in some way to other Wikibases
    • Excluding links to Wikidata and Commons, this number comes down to 35 (So Wikidata is very much the centre)
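
To make the definitions behind these numbers concrete, here is a small sketch of how such counts could be derived from per-site records. The record fields, host names, and the choice of which hosts count as Wikimedia Foundation sites are assumptions for illustration; the actual notebook may define each metric slightly differently.

```python
# Assumed set of Wikimedia Foundation hosts for this sketch.
WMF_HOSTS = {"www.wikidata.org", "commons.wikimedia.org"}

# Hypothetical per-site records with made-up values.
wikibases = [
    {"host": "a.example", "properties": 40, "pages": 900, "active_users": 3,
     "links": {"www.wikidata.org", "b.example"}},
    {"host": "b.example", "properties": 5, "pages": 50, "active_users": 1,
     "links": {"www.wikidata.org"}},
    {"host": "c.example", "properties": 2, "pages": 8, "active_users": 0,
     "links": set()},
    {"host": "d.example", "properties": 7, "pages": 30, "active_users": 0,
     "links": {"a.example"}},
]


def count(predicate):
    """Count the sites for which the predicate holds."""
    return sum(1 for site in wikibases if predicate(site))


def has_content(site):
    return site["properties"] > 0 and site["pages"] > 10


metrics = {
    "with properties": count(lambda s: s["properties"] > 0),
    "and more than 10 pages": count(has_content),
    "and 1 or more active users": count(lambda s: has_content(s) and s["active_users"] >= 1),
    "and 2 or more active users": count(lambda s: has_content(s) and s["active_users"] >= 2),
    "link to other wikibases": count(lambda s: bool(s["links"])),
    "only link to non WMF wikibases": count(lambda s: bool(s["links"]) and not (s["links"] & WMF_HOSTS)),
    "link to other wikibases, excluding WMF": count(lambda s: bool(s["links"] - WMF_HOSTS)),
}

for name, value in metrics.items():
    print(f"{name}: {value}")
```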

And of course, take all of this with a pinch of salt; these numbers are an initial stab at trying to have an overview of the ecosystem.

An updated web

My October post included some basic visualizations from the query service of wikibase.world.

However, it’s time to get a little more fancy and interactive. (As well as showing all wikibases, not just the linked ones)

Read more

Wikidata Map in 2024

This entry is part 16 of 17 in the series Wikidata Map

Another year on from the last generation of the Wikidata map, @tarrow and @outdooracorn spent some time in preparation for the Wikidata birthday to prepare a new map (see git commits).

The latest images have already been uploaded to Wikimedia Commons, and appear in the Wikidata map commons gallery.

In this post, I’ll have a look at what has changed in the past year that is visible from the map!

Read more

COVID-19 Wikipedia pageview spikes, 2019-2022

Back in 2019 at the start of the COVID-19 outbreak, Wikipedia saw large spikes in page views on COVID-19 related topics while people were hunting for information.

I briefly looked at some of the spikes in March 2020 using the easy-to-use pageview tool for Wikimedia sites. But the problem with viewing the spikes through this tool is that you can only look at 10 pages at a time on a single site, when in reality you’d want to look at many pages relating to a topic, across multiple sites at once.

I wrote a notebook to do just this, submitted it for privacy review, and I am finally getting around to putting some of those moving parts and visualizations in public view.

Methodology

It certainly isn’t perfect, but the representation of spikes is much more accurate than looking at a single Wikipedia or a set of hand-selected pages.

  1. Find statements on Wikidata that relate to COVID-19 items
  2. Find Wikipedia site links for these items
  3. Find previous names of these pages if they have been moved
  4. Look up pageviews for all titles in the pageview_hourly dataset
  5. Compile into a gigantic table and make some graphs using plotly
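
The core of steps 3 to 5 is folding views for every known title of a page, on every site, back onto one topic. The sketch below shows that idea on a few in-memory rows; it does not query the pageview_hourly dataset, and the titles, item labels, and view counts are all placeholders invented for the example.

```python
from collections import defaultdict

# Hypothetical mapping from (project, page title) to a topic item, including
# a previous title of a page that was later moved.
title_to_item = {
    ("en.wikipedia", "Current_title"): "item-A",
    ("en.wikipedia", "Old_title"): "item-A",
    ("de.wikipedia", "Aktueller_Titel"): "item-A",
    ("en.wikipedia", "Another_page"): "item-B",
}

# Made-up rows in the shape (project, title, date, views).
rows = [
    ("en.wikipedia", "Old_title", "2020-01-20", 1200),
    ("en.wikipedia", "Current_title", "2020-01-20", 300),
    ("de.wikipedia", "Aktueller_Titel", "2020-01-20", 500),
    ("en.wikipedia", "Another_page", "2020-01-20", 50),
    ("en.wikipedia", "Unrelated_page", "2020-01-20", 9999),
    ("en.wikipedia", "Current_title", "2020-01-21", 2500),
]


def views_per_item_per_day(rows, title_to_item):
    """Sum views across all sites and all titles (old and new) of each item."""
    totals = defaultdict(int)
    for project, title, date, views in rows:
        item = title_to_item.get((project, title))
        if item is not None:  # ignore pages that are not part of the topic
            totals[(item, date)] += views
    return dict(totals)


table = views_per_item_per_day(rows, title_to_item)
for (item, date), views in sorted(table.items()):
    print(item, date, views)
```

The resulting table, keyed by item and day, is the kind of structure that can then be handed to plotly for graphing.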

I’ll come onto the details later, but first for the…

Graphics

All graphics generally show an initial peak in the run-up to the WHO declaring an international public health emergency (12 Feb 2020), and another peak starting prior to the WHO declaring a pandemic.

Be sure to have a look at the interactive views of each diagram to really see the details.

COVID-19 related Wikimedia pageviews (interactive view)

Read more

Wikidata Map in 2023

This entry is part 15 of 17 in the series Wikidata Map

It’s been 2 years since the 2021 Wikidata map. Yesterday I was sitting in the WMDE office and Lydia raised the point that we hadn’t made a map in quite some time (T331124).

Maps used to be generated in a somewhat automated fashion, but the process was rewritten in 2021 and still needs to be run by hand by someone with access to the WMF analytics platform.

Thankfully the documentation of the updates still works perfectly, and the whole process of the map generation only took a few minutes!

Read more

Wikidata Map May – November 2019

This entry is part 13 of 17 in the series Wikidata Map

It’s time for another blog post in my Wikidata map series, this time comparing the item maps that were generated on the 13th May 2019 and 11th November 2019 (roughly 6 months apart). I’ll again be using Resemble.js to generate a difference image highlighting changed areas in pink, and break down the areas that have had the greatest change throughout the 6 month period. The full comparison image can be found here.

Differences in the Wikidata map, highlighted in pink, between May 2019 and November 2019
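
For readers curious about what a difference image is doing, here is a very small sketch of the idea in Python. It is not Resemble.js and does not reproduce its comparison algorithm or options; it simply marks every pixel that differs between two equally sized images, using a pink highlight colour chosen for this example.

```python
PINK = (255, 0, 255)  # highlight colour assumed for this sketch


def diff_image(before, after, highlight=PINK):
    """Return (image, changed_fraction) for two equally sized images.

    Each image is a list of rows, and each row is a list of (r, g, b) tuples.
    Changed pixels are replaced by the highlight colour, others are kept.
    """
    if len(before) != len(after) or any(
        len(row_b) != len(row_a) for row_b, row_a in zip(before, after)
    ):
        raise ValueError("images must have the same dimensions")

    changed = 0
    result = []
    for row_b, row_a in zip(before, after):
        out_row = []
        for pixel_b, pixel_a in zip(row_b, row_a):
            if pixel_b != pixel_a:
                changed += 1
                out_row.append(highlight)
            else:
                out_row.append(pixel_a)
        result.append(out_row)

    total = sum(len(row) for row in before)
    return result, (changed / total if total else 0.0)


black, grey = (0, 0, 0), (90, 90, 90)
may = [[black, black], [grey, black]]
november = [[black, grey], [grey, black]]
image, fraction = diff_image(may, november)
print(image, fraction)
```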

If you don’t know what Wikidata is, or what items are, then give this page a read. This map shows all items that have a “coordinate location” as a light pixel on a black canvas. The more items with coordinates in a single pixel, the brighter that pixel. This map is generated using code that can be found here.
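
The pixel-brightness idea can be sketched as follows. This is not the map generation code linked above: the equirectangular projection, the grid size, and the logarithmic brightness scale are all assumptions made to keep the example short.

```python
import math


def density_grid(coordinates, width=360, height=180):
    """Count items per pixel, assuming a simple equirectangular projection.

    coordinates is an iterable of (latitude, longitude) pairs in degrees.
    """
    grid = [[0] * width for _ in range(height)]
    for lat, lon in coordinates:
        # Clamp so that lon == 180 and lat == -90 land on the last pixel.
        x = min(int((lon + 180) / 360 * width), width - 1)
        y = min(int((90 - lat) / 180 * height), height - 1)
        grid[y][x] += 1
    return grid


def brightness(count, max_count):
    """Map an item count to a 0-255 grey level (log scale, an assumption)."""
    if count == 0 or max_count == 0:
        return 0  # empty pixels stay black
    return int(255 * math.log1p(count) / math.log1p(max_count))


# A few made-up coordinates: two items share a pixel, one sits elsewhere.
points = [(52.5, 13.4), (52.6, 13.5), (-33.9, 151.2)]
grid = density_grid(points)
brightest = max(max(row) for row in grid)
pixels = [[brightness(c, brightest) for c in row] for row in grid]
print(brightest)
```

A log scale is used here so that sparsely covered areas remain visible next to very dense ones; a linear scale would also satisfy "the more items, the brighter the pixel".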

Read more

Covid-19 Wikipedia pageviews, a first look

World events often have a dramatic impact on online services. A past example would be the death of Michael Jackson, which brought down Twitter and Wikipedia and made Google believe that they were under attack, according to the BBC.

Events like the COVID-19 (Coronavirus) pandemic have a less instantaneous effect, but trends can still be seen to change. Cloudflare recently posted about some of the internet-wide traffic changes due to the pandemic and various government announcements, quarantines and lockdowns.

Currently the main English Wikipedia article for the COVID-19 pandemic is receiving roughly 1.2 million page views per day (14 per second). This article has already gone through 4 different names over the past months, and the pageview rate continues to climb.

Wikipedia pageviews tool showing English Wikipedia COVID-19 pandemic article views up to 21 March 2020 (source)

Read more