Editing Grokipedia, a first look

As a long-time editor and developer in the Wikipedia and Wikimedia space, I’m of course sceptical about what Grokipedia is trying to be, and whether it stands any chance of success. It may struggle to deliver the resilience, transparency, and community processes that keep projects like Wikipedia thriving, and in the early weeks the untouchable AI-generated content was certainly not going to work moving forward.

However, in the last week or so editing became an option, hidden behind Grok as a safeguard against abuse.

I thought I’d have a look at trying to edit a few areas of content to see what the experience is like, and capture some of the good and bad points.

In no particular order…

Broken link formatting

A fix attempt

The Donald Trump article has some broken formatting, which looks like an incorrectly parsed or formatted Markdown link that is now just showing in the HTML of the page. For posterity, I captured a copy of this version of the page on archive.ph, but here is a snapshot of how it appears.

Read more

How much is “Wikibase Suite” (and deploy) used

This entry is part 3 of 3 in the series Wikibase ecosystem

At the start of this year, I spent some time visualizing the Wikibase ecosystem by making use of the data that has been collected on wikibase.world. As part of that, I tried to focus in on Wikibase Suite, trying to determine how many possible installations there were making use of the container images, and/or the newer Wikibase Suite Deploy thing.

I came to the number 33, based on the fact that there were this many sites online on wikibase.world that have an exact match to the MediaWiki versions that have been released as part of the Wikibase Suite container images. And in all cases, this would be an overestimate, given that these versions would also be installed by some not using the images, so the likely number would be closer to ~16…

And of the 33 sites that might possibly be using “suite” as they are on the same version at least, probably 50% are installed via other means, so the “suite” installations probably account for ~16 of the wikibases in wikibase.world at a guesstimate, with ~50 using other methods and 711 using wikibase.cloud.

9th December 2025 Edit

Apparently this post attracted some attention, and I want to make it clear that “Wikibase suite” here is specifically talking about the packaging up of the container images / docker images into some single installable reusable magic component, and support system around it.

I personally believe the container images themselves are a great asset, and the idea of a suite of recommended extensions and applications that should be delivered alongside a Wikibase is also an asset that the ecosystem does need.

Also, looking at the “Wikibase Suite Team” board for “Sprint 9 (Nov 25 – Dec 9)”, federation is a key topic that is currently being worked on, and tasks like T404547 [Self-Hosting Ops] Define metric for ease of self-hosting show that the team is moving / has moved away from only thinking about the magic packaged layer.

Updated count

10 months on from this first look, while visiting the Wikimedia Germany offices, I found the need once again to try to come up with concrete numbers in terms of the users of Wikibase Suite, where my general motive would be to convince WMDE that resources are better spent in other places, such as supporting the underlying software, not just this fancy wrapper on top.

And with this new analysis, my revised number is roughly 18, of which 9 are possibly active, and 2 are likely lost to bot spam. You can find the list via this rather lengthy wikibase.world query.

Read more

Wikidata Map in 2025

This entry is part 17 of 17 in the series Wikidata Map

Another year, another map, and another Birthday for Wikidata. Last generated in 2024 by @tarrow and @outdooracorn, this year I have put the work in just ahead of the 13th Wikidata birthday to have a look at what’s changed in terms of items with coordinates this past year on Wikidata.

And here it is!

But really you need to look at the diff between previous years to see what has changed!

Read more

Slop in, craft out?

Earlier today, I sent this absolutely perfectly crafted piece of slop into GitHub Copilot…

Right, but i want thje patche sot be / and /* always

And as I already expected, due to using these LLM based coding agents and assistants continually throughout their evolution, the resulting change was exactly what I wanted, despite the poor instructions.

Now, I’m sure there is actually some difference, and likely this depends on the relevance of the typoed areas, and how often such typos might also appear in training data.

Why is this, you might ask?

Read more

Wikidata, instance of and subclass of through time (P31 & P279)

Last month I looked at all Wikimedia Commons revisions and managed to generate some data and graphs for the usage of depicts statements since they were introduced on the project.

This month, I have applied the same analysis to Wikidata, but looking at instance of and subclass of statements on items. A slightly bigger data set, however essentially the same process.

This will enable easy updating of various pie charts that have been published over the years, such as

In future, this could be easily adapted to show graphs per Wikipedia project, such as those that are currently at Wikidata:Statistics/Wikipedia

Method

The details of the method can be seen in code in my previous post about depicts statements, and this mostly stays the same.

In words:

  • Look at every revision of Wikidata ever
  • Parse the JSON to determine what values there are for P31 and P279 for each revision
  • Find the latest revision of each item in each given month, and thus find the state of all items in that month
  • Plot the data by number of items that are P31 or P279 of each value item
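The steps above can be sketched roughly as follows. This is a minimal illustration rather than the code actually used: the function names, the (item_id, timestamp, entity) input shape, and the reading of "state in a month" as "latest revision at or before the end of that month" are all assumptions, and it expects each revision to already be in the current Wikibase JSON format.

```python
from collections import Counter


def values_for(entity, prop):
    """Return the item IDs used as values of `prop` in one revision's entity JSON."""
    ids = []
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # skip "somevalue" / "novalue" snaks
        value = snak["datavalue"]["value"]
        # Some older JSON only carries "numeric-id", so fall back to it
        ids.append(value.get("id") or f"Q{value['numeric-id']}")
    return ids


def monthly_counts(revisions, month):
    """Count P31 / P279 values across the latest revision of each item.

    revisions: iterable of (item_id, iso_timestamp, entity_json) tuples (hypothetical shape)
    month: "YYYY-MM"; revisions made after this month are ignored
    """
    latest = {}
    for item_id, timestamp, entity in revisions:
        if timestamp[:7] > month:
            continue
        if item_id not in latest or timestamp > latest[item_id][0]:
            latest[item_id] = (timestamp, entity)

    counts = Counter()
    for _, entity in latest.values():
        # An item with several values is counted once per value
        counts.update(values_for(entity, "P31"))
        counts.update(values_for(entity, "P279"))
    return counts
```

ISO 8601 timestamps in the same timezone sort correctly as plain strings, which is why no date parsing is needed here.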

There are some minor defects to this logic currently that could be cleaned up with future iterations:

  • Deleted items will continue being counted, as I don’t account for the point at which items are deleted
  • Things will be double counted in this data, as 1 item may have multiple P31 and P279 values, and I don’t try to join these into higher-level concepts at all

We make an OTHER and UNALLOCATED count as part of the final data summarization. OTHER accounts for things that have not made it into the top 20 items by count, and UNALLOCATED means that we didn’t have a P31 or P279 value in the latest revision.
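As a rough sketch of that summarization step (the function name and input shapes are assumptions for illustration, not the code used for the charts), the top 20 values are kept and everything else is folded into OTHER, with UNALLOCATED passed in as a separate count of items:

```python
from collections import Counter


def summarise(counts, unallocated, top_n=20):
    """Reduce per-value counts to the top N plus OTHER and UNALLOCATED.

    counts: Counter mapping value item ID -> number of items using it
    unallocated: number of items with no P31 or P279 value in their latest revision
    """
    top = counts.most_common(top_n)
    summary = dict(top)
    # OTHER is everything that did not make it into the top N
    summary["OTHER"] = sum(counts.values()) - sum(n for _, n in top)
    summary["UNALLOCATED"] = unallocated
    return summary
```

Because one item can contribute to several values, the slices of the resulting pie describe statement usage rather than adding up to the number of items.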

2025

For August 2025 (or at least part way through it), this is the current state of Wikidata per the above method.

You can now find a PNG of this pie chart on Wikimedia Commons https://commons.wikimedia.org/wiki/File:Wikidata_P31_%26_P279_analysis_August_2025.png

Read more

Online RDS column type modification, using pt-online-schema-change from EC2

I’m using Percona Toolkit to do an online schema modification today, and thought I would document the process, especially as even the installation guides seem to be badly linked, out of date, and not working out of the box…

EC2 instance

This is all running on a t3.micro EC2 instance with Ubuntu. I deliberately didn’t go with Amazon Linux, as I wanted to be able to use apt. For simplicity’s sake, I’ll be using the EC2 Instance Connect feature, which allows connection to a session in a web browser! (although the copy and paste via this is annoying)

This instance of course also needs access to your MySQL server, in this case an RDS instance. So I’ll go ahead and add it to the security group.

Percona toolkit

Percona Toolkit is a powerful open-source collection of advanced command-line tools designed to help MySQL and MariaDB DBAs perform tasks like online schema changes, replication troubleshooting, and data auditing safely and efficiently.

It’s used at Wikimedia for online database migrations (the reason I know about it), however I have never actually used it myself!
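To give a feel for what an invocation looks like, here is a small sketch that only assembles a pt-online-schema-change command line and prints it. The host, user, database, table, and column names are placeholders, not the ones from this migration, and the helper function itself is hypothetical. It defaults to --dry-run, so nothing is changed unless --execute is asked for explicitly.

```python
import shlex


def build_osc_command(host, user, database, table, alter, execute=False):
    """Build an argv list for pt-online-schema-change (placeholder values only)."""
    # The table is addressed with a DSN: h=host, D=database, t=table
    dsn = f"h={host},D={database},t={table}"
    return [
        "pt-online-schema-change",
        "--alter", alter,          # the ALTER clause, without "ALTER TABLE <name>"
        "--user", user,
        "--ask-pass",              # prompt for the password instead of passing it
        "--execute" if execute else "--dry-run",
        dsn,
    ]


cmd = build_osc_command(
    host="example.rds.amazonaws.com",   # placeholder RDS endpoint
    user="admin",
    database="exampledb",
    table="example_table",
    alter="MODIFY COLUMN note TEXT",
)
print(shlex.join(cmd))
```

Running the printed command with --dry-run first is a cheap way to check connectivity from the EC2 instance before attempting the real change.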

Read more

What is Wikibase “Federated Properties” in 2025

I recently wrote a post looking at the history of the Wikibase “Federated Properties” feature. While at Wikimania 2025, the topic of federation came up a few times, particularly given the current discussions ongoing on the Wikidata project chat page, including discussions about WikiCite, and the recent Wikidata graph split.

All the code for the “Federated Properties” feature still exists in Wikibase code, despite a ticket being open on Phabricator to potentially delete it. And it turns out that the configuration for it still exists on wikibase.cloud too, where the feature was initially presented to the communities to try out.

So with a little bit of sneaky “hacking”, I can try to summarize the current / final state of the “Federated Properties” feature, after development during the MVP stopped some years ago.

This also means you can still try out the feature on your own wiki using the setting.

$wgWBRepoSettings['federatedPropertiesEnabled'] = true;

Creating a local property

Firstly, we need a property, and the creation workflow is exactly the same as on a normal Wikibase.

Read more

What was Wikibase “Federated Properties”

The “Federated Properties” feature allows / allowed a local Wikibase instance to access and utilise properties directly from a remote Wikibase, primarily Wikidata. Its primary purpose is to enable partial federation between a local Wikibase and Wikidata, broadening the base of available data without needing to create a property set from scratch.

I’m split between using the present and past tense here, as all of this code still exists within the Wikibase extension, however no one has used it since 2022, and it certainly doesn’t seem to be on the short or medium term (or maybe even long term) roadmaps.

This overview comes from the Wikibase – Federated Properties Phabricator project, which I’ll quote the whole of here for posterity.

Federated Properties v2 (2021)
An initiative to give users the ability to access remote properties from their local Wikibase and use them in combination with custom local properties. The primary use case is enabling partial federation between a Wikibase and Wikidata. This version of the feature will allow you to:

  • Opt-in to use Wikidata’s properties in addition to your own custom local properties
  • Create and view statements about local entities that contain both local and federated properties
  • Query your Wikibase using both local and federated properties

Federated Properties v1 (2020-2021)
An initiative to give users the ability to access remote properties from their local Wikibase (no local properties were possible in this MVP). This version was launched in the Wikibase Spring Release in May 2021.

As far as I remember, the project died with v2, and I don’t even recall if v2 really saw the light of day outside WMDE internal testing and/or hidden testing on wikibase.cloud.

Read more

Wikimedia Commons Depicts statements over time

Wikimedia Commons now uses Structured Data on Commons (SDC) to make media information multilingual and machine-readable. A core part of SDC is the ‘depicts’ statement (P180), which identifies items clearly visible in a file. Depicts statements are crucial for MediaSearch, enabling it to find relevant results in any language by using Wikidata labels, as well as having a more precise definition and structure than the existing category structures.
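For a concrete picture of what a depicts statement looks like in the data, here is a minimal sketch that pulls the depicted item IDs out of a file’s MediaInfo JSON. The function name and the example entity are made up for illustration, and it assumes the current JSON layout, where MediaInfo entities keep their statements under a "statements" key.

```python
def depicted_items(mediainfo):
    """Return the item IDs a file depicts (P180), given its MediaInfo JSON."""
    # Entities without statements can serialize "statements" as an empty list
    statements = mediainfo.get("statements") or {}
    ids = []
    for statement in statements.get("P180", []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            ids.append(snak["datavalue"]["value"]["id"])
    return ids


# Made-up example entity with a single depicts statement
example = {
    "type": "mediainfo",
    "id": "M1",
    "statements": {
        "P180": [
            {"mainsnak": {"snaktype": "value", "property": "P180",
                          "datavalue": {"type": "wikibase-entityid",
                                        "value": {"entity-type": "item", "id": "Q146"}}}}
        ]
    },
}
print(depicted_items(example))
```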

SDC functionalities began to roll out in 2019. Multilingual file captions were introduced early that year, enabling broader accessibility, followed by the ability to add depicts statements directly on file pages and through the UploadWizard.

Although there are numbers floating around showing a general increase in usage of structured data on Commons, there didn’t seem to be any concrete numbers around the growth in use of depicts statements.

I was particularly interested in this, as my tool WikiCrowd is steadily becoming a more and more efficient way of adding these statements en masse. So I decided to see what data I could come up with.

Read more

Easy WSL Windows path switching alias

I have been primarily developing on WSL for some years now, and still love the combination in terms of all around flexibility. When primarily working on Linux based or focused applications, everything is lovely! However, I’m spending more time straying into the land of hardware, USB devices, and custom IDEs and debug interfaces that are … Read more