As a long-time editor and developer in the Wikipedia and Wikimedia space, I’m of course sceptical about what Grokipedia is trying to be, and whether it stands any chance of success. It may struggle to deliver the resilience, transparency, and community processes that keep projects like Wikipedia thriving, and in the early weeks the untouchable AI-generated content was certainly not going to work moving forward.
However, in the last week or so, editing has become an option, gated behind Grok as a safeguard against abuse.
I thought I’d have a look at trying to edit a few areas of content to see what the experience is like, and capture some of the good and bad points.
In no particular order…
Broken link formatting
A fix attempt
The Donald Trump article has some broken formatting, which looks like an incorrectly parsed or formatted Markdown link that is now just showing in the HTML of the page. For posterity, I captured a copy of this version of the page on archive.ph, but here is a snapshot of how it appears.
I came to the number 33 based on the fact that there were this many sites online on wikibase.world with an exact match to the MediaWiki versions that have been released as part of the Wikibase Suite container images. In all likelihood this is an overestimate, given that these versions would also be installed by some people not using the images, so the real number is probably closer to ~16…
And of the 33 sites that might possibly be using the “suite”, as they are at least on the same version, probably 50% are installed via other means. So at a guesstimate, the “suite” installations account for ~16 of the wikibases on wikibase.world, with ~50 using other methods and 711 using wikibase.cloud.
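The guesstimate above is simple arithmetic; as a quick sketch (the 50% split is the stated assumption, not a measured value):

```python
# Numbers from the post: 33 wikibase.world sites exactly match a MediaWiki
# version shipped in the Wikibase Suite container images.
version_matches = 33
suite_fraction = 0.5  # assumption: roughly half of the matches actually use the images

suite_estimate = round(version_matches * suite_fraction)
print(suite_estimate)  # 16, matching the ~16 guesstimate
```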
9th December 2025 Edit
Apparently this post attracted some attention, so I want to make it clear that “Wikibase Suite” here specifically refers to the packaging of the container / Docker images into a single installable, reusable magic component, and the support system around it.
I personally believe the container images themselves are a great asset, and the idea of a suite of recommended extensions and applications that should be delivered alongside a Wikibase is also an asset that the ecosystem does need.
10 months on from this first look, while visiting the Wikimedia Germany offices, I found the need once again to try to come up with concrete numbers for the users of Wikibase Suite, my general motive being to convince WMDE that resources are better spent in other places, such as supporting the underlying software, not just this fancy wrapper on top.
And with this new analysis, my revised number is roughly 18, of which 9 are possibly active, and 2 are likely lost to bot spam. You can find the list via this rather lengthy wikibase.world query.
This entry is part 17 of 17 in the series Wikidata Map
Another year, another map, and another birthday for Wikidata. Last generated in 2024 by @tarrow and @outdooracorn, this year I have put the work in just ahead of the 13th Wikidata birthday to look at what’s changed in terms of items with coordinates on Wikidata this past year.
And here it is!
But really you need to look at the diff between previous years to see what has changed!
Earlier today, I sent this absolutely perfectly crafted piece of slop into GitHub Copilot…
Right, but i want thje patche sot be / and /* always
And as I already expected, having used these LLM-based coding agents and assistants continually throughout their evolution, the resulting change was exactly what I wanted, despite the poor instructions.
Now, I’m sure there is actually some difference, and likely this depends on the relevance of the typoed areas, and how often such typos might also appear in training data.
This month, I have applied the same analysis to Wikidata, but looking at instance of (P31) and subclass of (P279) values on items. A slightly bigger data set, but essentially the same process.
This will enable easy updating of various pie charts that have been published over the years, such as
Parse the JSON to determine what values there are for P31 and P279 for each revision
Find the latest revision of each item in each given month, and thus find the state of all items in that month
Plot the data by number of items that are P31 or P279 of each value item
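The steps above can be sketched in a few lines of Python. This is a minimal illustration of the per-month bookkeeping, not the actual pipeline: the function and variable names are mine, and the sample revisions are made up (the real input is parsed from Wikidata JSON):

```python
from collections import defaultdict

# Each revision: (item_id, "YYYY-MM", set of P31/P279 value items in that revision).
# Hypothetical sample data for illustration only.
revisions = [
    ("Q1", "2025-07", {"Q5"}),
    ("Q1", "2025-08", {"Q5", "Q215627"}),
    ("Q2", "2025-08", {"Q5"}),
]

def state_for_month(revisions, month):
    """Latest known P31/P279 values per item, as of the end of `month`."""
    latest = {}
    for item, rev_month, values in sorted(revisions, key=lambda r: r[1]):
        if rev_month <= month:
            latest[item] = values  # later revisions overwrite earlier ones
    return latest

def counts_by_value(state):
    """Number of items carrying each P31/P279 value (an item with several
    values is counted once per value, hence the double counting noted below)."""
    counts = defaultdict(int)
    for values in state.values():
        for value in values:
            counts[value] += 1
    return dict(counts)

state = state_for_month(revisions, "2025-08")
print(sorted(counts_by_value(state).items()))  # [('Q215627', 1), ('Q5', 2)]
```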
There are some minor defects to this logic currently that could be cleaned up with future iterations:
Deleted items will continue being counted, as I don’t take into account the point at which items are deleted
Things will be double counted in this data, as one item may have multiple P31 and P279 values, and I don’t try to join these into higher-level concepts at all
We make an OTHER and UNALLOCATED count as part of the final data summarization. OTHER accounts for things that have not made it into the top 20 items by count, and UNALLOCATED means that we didn’t have a P31 or P279 value in the latest revision.
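The OTHER / UNALLOCATED summarization described above can be sketched like this (a minimal illustration with made-up sample items and my own function names, not the real code; `top_n` is 20 in the actual analysis):

```python
from collections import Counter

def summarize(item_values, top_n=20):
    """Collapse per-item P31/P279 values into the top-N value buckets,
    plus OTHER (values outside the top N) and UNALLOCATED (items whose
    latest revision had no P31 or P279 value at all)."""
    counts = Counter()
    unallocated = 0
    for values in item_values.values():
        if not values:
            unallocated += 1
        for value in values:
            counts[value] += 1
    summary = dict(counts.most_common(top_n))
    summary["OTHER"] = sum(counts.values()) - sum(summary.values())
    summary["UNALLOCATED"] = unallocated
    return summary

# Hypothetical sample: four items with values, one with none.
items = {
    "Q1": {"Q5"},
    "Q2": {"Q5"},
    "Q3": {"Q515"},
    "Q4": {"Q4022"},
    "Q9": set(),
}
print(summarize(items, top_n=2))
```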
2025
For August 2025 (or at least part way through it), this is the current state of Wikidata per the above method.
I’m using Percona Toolkit to do an online schema modification today, and thought I would document the process, especially as even the installation guides seem to be badly linked, out of date, and did not work out of the box…
EC2 instance
This is all running on a t3.micro EC2 instance with Ubuntu. I deliberately didn’t go with Amazon Linux, as I wanted to be able to use apt. For simplicity’s sake, I’ll be using the EC2 Instance Connect feature, which allows connection to a session in a web browser! (although the copy and paste via this is annoying)
This instance of course also needs access to your MySQL server, in this case an RDS instance. So I’ll go ahead and add it to the security group.
Percona toolkit
Percona Toolkit is a powerful open-source collection of advanced command-line tools designed to help MySQL and MariaDB DBAs perform tasks like online schema changes, replication troubleshooting, and data auditing safely and efficiently.
It’s used at Wikimedia for online database migrations (which is the reason I know about it), however I have never actually used it myself!
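For reference, the rough shape of an installation and a schema change looks like the following. These commands need a live MySQL server to do anything useful, and the host, database, table, and column names are placeholders, so adapt before running:

```shell
# Install Percona Toolkit on Ubuntu via Percona's apt repository
wget https://repo.percona.com/apt/percona-release_latest.generic_all.deb
sudo dpkg -i percona-release_latest.generic_all.deb
sudo apt-get update
sudo apt-get install -y percona-toolkit

# Dry-run an online ALTER against the RDS instance
# (host, user, database, table, and column are placeholders)
pt-online-schema-change \
  --host my-rds-instance.example.rds.amazonaws.com \
  --user admin --ask-pass \
  --alter "ADD COLUMN new_column INT NULL" \
  D=my_database,t=my_table \
  --dry-run

# Re-run with --execute in place of --dry-run once the dry run looks sane
```

The `--dry-run` / `--execute` split is worth keeping: pt-online-schema-change refuses to do anything destructive until you explicitly pass `--execute`.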
I recently wrote a post looking at the history of the Wikibase “Federated Properties” feature. While at Wikimania 2025, the topic of federation came up a few times, particularly given the discussions currently ongoing on the Wikidata project chat page, including discussions about WikiCite, and the recent Wikidata graph split.
All the code for the “Federated Properties” feature still exists in Wikibase code, despite a ticket being open on phabricator to potentially delete it. And it turns out that the configuration for it still exists on wikibase.cloud too, where the feature was initially presented to the communities to try out.
So with a little bit of sneaky “hacking”, I can try to summarize the current / final state of the “Federated Properties” feature, after development during the MVP stopped some years ago.
This also means you can still try out the feature on your own wiki using the setting.
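As a sketch, enabling it looks something like the LocalSettings.php fragment below. The setting names are as I remember them from the Wikibase repo options documentation, so verify against the docs for your Wikibase version before relying on them:

```php
// In LocalSettings.php, after Wikibase Repo is loaded.
// Setting names per the Wikibase options docs as I remember them; verify
// against your Wikibase version.
$wgWBRepoSettings['federatedPropertiesEnabled'] = true;
// Script path of the source wiki to fetch properties from (here, Wikidata).
$wgWBRepoSettings['federatedPropertiesSourceScriptUrl'] = 'https://www.wikidata.org/w/';
```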
The “Federated Properties” feature allows / allowed a local Wikibase instance to access and utilise properties directly from a remote Wikibase, primarily Wikidata. Its primary purpose is to enable partial federation between a local Wikibase and Wikidata, broadening the base of available data without needing to create a property set from scratch.
I’m split between using the present and past tense here, as all of this code still exists within the Wikibase extension, however no one has used it since 2022, and it certainly doesn’t seem to be on the short or medium term (or maybe even long term) roadmaps.
This overview comes from the Wikibase – Federated Properties Phabricator project, which I’ll quote the whole of here for posterity.
Federated Properties v2 (2021) An initiative to give users the ability to access remote properties from their local Wikibase and use them in combination with custom local properties. The primary use case is enabling partial federation between a Wikibase and Wikidata. This version of the feature will allow you to:
Opt-in to use Wikidata’s properties in addition to your own custom local properties
Create and view statements about local entities that contain both local and federated properties
Query your Wikibase using both local and federated properties
Federated Properties v1 (2020-2021) An initiative to give users the ability to access remote properties from their local Wikibase (no local properties were possible in this MVP). This version was launched in the Wikibase Spring Release in May 2021.
As far as I remember, the project died with v2, and I don’t even recall if v2 really saw the light of day outside WMDE internal testing and/or hidden testing on wikibase.cloud.
Wikimedia Commons now uses Structured Data on Commons (SDC) to make media information multilingual and machine-readable. A core part of SDC is the ‘depicts’ statement (P180), which identifies items clearly visible in a file. Depicts statements are crucial for MediaSearch, enabling it to find relevant results in any language by using Wikidata labels, as well as having more precise definition and structure than the existing category system.
SDC functionalities began to roll out in 2019. Multilingual file captions were introduced early that year, enabling broader accessibility, followed by the ability to add depicts statements directly on file pages and through the UploadWizard.
Although there are numbers floating around showing a general increase in usage of structured data on Commons, there didn’t seem to be any concrete numbers around the growth in use of depicts statements.
I was particularly interested in this, as my tool WikiCrowd is steadily becoming a more and more efficient way of adding these statements en masse. So I decided to see what data I could come up with.
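The basic counting is straightforward once you have the MediaInfo entities in hand. A minimal sketch, assuming the entity JSON shape I believe Commons uses (statements keyed by property ID under a "statements" key; the sample entities here are made up):

```python
# Count depicts (P180) statements across a batch of MediaInfo entities.
# Hypothetical sample data; the real input would come from dumps or the API.
entities = [
    {"id": "M1", "statements": {"P180": [{"id": "M1$a"}, {"id": "M1$b"}]}},
    {"id": "M2", "statements": {"P180": [{"id": "M2$a"}]}},
    {"id": "M3", "statements": {}},
]

def count_depicts(entities):
    """Return (files with at least one depicts statement, total depicts statements)."""
    files_with_depicts = 0
    total_statements = 0
    for entity in entities:
        depicts = entity.get("statements", {}).get("P180", [])
        if depicts:
            files_with_depicts += 1
        total_statements += len(depicts)
    return files_with_depicts, total_statements

print(count_depicts(entities))  # (2, 3)
```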
I have been primarily developing on WSL for some years now, and still love the combination in terms of all around flexibility. When primarily working on Linux based or focused applications, everything is lovely! However, I’m spending more time straying into the land of hardware, USB devices, and custom IDEs and debug interfaces that are …