Wikibase: a history
I have had the pleasure of being part of the Wikibase journey one way or another since 2013 when I first joined Wikimedia Germany to work on Wikidata. That long-running relation to the project should put me in a fairly good position to give a high-level overview of the history, from both a technical and higher-level perspective. So here it goes.
Because I came to Wikibase through Wikidata, a lot of the early timeline is quite heavy on the Wikidata side. There are certainly some key points missing; if you think something is worth mentioning, leave a comment or reach out!
At Wikimania 2005 there was a series of talks on “Semantic web”. One of these was Wikipedia and the Semantic Web – The Missing Links, and this Wikimania led to the creation of Semantic MediaWiki.
The WikiProject “Semantic MediaWiki” provides a common platform for discussing extensions of the MediaWiki software that allow for simple, machine-based processing of Wiki-content. This usually requires some form of “semantic annotation,” but the special Wiki environment and the multitude of envisaged applications impose a number of additional requirements.
Source: Semantic MediaWiki at 22:29, 2 January 2006
The initial version of Semantic MediaWiki was released in late 2005 (version 0.1), with 4.0.0 being released at the start of 2022.
If you read through the Wikimania and connected resources carefully, you’ll find a reference to Wikidata already, though at this point Wikidata is only a project proposal.
Wikidata is a proposed wiki-like database for various types of content. This project as proposed here requires significant changes to the software (or possibly a completely new software) but has the potential to centrally store and manage data from all Wikimedia projects, and to radically expand the range of content that can be built using wiki principles.
Source: Wikidata/Archive/Wikidata/historical at 19:16, 30 November 2005
And ultimately the Wikidata project led to the creation of the Wikibase software.
There was certainly some work behind the scenes between 2005 and 2012 but most of this seemingly doesn’t have a super public record. There certainly will have been work done on the project proposal, and ongoing discussions with the Wikimedia Foundation about the project.
In March 2012, the Wikimedia Foundation and Wikimedia Germany jointly announced “The Wikipedia data revolution”.
- Wikimedia Foundation: The Wikipedia data revolution
- Wikimedia Germany: Data Revolution for Wikipedia (archive.org press release)
Wikimedia Deutschland, the German chapter of the Wikimedia movement, and the Wikimedia Foundation are proud to announce Wikidata, a collaboratively edited database of the world’s knowledge and the first new Wikimedia project since 2006.
Source: The Wikipedia data revolution (Wikimedia Foundation)
If you want a video introduction from 2012 take a look at this video from SMWCon Fall 2012 in a session called “Wikidata: Semantic Wikipedia”.
If you want to know the original goals of the Wikidata project, and thus the Wikibase software, take a look here. (Maybe I should write some of this up soon)…
In April 2012, Jeroen De Dauw created the initial content of the first Wikibase extension page on mediawiki.org. And thus Wikibase was born.
- client, repo, lib: The three sub-extensions that have been a part of the Wikibase git repository since the early days. The repo is for Wikidata.org, the client is for Wikipedias, and lib contains shared code.
- terms (labels, descriptions, aliases): So that concepts can be identified in language
- sitelinks: Connections from Wikibase to other MediaWiki sites such as Wikipedia
- Namespaces for “data”, now “items”, properties, and queries.
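To make terms and sitelinks a little more concrete, here is a minimal Python sketch of an item in the general shape of Wikibase's JSON serialization. The values are illustrative rather than copied from a live wiki, and the helper functions `label_for` and `sitelink_title` are hypothetical names made up for this example, not part of Wikibase.

```python
# Illustrative item in the general shape of Wikibase's JSON serialization.
# The values are example data, not copied from a live wiki.
item = {
    "id": "Q42",
    "type": "item",
    "labels": {
        "en": {"language": "en", "value": "Douglas Adams"},
        "de": {"language": "de", "value": "Douglas Adams"},
    },
    "descriptions": {
        "en": {"language": "en", "value": "English writer and humorist"},
    },
    "aliases": {
        "en": [{"language": "en", "value": "Douglas Noel Adams"}],
    },
    "sitelinks": {
        "enwiki": {"site": "enwiki", "title": "Douglas Adams"},
    },
}


def label_for(entity, language, fallback="en"):
    """Return the label in `language`, then in `fallback`, else the entity id."""
    labels = entity.get("labels", {})
    for code in (language, fallback):
        if code in labels:
            return labels[code]["value"]
    return entity["id"]


def sitelink_title(entity, site):
    """Return the page title linked on `site`, or None if there is no sitelink."""
    link = entity.get("sitelinks", {}).get(site)
    return link["title"] if link else None
```

Terms are keyed by language code and sitelinks by site id, which is what lets one item be described in many languages and linked to many wikis at once.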
In 2013 I joined the team 🎉🎉🎉. And there are some things that I distinctly remember:
- There were continued discussions around how to get started with a query service
- Multiple libraries were split out to be reusable outside of the main Wikibase codebase such as DataValues, DataTypes, DataModel (Some of these were created as MediaWiki extensions before later being turned into libraries.)
- We were still doing a phased rollout of Wikidata to various Wikimedia projects (phase 1 being sitelinks)
- I personally remember working on the Wikibase Action API somewhat, adding item merge functionality.
The main hidden gem worth pointing out about 2013 is that some portion of time was spent developing WikibaseQuery, WikibaseQueryEngine and WikibaseDatabase, which never saw the light of day. These were primarily built to meet the first use case of “Query by one property and one value”.
It may seem insignificant, but 2014 saw the first version of the wikiba.se website.
Wikibase is a collection of applications and libraries for creating, managing and sharing structured data. It is an open source project, and everyone is welcome to join in development.
Source: wikiba.se in 2014
JSON dumps of Wikidata were created for the first time this year.
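As a rough illustration of how such a dump can be consumed, the Python sketch below assumes the layout used by the Wikidata JSON dumps: one large JSON array with a single entity per line, so a file can be streamed line by line instead of being loaded whole. The function name `iter_entities` and the tiny in-memory sample are made up for this example.

```python
import io
import json


def iter_entities(lines):
    """Yield one entity dict per line of a one-entity-per-line JSON array dump."""
    for line in lines:
        # Each entity line ends with a comma, except the last one.
        line = line.strip().rstrip(",")
        # Skip blank lines and the array's opening and closing brackets.
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)


# A tiny stand-in for a real dump file (hypothetical content).
sample = io.StringIO(
    "[\n"
    '{"id": "Q1", "type": "item"},\n'
    '{"id": "P1", "type": "property"}\n'
    "]\n"
)
ids = [entity["id"] for entity in iter_entities(sample)]
```

For a real dump, the same generator could be fed from a decompressing file object (for example one opened with `gzip.open` or `bz2.open` in text mode) instead of the `StringIO` sample.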
The various query related extensions developed were archived, as the Wikimedia Foundation had a need for both simple and complex queries for a project called WikiGrok. Work kicked off at the foundation looking into Wikibase indexing needs and goals.
The news of the year was certainly that the Wikidata Query Service was launched by the Discovery team at the Wikimedia Foundation. This was the SPARQL and Blazegraph implementation that we have now been using next to Wikibase for the past 7 years.
A side note here is that Titan was originally evaluated, but it looks like it was dropped after the team behind it was bought by DataStax to build a new graph database (ironically, the same thing happened with Blazegraph a few years later).
The SPARQL endpoint also saw the completion of the RDF mapping for Wikibase, so now we have stable RDF output.
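With a SPARQL endpoint and RDF mapping in place, the earlier “Query by one property and one value” use case becomes a single triple pattern. The Python sketch below only builds the query string and does not send it anywhere. It assumes the `wd:` and `wdt:` prefixes that the Wikidata Query Service predefines, and the function name `one_property_one_value` is made up for this example.

```python
import re


def one_property_one_value(property_id, item_id, limit=10):
    """Build a SPARQL query for items whose `property_id` has the value `item_id`."""
    # Validate the ids so that nothing unexpected is interpolated into the query.
    if not re.fullmatch(r"P[1-9]\d*", property_id):
        raise ValueError(f"not a property id: {property_id!r}")
    if not re.fullmatch(r"Q[1-9]\d*", item_id):
        raise ValueError(f"not an item id: {item_id!r}")
    return (
        "SELECT ?item WHERE { "
        f"?item wdt:{property_id} wd:{item_id} . "
        "} "
        f"LIMIT {int(limit)}"
    )


# On Wikidata, P31 is "instance of" and Q5 is "human".
query = one_property_one_value("P31", "Q5")
```

The resulting string can be pasted into the query service UI. On another Wikibase the prefixes and ids would differ, so treat them as placeholders.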
Generally speaking, the Wikibase extension itself looks very similar to the early years, but extensions such as WikibaseQualityConstraints were developed and deployed to Wikidata.
Wikibase code docs are now built to doc.wikimedia.org (patch).
I’m sure other things happened this year, but things really start to pick up in 2017! ;)
Wikibase docker images saw the light of day to try and make Wikibase easier to get started with.
I feel that this really was a springboard enabling many more folks to try out Wikibase for their own projects locally, and also to run production instances.
Code wise, the “data-access” component appeared for the first time in Wikibase.git.
On 23-25 April 2018, a “Workshop on harnessing open data for Monitoring and Evaluation” is taking place in Antwerp (Q12892), focused on using Wikibase (Q16354758) instances federated with Wikidata (Q2013) in the context of research assessment (Q51844619).
Source: Wikidata:WikiProject Wikidata for research/Meetups/2018-04-23-25-Antwerpen
This round of workshops showed a real momentum increase around interest in Wikibase. At this point, although there were technical developments ongoing on the Wikibase software, these were still all primarily driven from a Wikidata perspective.
A first Wikibase Ecosystem strategy paper was published. At a high level this said “Wikibase powers a thriving linked open data web that is the backbone of free and open knowledge”, looking at some key areas:
- Focus on enabling connections between data and people
- Partner with the main players in their field, utilize network effects and branch out
- Leverage mandates to open up data
- Maximize the competitive advantage gained via Wikidata
Things start getting a little easier here, as Envel Le Hir has started collecting yearly summaries of Wikibase, such as “Wikibase Yearly Summary 2020”. I highly recommend reading these for a full overview, but I’ll extract some key points here.
- The first Product Manager for Wikibase was hired by Wikimedia Germany.
- There was the first online meeting of the Wikibase Community User Group.
- WBStack, the first Wikibase as a service, was open sourced.
- Semantic Wikibase was released, which is a connection between Semantic MediaWiki and Wikibase.
- Local Media Support for Wikibase was developed by Professional.Wiki.
- WikibaseManifest was created by Wikimedia Germany.
Code wise the introduction of “packages” in Wikibase.git happened!
This year Wikibase got its own all-important Twitter account. More and more workshops and projects around Wikibase were created, including a series of working hours around WBStack. Great projects exposing user needs were created, such as RaiseWikibase. The year also brought federated properties, blog posts, WikidataCon 2021 and more.
The Wikibase stakeholder group is thriving with 17 organizational members, and 26 individual members. Institutional requirements have been collected and presented, and the group even has a budget to work with, and also a Twitter account!
- Empower knowledge curators to share their data: Increase the number and diversity of Wikibases that can eventually be connected to the LOD web.
- Ecosystem enablement: Enable an ecosystem of extensions as well as tools and custom interfaces based on WB APIs to emerge around Wikibase, extending the functionality of the software for more use cases.
- Connect data across technological & institutional barriers: Ensure Wikibases can connect more deeply with each other and Wikidata to form an LOD web.
Code wise, some of the libraries that were split out of Wikibase.git back in 2013 were moved back into the code base to be managed as a monorepo.
It’s only February, and the next thing on the cards for Wikibase is the Wikibase.cloud offering by Wikimedia Deutschland to replace wbstack.com.
Lots still to happen here, as I am writing this in February :)