Wikibase Phrase Entity, Viewing

This entry is part 7 of 7 in the series Wikibase Entities

In my previous post, we got to the point of being able to create a new Wikibase Entity. It is stored in the MediaWiki database as a page, but we can’t yet view it via any interface.

In this post, we will work through another set of code changes, tackling each issue as we see it arise, until we can see the entity represented in the various places that users might expect.

Viewing the page

The provided entity serialization is neither legacy nor current

When clicking on one of the links on Special:RecentChanges to a phrase page that we have created, we get our first error.

/wiki/Phrase:Phrase66900b01937842.29097733 MWContentSerializationException: The provided entity serialization is neither legacy nor current
from /var/www/html/w/extensions/Wikibase/lib/includes/Store/EntityContentDataCodec.php(253)

The full stack trace is a little large, but you can find it in a paste bin.

This error is very similar to an issue we saw in the creation blog post, but this time the codec class cannot deserialize what we have stored in the database, because we have not registered a deserializer for phrases.
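To make the failure mode concrete, here is a minimal sketch of a codec that dispatches to per-type deserializers. It is written in Python purely for illustration and is not Wikibase's PHP code: the class names, the `type` field, and the shape of the stored phrase are all assumptions. It only shows the general pattern of decoding failing until a deserializer for the type has been registered.

```python
class SerializationError(Exception):
    """Raised when no registered deserializer recognises the stored data."""


class DispatchingCodec:
    """Hypothetical codec: picks a deserializer based on the stored entity type."""

    def __init__(self):
        # Maps an entity type name (e.g. "item", "phrase") to a callable
        # that turns the stored dict into an in-memory object.
        self._deserializers = {}

    def register(self, entity_type, deserializer):
        self._deserializers[entity_type] = deserializer

    def decode(self, stored):
        entity_type = stored.get("type")
        deserializer = self._deserializers.get(entity_type)
        if deserializer is None:
            # Nothing knows how to read this data, so decoding fails.
            raise SerializationError(
                "no deserializer registered for type %r" % entity_type
            )
        return deserializer(stored)


codec = DispatchingCodec()
# Hypothetical stored form of a phrase; the real stored JSON may differ.
stored_phrase = {"type": "phrase", "id": "Phrase1", "text": "hello"}

# Before registration, decoding the stored phrase fails.
try:
    codec.decode(stored_phrase)
    failed_before_registration = False
except SerializationError:
    failed_before_registration = True

# After registering a deserializer for "phrase", the same data decodes.
codec.register("phrase", lambda data: (data["id"], data["text"]))
decoded = codec.decode(stored_phrase)
```

In this sketch the fix is a single `register` call, which mirrors why a missing registration entry is enough to break page views.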

Adding a deserializer to the entity registration file is very simple:

Read more

Wikibase Phrase Entity, Creation

This entry is part 6 of 7 in the series Wikibase Entities

Finally, after a long lead-up discussing what an entity is, looking at some examples of entity extensions, and at one extension that chose not to make use of the Wikibase Entity system and EntityDocument, we reach the real question: what does it take to create a new type of data entity within Wikibase that implements the EntityDocument interface and makes use of the various integrations that have evolved over the past 10+ years?

I slapped together a very rough branch exploring this in 2022, but it’s hard to follow at best, and doesn’t really discuss any of the challenges that crop up along the way. This post, and those following, are the redo, with much more context. With any luck, it will work mostly as before, as Wikibase hasn’t changed much internally in the last 2 years when it comes to how Entities are handled.

If you want to follow along, you’ll need a development environment, and for that I would recommend the mwcli walkthrough that I wrote in recent weeks.

Where to start

I have a slight advantage here, as the closest thing to documentation on how to add a new entity type to Wikibase is the documentation of the various fields that make up the entity registration system.

Beyond that, your only way in would likely be to start looking at one of the extensions that already provides an additional entity type, such as WikibaseMediaInfo, and the entity type registration that it makes. But each of these extensions comes with its own complexity to muddle your view.

Read more

EntitySchema, and the entity flip-flop

This entry is part 5 of 7 in the series Wikibase Entities

The EntitySchema extension, previously called WikibaseSchema, has had an interesting life since its initial creation back in early 2019.

The main point this story is intended to highlight is that EntitySchema started off its planned life as an Entity within a Wikibase. As the development team started work on an initial version, it flipped away from an entity. And in continued development, it has slowly inched its way back towards perhaps being an Entity.

Background

As is noted in the first ADR of the extension (which was actually written in 2023), the team initially decided to try to develop the extension entirely separately from Wikibase:

Although Entity Schemas relate to Wikibase entities by name and purpose, the implementation of the EntitySchema extension, at the time of this decision, is completely decoupled from Wikibase, and the concept of Entities that it adds to MediaWiki. Thus, a MediaWiki instance can theoretically operate with only the EntitySchema extension, and without the Wikibase extension installed.

Keeping EntitySchema separate from Wikibase, and the idea of an Entity it provides altogether, was a conscious decision to not marry its implementation to the inherent complexity of Wikibase itself. As well as an attempt to avoid overloading EntitySchema with unnecessary functionality so that its ongoing implementation could be done iteratively and in a more flexible, organic manner, to answer user’s needs as they are brought to us.

0001 Extend Entity Schema to support additional “traits” ADR

In a nutshell, this extension, and the developments and discussions about it over the past years (which are still happening today), were among the things that led me to recently write a series of blog posts about what I think an “entity” is, as well as looking at some other entities, and the use of EntityDocument in the codebase.

Project kick-off

Within WMDE, the extension started off (having already been planned and discussed for some time) with a series of kick-off meetings in December 2018. The first of these was deemed to have left too many open questions, hence the second, follow-up meeting. Ultimately, a team formed around the creation of the extension, and this started further discussions.

Read more

Wikibase Repository development environment (mwcli)

This entry is part 4 of 7 in the series Wikibase Entities

Back in 2022, while working at Wikimedia Germany, I ran two sessions with people from the Wikibase Stakeholder Group, focused on Ecosystem Enablement.

These sessions were video recorded and documented in quite a lot of detail, but following along with the videos would probably be a drawn-out experience, as they were built around a workshop setting with participants following along.

  • Session 1, 2022-04-28: Using mwcli, loading extensions, understanding Mediawiki’s general extension mechanism (Video, Overview)
  • Session 2, 2022-05-24: Running your first extension, Wikibase stable interface policy, Mediawiki hooks, building a new API function (Video, Overview)

In this post, I will focus on the core steps required to get a MediaWiki and Wikibase Repository development environment set up in a few minutes with mwcli. The post will also serve as a basis for some blog posts that I will be writing in the future.

Getting mwcli

If you head to the home page of mwcli, you’ll see a link to an installation guide.

Read more

Lexeme and MediaInfo, implementing EntityDocument

This entry is part 3 of 7 in the series Wikibase Entities

As we continue the journey, looking at Entity and EntityDocument within Wikibase, another useful pair of things to look at is the third and fourth most widely used (at least within the Wikimedia space) entity types for Wikibase.

Both of these entity types make use of EntityDocument, with none of the old assumptions that were baked into the Entity base class that used to exist.

MediaWiki extensions

As these entity types were decoupled from the main body of Wikibase, they were developed as MediaWiki extensions: WikibaseMediaInfo (https://www.mediawiki.org/wiki/Extension:WikibaseMediaInfo) and WikibaseLexeme (https://www.mediawiki.org/wiki/Extension:WikibaseLexeme).

This was the easy choice at the time, and probably still makes perfect sense, as Wikibase itself is a MediaWiki extension, and there is already a common pattern of extensions extending extensions. This saves some work around coding an extension mechanism, though we should remember that ultimately the Wikibase codebase has free choice when it comes to choosing how it can be extended.

Read more

Wikibase, from Entity to EntityDocument

This entry is part 2 of 7 in the series Wikibase Entities

The term document has already come up a few times while discussing what a Wikibase entity is, and whether that should change (be that in name only, code or structures), including in my first post of this series.

Looking at the very first definition of entity in the DuckDuckGo search that I performed 6 seconds ago, an entity is:

Something that exists as a particular and discrete unit.

The American Heritage® Dictionary of the English Language, 5th Edition

At the most basic level, it’s fairly straightforward to say that a Wikibase doesn’t hold the actual entities (such as a type of tree), rather data about said entities.

And in a nutshell, this data is collected within a document.

Image from “What is the semantic web” by ontotext.com

Quoting a few choice people again, before diving deeper into this topic…

The “entities” in the Wikibase base are not Entities. They are descriptions of entities. The entity is the thing in the world not the data we have about it, even tough colloquially, we don’t make the distinction. But we have separate URIs for the thing and the description in the abstract and for specific renderings.
I think that’s important to mention when discussing what an entity “is”.

Daniel Kinzler in conversation, June 2024

The data model chose to use the term “Entity” for the top-level Thing/class in the hierarchy of the data model. But in reality, a better term would have been “Document” or “Record”. In general, the confusion is often due simply to folks that are more familiar with one of the domains than the other, between OOP Objects and Semantic Web Objects.

Thad Guidry in a comment, June 2024

Read more

Wikibase: What is an entity?

This entry is part 1 of 7 in the series Wikibase Entities

I left the Wikidata and Wikibase teams roughly a year ago, and at the time there were some long and deep discussions going on inside the team trying to define what an entity was, and what should and should not be an entity.

At the recent Hackathon in Tallinn, this topic resurfaced for me, as current and previous members of the Wikidata and Wikibase teams were in attendance, along with myself.

I have opinions, others have opinions, and I feel that a short blog post summarizing the details that have been publicly written so far, as well as some of the more on-point things I have heard people say, may help further the discussion, or perhaps bring it to some kind of conclusion.

What I actually found when pulling the various written details together is that they mostly describe what I would say is the ideal path forward without rewriting the world (of Wikibase), but it’s taken me a while to sit back, relax, and actually reread all the things that we have written over the years.

Read more