EntitySchema, and the entity flip-flop

This entry is part 5 of 7 in the series Wikibase Entities

The EntitySchema extension, previously called WikibaseSchema, has had an interesting life since its initial creation back in early 2019.

The main point of this story is that EntitySchema started its planned life as an Entity within a Wikibase. As the development team started work on an initial version, it flipped away from being an entity. And in continued development, it has slowly inched its way back towards perhaps being an Entity.

Background

As noted in the first ADR of the extension (which was actually written in 2023), the team initially decided to try to develop the extension entirely separately from Wikibase:

Although Entity Schemas relate to Wikibase entities by name and purpose, the implementation of the EntitySchema extension, at the time of this decision, is completely decoupled from Wikibase, and the concept of Entities that it adds to MediaWiki. Thus, a MediaWiki instance can theoretically operate with only the EntitySchema extension, and without the Wikibase extension installed.

Keeping EntitySchema separate from Wikibase, and the idea of an Entity it provides altogether, was a conscious decision to not marry its implementation to the inherent complexity of Wikibase itself. As well as an attempt to avoid overloading EntitySchema with unnecessary functionality so that its ongoing implementation could be done iteratively and in a more flexible, organic manner, to answer user’s needs as they are brought to us.

0001 Extend Entity Schema to support additional “traits” ADR

In a nutshell, this extension, and the developments and discussions about it over the past years (which are still happening today), are among the things that recently led me to write a series of blog posts about what I think an “entity” is, as well as looking at some other entities and the use of EntityDocument in the codebase.

Project kick-off

Internally within WMDE, the extension started off (having already been planned and discussed for some time) with a series of kick-off meetings in December 2018. The first was deemed to have left too many open questions, hence the second, follow-up meeting. Ultimately, a team formed around the creation of the extension, and this started further discussions.


Lexeme and MediaInfo, implementing EntityDocument

This entry is part 3 of 7 in the series Wikibase Entities

As we continue the journey, looking at Entity and EntityDocument within Wikibase, it is also useful to look at the third and fourth most widely used (at least within the Wikimedia space) entity types for Wikibase: Lexeme and MediaInfo.

Both of these entity types make use of EntityDocument, with none of the old assumptions that were baked into the Entity base class that used to exist.
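To make that idea a little more concrete, here is a rough sketch in Python (not the PHP that Wikibase is actually written in). Everything in it is an assumption for illustration: the method names, the fields on each type, and the describe helper are hypothetical and much simpler than the real code. The point is only that two quite different entity types can share a small document contract without inheriting anything from a common base class.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Protocol


class EntityDocument(Protocol):
    """Hypothetical, minimal stand-in for a shared entity document contract."""

    def get_id(self) -> Optional[str]: ...

    def get_type(self) -> str: ...

    def is_empty(self) -> bool: ...


@dataclass
class Lexeme:
    """Simplified illustration: only an ID and lemmas keyed by language code."""

    id: Optional[str] = None
    lemmas: Dict[str, str] = field(default_factory=dict)

    def get_id(self) -> Optional[str]:
        return self.id

    def get_type(self) -> str:
        return "lexeme"

    def is_empty(self) -> bool:
        return not self.lemmas


@dataclass
class MediaInfo:
    """Simplified illustration: only an ID and captions keyed by language code."""

    id: Optional[str] = None
    captions: Dict[str, str] = field(default_factory=dict)

    def get_id(self) -> Optional[str]:
        return self.id

    def get_type(self) -> str:
        return "mediainfo"

    def is_empty(self) -> bool:
        return not self.captions


def describe(doc: EntityDocument) -> str:
    """Works for any type that satisfies the contract, with no shared base class."""
    state = "empty" if doc.is_empty() else "has data"
    return f"{doc.get_type()} {doc.get_id()}: {state}"
```

Neither class here assumes labels, descriptions, aliases, or sitelinks; each brings only the data that makes sense for it, which is the kind of freedom the old base class did not leave room for.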

MediaWiki extensions

As these entity types were decoupled from the main body of Wikibase, they were developed as MediaWiki extensions: WikibaseMediaInfo (https://www.mediawiki.org/wiki/Extension:WikibaseMediaInfo) and WikibaseLexeme (https://www.mediawiki.org/wiki/Extension:WikibaseLexeme).

This was the easy choice at the time, and probably still makes perfect sense, as Wikibase itself is a MediaWiki extension, and there is already a common pattern of extensions extending extensions. This saves some work around coding an extension mechanism, though we should remember that ultimately the Wikibase codebase has free choice when it comes to choosing how it can be extended.


Wikibase, from Entity to EntityDocument

This entry is part 2 of 7 in the series Wikibase Entities

The term document has already come up a few times while discussing what a Wikibase entity is, and whether that should change (be that in name only, code, or structures), including in my first post of this series.

Looking at the very first definition of entity in the DuckDuckGo search that I performed 6 seconds ago, an entity is:

Something that exists as a particular and discrete unit.

The American Heritage® Dictionary of the English Language, 5th Edition

At the most basic level, it’s fairly straightforward to say that a Wikibase doesn’t hold the actual entities (such as a type of tree), rather data about said entities.

And in a nutshell, this data is collected within a document.

Image from “What is the semantic web” by ontotext.com

Quoting a few choice people again, before diving deeper into this topic…

The “entities” in the Wikibase base are not Entities. They are descriptions of entities. The entity is the thing in the world not the data we have about it, even though colloquially, we don’t make the distinction. But we have separate URIs for the thing and the description in the abstract and for specific renderings.
I think that’s important to mention when discussing what an entity “is”.

Daniel Kinzler in conversation, June 2024
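As a rough illustration of the separate URIs mentioned in the quote above, here is a small Python sketch. It follows the URI patterns that Wikidata uses as far as I am aware; the function names are hypothetical, the host is just an example, and other Wikibase installations will have their own hosts and may differ in the details.

```python
# Example host only; any other Wikibase would have its own.
HOST = "www.wikidata.org"


def concept_uri(entity_id: str, host: str = HOST) -> str:
    """URI for the thing in the world (the concept itself)."""
    return f"http://{host}/entity/{entity_id}"


def description_uri(entity_id: str, host: str = HOST) -> str:
    """URI for the description of the thing, in the abstract."""
    return f"https://{host}/wiki/Special:EntityData/{entity_id}"


def rendering_uri(entity_id: str, fmt: str, host: str = HOST) -> str:
    """URI for one specific rendering of the description, such as json or ttl."""
    return f"{description_uri(entity_id, host)}.{fmt}"


if __name__ == "__main__":
    print(concept_uri("Q42"))
    print(description_uri("Q42"))
    print(rendering_uri("Q42", "json"))
```

Three different identifiers come out for what we colloquially call one “entity”: the thing, the document about the thing, and a particular serialization of that document.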

The data model chose to use the term “Entity” for the top-level Thing/class in the hierarchy of the data model. But in reality, a better term would have been “Document” or “Record”. In general, the confusion is often due simply to folks that are more familiar with one of the domains than the other, between OOP Objects and Semantic Web Objects.

Thad Guidry in a comment, June 2024
