Review of the big Interwiki link migration

June 13, 2015 By addshore

Wikidata was launched on 30 October 2012 and was the first new project of the Wikimedia Foundation since 2006. The first phase enabled items to be created and filled with basic information: a label – a name or title, aliases – alternative terms for the label, a description, and links to articles about the topic in all the various language editions of Wikipedia.

On 14 January 2013, the Hungarian Wikipedia became the first to enable the provision of interlanguage links via Wikidata. This functionality was gradually enabled on more sites until, on 6 March, it was live on all Wikipedias.

The side bar that these interlanguage links are used to generate can be seen to the right.

Before Wikidata

Before Wikidata phase 1 came to be, every single language version of an article contained a list of interwiki links at the bottom, maintained manually or by bots. Because each list was maintained separately, the lists had a habit of conflicting with one another. An example of the list maintained in wikitext can be seen below for https://fr.wikipedia.org/wiki/La_Nuit_des_rois_(homonymie). The interwiki links are at the bottom of the content in the style [[LANG:TITLE]].
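To make the [[LANG:TITLE]] style concrete, here is a minimal Python sketch of how such links could be pulled out of a page's wikitext. It is not the code any of the bots used. The function name, the sample text and the set of language codes are all assumptions made for illustration.

```python
import re

# Matches [[PREFIX:TITLE]] with no pipe and no nested brackets.
# A link written as [[:en:Title]] (leading colon) is an inline link,
# not a sidebar link, and is deliberately not matched.
LINK_RE = re.compile(r"\[\[([^\[\]:|\n]+):([^\[\]|\n]+)\]\]")


def extract_interwiki_links(wikitext, language_codes):
    """Return (lang, title) pairs for links whose prefix is a known language code.

    language_codes is supplied by the caller (assumption: a real tool would
    take this list from the wiki's own configuration).
    """
    links = []
    for prefix, title in LINK_RE.findall(wikitext):
        prefix = prefix.strip().lower()
        if prefix in language_codes:
            links.append((prefix, title.strip()))
    return links


# Illustrative sample only, not the real content of the French page.
sample = (
    "Some article text.\n"
    "[[Category:Plays]]\n"
    "[[en:Twelfth Night (disambiguation)]]\n"
    "[[de:Was ihr wollt]]\n"
)
found = extract_interwiki_links(sample, {"en", "de", "it"})
```

The category link is skipped because "category" is not in the set of language codes passed in. Only the two language links are returned.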

As said above, this list is maintained on every article. By the looks of things this article has 7 other language versions, so there are 8 lists of links in total. If you want to add a new article in another language, you should really add that article to all 8 lists, meaning 8 additional edits. Alternatively, you could wait for bots to notice that the articles are related and add the links in all of the relevant places.
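The arithmetic behind this is simple but worth spelling out. The sketch below is a rough model with hypothetical function names, and it assumes every article links to every other language version.

```python
def edits_to_add_language_per_article_lists(existing_versions):
    """Old model: every existing article's list needs one edit to gain the new link."""
    return existing_versions


def links_stored_per_article_lists(versions):
    """Old model: each of the `versions` articles lists all of the others."""
    return versions * (versions - 1)


def links_stored_central_item(versions):
    """Central model: one shared item holds one link per language version."""
    return versions


extra_edits = edits_to_add_language_per_article_lists(8)
old_total = links_stored_per_article_lists(8)
central_total = links_stored_central_item(8)
```

For the 8 versions in the example, the per-article model stores 56 links spread across 8 pages, while a single shared item stores 8.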

As people generally didn’t care about these lists on other wikis, or even know about them, the bots took over. See below…

Above is the history for the Uzbek language article for January 31st, 31-yanvar. Only 2 of the 34 edits shown above were made by people; the rest were made by bots maintaining the list of interwiki links. This screenshot doesn’t even show the whole history for the article (see here).

The Migration through bots

The whole migration was essentially carried out by a fleet of bots. Each of these bots did one of the following things:

  • Find an interwiki link on an article and add it to Wikidata (leaving the interwiki link on the article).
  • Find an interwiki link on an article, add it to Wikidata and remove it from the article.
  • Find an interwiki link on an article that is already stored in Wikidata and remove it from the article.

My bot, Addbot, focused on the last of these, trying to remove redundant data. Due to the simplicity of this task, the bot made over 15 million edits during the switch-on of Phase 1.
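The third task in the list above can be sketched in a few lines of Python. This is not Addbot's actual code. It is a simplified illustration that assumes the centrally stored links are already available as a plain dictionary (here called `sitelinks`, a hypothetical name) mapping language code to title, and that a link is redundant only when its title matches the stored one.

```python
import re

# [[PREFIX:TITLE]] plus an optional trailing newline, so that removing a
# link also removes the line it sat on.
LINK_LINE_RE = re.compile(r"\[\[([^\[\]:|\n]+):([^\[\]|\n]+)\]\]\n?")


def normalise(title):
    """Treat underscores and spaces as equivalent, as wiki titles do."""
    return title.strip().replace("_", " ")


def remove_redundant_links(wikitext, sitelinks):
    """Drop interwiki links that match the centrally stored link for that language.

    Links that are absent from `sitelinks`, or that point at a different
    title, are left in place (a real bot would need to handle such
    conflicts separately).
    """
    def replace(match):
        lang = match.group(1).strip().lower()
        stored = sitelinks.get(lang)
        if stored is not None and normalise(stored) == normalise(match.group(2)):
            return ""
        return match.group(0)

    return LINK_LINE_RE.sub(replace, wikitext)


# Illustrative sample data only.
page = (
    "Body.\n\n"
    "[[Category:Plays]]\n"
    "[[en:Twelfth_Night]]\n"
    "[[de:Was ihr wollt]]\n"
    "[[it:Altro titolo]]\n"
)
stored_links = {"en": "Twelfth Night", "de": "Was ihr wollt", "it": "La dodicesima notte"}
cleaned = remove_redundant_links(page, stored_links)
```

In this sample the English and German links are removed because they agree with the stored data. The Italian link stays because its title differs, which is the kind of conflict the old per-article lists were prone to.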

An example interwiki link migration edit can be seen below or found here.

Other notable bots in the migration effort include: Legobot and EmausBot.

Migration conflict?

I leave a question mark above as there was not really any conflict with the migration, simply surprise. Bots helping to migrate data to Wikidata were regularly blocked on various projects in the first week or so. Although the Wikidata team, the community and the bot operators all tried to get the message out there about the migration before it happened, some people of course did not hear!

Most of the projects that the bots needed to edit fell under the Wikimedia global bot policy. Other projects needed individual consultation and approval, for example the German Wikipedia.

Impact

The biggest visible impact on all Wikimedia projects was the decrease in bot edits post migration. This was due to hundreds of interwiki bots no longer needing to run and the link lists being maintained in a single place.

The graph below, provided by stats.wikimedia.org, shows a large spike in edits in early 2013. This was the flood of edits removing interwiki links from articles where the links were already provided by Wikidata. The number of edits then drops after the main part of the migration.

If you focus on the green line you can see that post migration the number of bot edits fell to below half of the number prior to the migration.

 

Further reading

Even good bots fight: The case of Wikipedia – An investigation of conflicts between content-editing bots on Wikipedia that notes the lead-up to the Wikidata interwiki switch-on and the increase in bot activity up to that point.

Our data cover a period of the evolution of Wikipedia when bot activity was growing. Evidence suggests that this period suddenly ended in 2013 (http://stats.wikimedia.org/EN/PlotsPngEditHistoryTop.htm). This decline occurred because at the beginning of 2013 many language editions of Wikipedia started to provide inter-language links via Wikidata, which is a collaboratively edited knowledge base intended to support Wikipedia. Since our results were largely dictated by inter-language bots, we believe that the conflict we observed on Wikipedia no longer occurs today.