wikibase-docker, MediaWiki & Wikibase update

Today on the Wikibase Community User Group Telegram chat I noticed some people discussing issues with upgrading MediaWiki and Wikibase using the docker images provided for Wikibase.

As the wikibase-registry is currently only running MediaWiki 1.30, I should probably update it to 1.31, which is the next long-term support (LTS) release.

This blog post was written as I performed the update and is yet to be proofread, so expect some typos. I hope it can help those that were chatting on Telegram today.

Starting state

Documentation

There is a small amount of documentation in the wikibase docker image README file that talks about upgrading, but this simply tells you to run update.php.

Update.php has its own documentation on mediawiki.org.
None of this helps you piece everything together for the docker world.
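To make the "run update.php" step a little more concrete for the docker world, here is a minimal sketch in Python that only builds the command line. The service name `wikibase`, the MediaWiki root `/var/www/html` and the helper `build_update_command` are assumptions for illustration and are not taken from the README; check them against your own docker-compose file before running anything.

```python
import shlex


def build_update_command(service="wikibase", mediawiki_root="/var/www/html", quick=True):
    """Build a docker-compose command that runs MediaWiki's update.php.

    The service name and MediaWiki root are assumptions; adjust them to
    match your own docker-compose file and image layout.
    """
    cmd = [
        "docker-compose", "exec", service,
        "php", f"{mediawiki_root}/maintenance/update.php",
    ]
    if quick:
        # --quick skips the countdown that update.php shows before starting.
        cmd.append("--quick")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_update_command()))
```

Printing rather than executing is deliberate: the command should be run from the directory that holds the docker-compose file, and only after taking a database backup.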

Installation

The installation creation process is documented in this blog post, and some customization regarding LocalSettings and extensions was covered here.
The current state of the docker-compose file can be seen below with private details redacted.

This docker-compose file is found in /root/wikibase-registry on the server hosting the installation. (Yes, I know that’s a dumb place, but that’s not the point of this post.)

A firewall is blocking sharing between Windows and the containers – Docker

I recently encountered this error while trying to run one of my docker setups. I have encountered errors like this before, and it has always ended up being related to docker and sharing my drives with the Linux VM that actually runs my containers. Checking the shared drives menu of the docker UI, everything seemed …

2018 Year Review

This entry is part 2 of 7 in the series Year Reviews

12,374 page views (up from 7,992)
8,578 visitors (up from 5,250)
24 posts (up from 4)
28 comments (up from 13)

Top 5 posts by page views in 2018:
Guzzle 6 retry middleware (still #1)
Add Exif data back to Facebook images (up from #4)
Mislead by PHPUnit at() method (down from #2)
From 0 …

Wikidata Architecture Overview (diagrams)

Over the years, diagrams have appeared in a variety of forms covering various areas of the architecture of Wikidata. Now, as the current tech lead for Wikidata, it is my turn.

Wikidata has slowly become a more and more complex system, including multiple extensions, services and storage backends. Those of us that work with it on a day-to-day basis have a pretty good idea of the full system, but it can be challenging for others to get up to speed. Hence, diagrams!

All diagrams can currently be found on Wikimedia Commons using this search, and are released under CC-BY-SA 4.0. The layout of the diagrams with extra whitespace is intended to allow easy comparison of diagrams that feature the same elements.

High level overview

High level overview of the Wikidata architecture

This overview shows the Wikidata website, running MediaWiki with the Wikibase extension, in the left blue box. Various other extensions are also run, such as WikibaseLexeme, WikibaseQualityConstraints, and PropertySuggester.

Wikidata is accessed through a Varnish caching and load balancing layer provided by the WMF. Users, tools and any 3rd parties interact with Wikidata through this layer.

Off to the right are various other external services provided by the WMF. Hadoop, Hive, Oozie and Spark make up part of the WMF analytics cluster for creating pageview datasets. Graphite and Grafana provide live monitoring. There are many other general WMF services that are not listed in the diagram.

Finally, we have our semi-persistent and persistent storage, which is used directly by MediaWiki and Wikibase. This includes Memcached and Redis for caching, SQL (MariaDB) for primary metadata, Blazegraph for triples, Swift for files and Elasticsearch for search indexing.

Hacking vs Editing, Wikipedia & Declan Donnelly

On the 18th of November 2018 the Wikipedia article for Declan Donnelly was edited and vandalised. Vandalism isn’t new on Wikipedia; it happens to all sorts of articles throughout every day. A few minutes after the vandalism the change made its way to Twitter, and from there on to some media outlets such as thesun.co.uk and metro.co.uk the following day, with yet another scaremongering and misleading headline using the word “hack”.

“I’m A Celebrity fans hack Declan Donnelly by changing his height on Wikipedia after Holly Willoughby mocks him”

Hacking has nothing to do with it. One of the definitions of hacking is to “gain unauthorized access to data in a system or computer”. What actually happened is someone, somewhere, edited the article, which everyone is able and authorized to do. Editing is a feature, and it’s the main action that happens on Wikipedia.

The word ‘hack’ used to mean something, and hackers were known for their technical brilliance and creativity. Now, literally anything is a hack — anything — to the point where the term is meaningless, and should be retired.


The word ‘hack’ is meaningless and should be retired – 15 June 2018 by MATTHEW HUGHES

freenode #live – Bristol 2018

freenode #live is a “community-focused live event designed to build and strengthen relationships between Free and Open Source Software (FOSS) developers and users”. The 2018 event was held in Bristol, United Kingdom at We The Curious, with roughly 100-200 people attending (by my guesswork).

The event essentially had a single track of talks. The old IMAX theatre above the Aquarium was used as an auditorium, with various stalls for organizations set up outside. These stalls included KDE, Kiwi IRC, Private Internet Access and more.

Most of the talks were recorded and can be found on this YouTube playlist. Now for some of my main takeaways or points of note, most of which are IRC related, which might make sense as the conference is called freenode #live…

Wikidata Map October 2018

It has been another 6 months since my last post in the Wikidata Map series. In that time Wikidata has gained 4 million items, 1 property with the globe-coordinate data type (coordinates of geographic centre) and 1 million items with coordinates [1]. Each Wikidata item with a coordinate is represented on the map with a single dim pixel. Below you can see the areas of change between this new map and the one generated in March. To see the equivalent change in the previous 4 months take a look at the previous post.

Comparison of March 26th and October 1st maps in 2018
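The "single dim pixel per item" idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the code used to generate the maps: it assumes a plain equirectangular projection, and it assumes that items landing on the same pixel add up to a brighter value. The names `coord_to_pixel`, `render` and the `step` brightness increment are hypothetical.

```python
def coord_to_pixel(lat, lon, width, height):
    """Map latitude/longitude in degrees to (x, y) on an equirectangular image."""
    x = int((lon + 180.0) / 360.0 * width)
    y = int((90.0 - lat) / 180.0 * height)
    # Clamp so that lon=180 or lat=-90 still fall inside the image.
    return min(x, width - 1), min(y, height - 1)


def render(coords, width, height, step=16):
    """Return a height x width grid of brightness values between 0 and 255.

    Each coordinate adds one dim `step` to its pixel (an assumed value),
    so areas with many items end up brighter.
    """
    grid = [[0] * width for _ in range(height)]
    for lat, lon in coords:
        x, y = coord_to_pixel(lat, lon, width, height)
        grid[y][x] = min(255, grid[y][x] + step)
    return grid
```

With a grid like this for each date, a comparison image could be produced by subtracting one grid from the other pixel by pixel, though that step is again only an assumption about how such a comparison might be made.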

Quickly clearing out your Facebook advert ‘interests’

2020 EDIT: This solution is now packaged up as a nice browser extension. Click here to read the new post for details and links to the browser extension.

Over the past years, Facebook have had a few privacy related issues. First came the ‘scandal’ with Cambridge Analytica and more recently a bug (or series of bugs) that apparently affected 50 million accounts, allowing people’s access tokens to be stolen. Oh, and there was also the story about Facebook using your two-factor authentication phone number for targeting advertising.

With all of the goings on recently, as well as the new GDPR (General Data Protection Regulation) in the EU, Facebook have attempted to make the data that they have about an individual easier to see, understand and edit or remove. One such section of this data covers your “ad preferences”.

Screenshot of Facebook ad preferences

Wikibase extensions on Wikidata.org

Wikidata.org runs on MediaWiki with the Wikibase extension. But there is more to it than just that. The Wikibase extension itself is split into 3 different sections: Lib, Repo and Client. There are also 6 other extensions, all providing extra functionality to the site and its sisters. The extensions are also loaded on a different combination of Clients (such as Wikipedia) and the Repo itself (wikidata.org).

A diagram of current dependencies between the various Wikibase extensions running on wikidata.org
