It’s been another 9 months since my last blog post covering the Wikidata-generated geolocation maps that I have been tending to for a few years now. Writing this from a hammock, let’s see what has noticeably changed in the last 9 months using a visual diff and my pretty reasonable eyes.
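A visual diff of two map renders can be as simple as a per-pixel comparison. The sketch below is a minimal illustration, assuming each map is a same-sized grid of brightness values; the name `diff_maps` and the list-of-lists representation are hypothetical and not taken from the actual map tooling.

```python
def diff_maps(old, new):
    """Return a grid holding how much brighter each pixel is in `new`.

    Both maps are assumed to be lists of rows of integer brightness values
    with identical dimensions. Pixels that did not get brighter become 0.
    """
    return [
        [max(0, new_px - old_px) for old_px, new_px in zip(old_row, new_row)]
        for old_row, new_row in zip(old, new)
    ]


if __name__ == "__main__":
    old_map = [[0, 10], [5, 5]]
    new_map = [[0, 30], [5, 0]]
    print(diff_maps(old_map, new_map))
```

Only increases are kept here because the maps mostly grow over time; a diff that also shows removals would need a signed value or a second grid.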
In 2016 I wrote a blog post with this exact title when moving all of my pictures from Facebook to Google Photos. I wrote a hacky little script which met my needs and added EXIF data from an HTML Facebook data dump back to the images that came along with it.
I recently updated the Wikibase Registry from MediaWiki version 1.30 to 1.31 and described the process in a recent post, so if you want to see what the current setup and docker-compose file look like, head there.
In summary, the Wikibase Registry uses:
The wikibase/wikibase:1.31-bundle image from Docker Hub
The installation creation process is documented in this blog post, and some customization regarding LocalSettings and extensions was covered here. The current state of the docker-compose file can be seen below with private details redacted.
This docker-compose file is found in /root/wikibase-registry on the server hosting the installation. (Yes, I know that’s a dumb place, but that’s not the point of this post.)
I recently encountered this error while trying to run one of my docker setups.
ERROR: for mediawiki-docker-dev_db-slave_1 Cannot start service db-slave: b'OCI runtime create failed: container_linux.go:348: starting container process caused "exec: \"/tmp/mwdd/entrypoint.sh\": stat /tmp/mwdd/entrypoint.sh: no such file or directory": unknown'
I have encountered errors like this before, and they have always ended up being related to Docker and sharing my drives with the Linux VM that actually runs my containers.
When I checked the shared drives menu of the Docker UI, everything seemed to be fine.
However, when I removed the drive share and re-shared the drive, I got an error message saying that there was a “Firewall detected” and that “A firewall is blocking file Sharing between Windows and the containers. See documentation for more info”.
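Docker for Windows shares drives with the Linux VM over SMB, which uses TCP port 445, so one quick way to see whether a firewall is in the way is to try opening a connection to that port. The sketch below is only an illustration: the helper name `smb_reachable` is hypothetical, and the address `10.0.75.1` is an assumption based on the default DockerNAT address of Docker for Windows releases from around that time, so check your own network settings before relying on it.

```python
import socket


def smb_reachable(host, port=445, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed default DockerNAT address; adjust to match your setup.
    host = "10.0.75.1"
    state = "reachable" if smb_reachable(host) else "blocked or unreachable"
    print(f"SMB port 445 on {host} is {state}")
```

A failed connection does not prove that a firewall is the cause, but a successful one suggests the problem lies elsewhere.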
Over the years, diagrams have appeared in a variety of forms covering various areas of the architecture of Wikidata. Now, as the current tech lead for Wikidata, it is my turn.
Wikidata has slowly become a more and more complex system, including multiple extensions, services and storage backends. Those of us who work with it on a day-to-day basis have a pretty good idea of the full system, but it can be challenging for others to get up to speed. Hence, diagrams!
All diagrams can currently be found on Wikimedia Commons using this search, and are released under CC-BY-SA 4.0. The layout of the diagrams with extra whitespace is intended to allow easy comparison of diagrams that feature the same elements.
Wikidata is accessed through a Varnish caching and load balancing layer provided by the WMF. Users, tools and any 3rd parties interact with Wikidata through this layer.
Off to the right are various other external services provided by the WMF. Hadoop, Hive, Oozie and Spark make up part of the WMF analytics cluster for creating pageview datasets. Graphite and Grafana provide live monitoring. There are many other general WMF services that are not listed in the diagram.
Finally, we have our semi-persistent and persistent storage, which is used directly by MediaWiki and Wikibase. This includes Memcached and Redis for caching, SQL (MariaDB) for primary metadata, Blazegraph for triples, Swift for files and Elasticsearch for search indexing.
Hacking has nothing to do with it. One of the definitions of hacking is to “gain unauthorized access to data in a system or computer”. What actually happened is that someone, somewhere, edited the article, which everyone is able and authorized to do. Editing is a feature, and it’s the main action that happens on Wikipedia.
The word ‘hack’ used to mean something, and hackers were known for their technical brilliance and creativity. Now, literally anything is a hack — anything — to the point where the term is meaningless, and should be retired.
The event essentially had a single track of talks. The old IMAX theatre above the Aquarium was used as an auditorium, with various stalls for organizations set up outside. These stalls included KDE, Kiwi IRC, Private Internet Access and more.
Most of the talks were recorded and can be found on this YouTube playlist. Now for some of my main takeaways or points of note, most of which are IRC related, which might make sense as the conference is called freenode #live…
It has been another 6 months since my last post in the Wikidata Map series. In that time Wikidata has gained 4 million items, 1 property with the globe-coordinate data type (coordinates of geographic centre) and 1 million items with coordinates. Each Wikidata item with a coordinate is represented on the map with a single dim pixel. Below you can see the areas of change between this new map and the one generated in March. To see the equivalent change in the previous 4 months take a look at the previous post.
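The "single dim pixel per item" idea can be sketched in a few lines. This is a rough illustration rather than the code that generates the real maps: the equirectangular projection, the grid size and the brightness `step` are all assumptions, and the function names are hypothetical.

```python
def coord_to_pixel(lat, lon, width, height):
    """Map a latitude/longitude pair onto an equirectangular pixel grid."""
    x = int((lon + 180.0) / 360.0 * width)
    y = int((90.0 - lat) / 180.0 * height)
    # Clamp so that lon=180 and lat=-90 stay inside the grid.
    return min(x, width - 1), min(y, height - 1)


def render(coords, width=360, height=180, step=32):
    """Return a height x width grid of brightness values capped at 255.

    Every coordinate adds one dim increment (`step`) to its pixel, so areas
    with many items end up brighter than areas with few.
    """
    grid = [[0] * width for _ in range(height)]
    for lat, lon in coords:
        x, y = coord_to_pixel(lat, lon, width, height)
        grid[y][x] = min(255, grid[y][x] + step)
    return grid


if __name__ == "__main__":
    # Two items at the same place and one elsewhere.
    grid = render([(0.0, 0.0), (0.0, 0.0), (51.5, -0.1)])
    print(grid[90][180])
```

Accumulating brightness per pixel is what makes densely covered regions stand out, even though each individual item contributes only a faint dot.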