Tech Lead Digest – Q2 2021
This is the second installment of my tech lead digest for the Wikidata Wikibase team, written with my tech lead hat on. It is a digest of my internal digest emails, which contain lots of links to reading, podcasts and general goings on that could be useful to a wider audience. 🧑‍🤝‍🧑 Wikidata & Wikibase Federated…
Tech Lead Digest – Q1 2021
At some point last year I started sending a weekly internal digest to the Wikidata Wikibase team with my tech lead hat on. The emails are internal only but contain lots of links to reading, podcasts and general goings on that could be useful to everyone. So here is my first Wikidata Wikibase tech lead…
WBStack setting changes, Federated properties, Wikidata entity mapping & more
During the first 3 months of 2021, some Wikimedia Deutschland engineers from the Wikidata / Wikibase team spent some time working on WBStack, as part of an effort to explore the WBaaS (Wikibase as a service) topic during the year, as outlined by the development plan. We want to make it easier for non-Wikimedia projects…
Twitter bot powered by GitHub Actions (WikidataMeter)
Recently two new Twitter bots appeared in my feed, fullyjabbed & fullyjabbedUK, created by iamdanw and powered entirely by GitHub Actions (code). I have been thinking about writing a Twitter bot for some time and decided to copy this pattern, running a cron-based Twitter bot on GitHub Actions, with an added bit of free…
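The pattern itself is tiny: GitHub Actions provides the cron scheduling, and the scheduled job runs a short script that composes and posts a tweet. A rough sketch of the script half, assuming tweepy and credentials stored as repository secrets (the variable names here are mine, not the bots'):

```python
import os

import tweepy

# Credentials come from GitHub Actions repository secrets, exposed to the
# job as environment variables (these names are assumptions for the sketch).
client = tweepy.Client(
    consumer_key=os.environ["TWITTER_CONSUMER_KEY"],
    consumer_secret=os.environ["TWITTER_CONSUMER_SECRET"],
    access_token=os.environ["TWITTER_ACCESS_TOKEN"],
    access_token_secret=os.environ["TWITTER_ACCESS_TOKEN_SECRET"],
)

# Fetch or compute whatever the bot reports, then post it.
message = "Example status generated on a schedule"
client.create_tweet(text=message)
```

The workflow file then only needs a `schedule` trigger with a cron expression that runs this script, so nothing has to stay running between tweets.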
Testing WDQS Blazegraph data load performance
Toward the end of 2020 I spent some time black-box testing data load times for WDQS and Blazegraph, to try to find out which setting tweaks might make things faster. I didn’t come to any major conclusions as part of this effort, but will write up the approach and data nonetheless in case it is…
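The harness for this kind of black-box test can be tiny: load the same chunk of RDF under each configuration and time it. A minimal sketch, assuming a locally running Blazegraph/WDQS instance (the endpoint path and accepted content types vary by setup, so treat these as placeholders):

```python
import time

import requests

# SPARQL endpoint of a locally running Blazegraph/WDQS instance
# (an assumption for this sketch; adjust namespace and port to your setup).
SPARQL_ENDPOINT = "http://localhost:9999/bigdata/namespace/wdq/sparql"

def load_ttl(path: str) -> float:
    """POST one Turtle file to Blazegraph and return the elapsed seconds."""
    with open(path, "rb") as f:
        data = f.read()
    start = time.monotonic()
    response = requests.post(
        SPARQL_ENDPOINT,
        data=data,
        headers={"Content-Type": "text/turtle"},
    )
    response.raise_for_status()
    return time.monotonic() - start

print(f"Load took {load_ttl('chunk-000.ttl'):.1f}s")
```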
Faster munging for the Wikidata Query Service using Hadoop
The Wikidata query service is a public SPARQL endpoint for querying all of the data contained within Wikidata. In a previous blog post I walked through how to set up a complete copy of this query service. One of the steps in this process is the munge step. This performs some pre-processing on the RDF…
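As a taster, querying the public endpoint from code looks something like this (a minimal sketch in Python; the endpoint URL is the real public one, and the query simply fetches five items that are instances of human, Q5):

```python
import requests

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = "SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 5"

response = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "wdqs-example/0.1 (demo script)"},
)
response.raise_for_status()
for binding in response.json()["results"]["bindings"]:
    print(binding["item"]["value"])
```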
How can I get data on all the dams in the world? Use Wikidata
During my first week at Newspeak House, while explaining Wikidata and Wikibase to some folks on the terrace, the topic of dams came up in a discussion of an old project that someone had worked on. Back in the day, collecting information about dams would have been quite an effort, compiling a bunch of different data from…
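These days it is a single SPARQL query. A sketch of the sort of query involved (Q12323 is, to the best of my knowledge, Wikidata's item for dam; it can be pasted into query.wikidata.org or sent to the endpoint as in the snippet above):

```python
# Items that are an instance of dam (Q12323) or of any of its subclasses,
# with English labels where available.
DAMS_QUERY = """
SELECT ?dam ?damLabel WHERE {
  ?dam wdt:P31/wdt:P279* wd:Q12323 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""
```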
Creating new Wikidata items with OpenRefine and Quickstatements
Following on from my blog post on using OpenRefine for the first time, I continued my journey to fill Wikidata with all of the Tors on Dartmoor. This post assumes you already have some knowledge of Wikidata and Quickstatements, and have OpenRefine set up. Note: If you are having problems with the reconciliation service it might be worth…
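For a feel of what ends up being sent to Quickstatements, here is a hedged sketch of generating the version 1 (tab-separated) commands that create a new item; the label and the Qxxx target are placeholders, not my actual Tor batch:

```python
# Build QuickStatements version 1 commands: CREATE makes a new item, and
# each following LAST line adds a label or statement to that item.
# "Qxxx" stands in for the real "instance of" target; it is illustrative only.
tors = ["Example Tor"]

lines = []
for name in tors:
    lines.append("CREATE")
    lines.append(f'LAST\tLen\t"{name}"')  # English label
    lines.append("LAST\tP31\tQxxx")       # instance of <some class>
print("\n".join(lines))
```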
Using OpenRefine with Wikidata for the first time
I have long known about OpenRefine (previously Google Refine), a tool for working with data, manipulating and cleaning it. As of version 3.0 (May 2018), OpenRefine has included a Wikidata extension, allowing for extra reconciliation features and also editing of Wikidata directly (as far as I understand it). You can find some documentation on this…
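For the curious, the reconciliation service OpenRefine talks to is itself a simple HTTP API; here is a minimal sketch of calling it directly (the endpoint URL is my assumption, as the service has moved over the years, and "Haytor" is just an example search):

```python
import json

import requests

# Wikidata reconciliation endpoint of the kind OpenRefine uses
# (URL is an assumption; the service has been hosted in several places).
API = "https://wikidata.reconci.link/en/api"

queries = {"q0": {"query": "Haytor", "limit": 3}}
response = requests.get(API, params={"queries": json.dumps(queries)})
response.raise_for_status()
for candidate in response.json()["q0"]["result"]:
    print(candidate["id"], candidate["name"], candidate["score"])
```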
Wikidata Map May – November 2019
It’s time for another blog post in my Wikidata map series, this time comparing the item maps generated on the 13th May 2019 and 11th November 2019 (roughly 6 months apart). I’ll again be using Resemble.js to generate a difference image highlighting changed areas in pink, and break down the areas that have had the…
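Resemble.js itself is JavaScript, but the pink-highlight idea is straightforward to sketch; here is an equivalent in Python with Pillow (a stand-in for illustration, not the code behind the maps, and the filenames are placeholders):

```python
from PIL import Image, ImageChops

# Load the two generated map renders (filenames are placeholders).
before = Image.open("map-2019-05-13.png").convert("RGB")
after = Image.open("map-2019-11-11.png").convert("RGB")

# Greyscale image of per-pixel differences, then a hard mask of changed pixels.
diff = ImageChops.difference(before, after).convert("L")
mask = diff.point(lambda value: 255 if value > 0 else 0)

# Paint the changed pixels pink on top of the newer map.
pink = Image.new("RGB", after.size, (255, 105, 180))
Image.composite(pink, after, mask).save("map-diff.png")
```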