Welcome to the 4th instalment of my Wikidata & Wikibase Tech lead Digest for August 2021. For previous instalments see Q1, Q2 & July.
🧑‍🤝‍🧑 Wikidata & Wikibase
The Wikidata Query Builder has been deployed. The Wikidata Query Builder provides a visual interface for building a simple Wikidata query. It is ideal for users with little or no experience in SPARQL.
The Wikibase fall release, which will be compatible with MediaWiki 1.36, will be made in the next month or so. At some point in the next 3-6 months we will likely also make a Wikibase 1.37 release. Keep an eye on the mailing list for these.
Work is about to wrap up on the next iteration of Wikibase Federated Properties which will enable the use of properties from multiple sources at once, such as Wikidata and also the local Wikibase.
The campsite worked on many other things. Most notably Ladsgroup spotted that SpamBlacklist was rendering content on Wikidata twice (phabricator). This fix resulted in a rather significant improvement in save times for Wikidata, and users of Wikibase and SpamBlacklist in combination.
Copying data between Phabricator and Google Sheets constantly would be a complete pain, especially as new tasks get added to the sheet every day and details of tasks can also change on Phabricator itself, such as titles, and statuses.
This is where Google Apps Script for Google Sheets and the Phabricator API come in, to automate this part of the process, at least in one direction.
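To give a rough idea of the Phabricator-to-sheet direction, here is a minimal sketch in Python rather than Apps Script. It assumes the Conduit `maniphest.search` method, looked up by task ID; the helper names (`build_search_params`, `tasks_to_rows`), the placeholder token, and the sample response are made up for illustration and are not taken from the actual script.

```python
from urllib.parse import urlencode


def build_search_params(api_token, task_ids):
    """Form-encode a maniphest.search request body for the given task IDs."""
    params = [("api.token", api_token)]
    for index, task_id in enumerate(task_ids):
        # Conduit expects list constraints as indexed form fields.
        params.append((f"constraints[ids][{index}]", str(task_id)))
    return urlencode(params)


def tasks_to_rows(response):
    """Turn a maniphest.search response into [task, title, status] rows."""
    rows = []
    for task in response["result"]["data"]:
        fields = task["fields"]
        rows.append([f"T{task['id']}", fields["name"], fields["status"]["name"]])
    return rows


# Hypothetical, trimmed-down response used only to exercise the helpers.
sample_response = {
    "result": {
        "data": [
            {"id": 7, "fields": {"name": "Example task", "status": {"name": "Open"}}},
        ]
    }
}

body = build_search_params("api-placeholder", [7])
rows = tasks_to_rows(sample_response)
```

In a real setup the encoded body would be POSTed to the Phabricator instance's `/api/maniphest.search` endpoint, and the resulting rows written back to the sheet so that titles and statuses stay current.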
Welcome to the third installment of my tech lead digest. In order to allow myself some extra space to write, and also to provide these public updates and thoughts on a more regular basis, this is now becoming a monthly digest.
I’m going to try to incorporate some of the goings-on from other Wikidata / Wikibase projects, as well as my regular digest and reading.
🧑‍🤝‍🧑 Wikidata & Wikibase
Work continues on the next iteration of Wikibase Federated Properties (phabricator board). This work will allow use of properties from multiple sources at once, such as Wikidata and also the local Wikibase.
Work also continues on the Wikidata Mismatch Finder (phabricator board) which is a tool to enable finding mismatches between Wikidata’s data and data in other databases.
The Campsite continues to work on a variety of smaller tasks. In the last month these included a new release of our Design System, dealing with the Query Builder security review and preparing for its deployment, and performing some maintenance on WBStack, including preparing for 1.36 and adding Elasticsearch. We also continue to support a university team in deploying a new Property Suggester algorithm (announcement coming soon) and to work towards tagging all edits made from the UI (T236893), as well as many other smaller tasks.
A recent Wikibase email list post on the topic of Wikibase and bulk imports caused me to write up a mostly human readable version of what happens, in what order, and when, for Wikibase action API edits, for the specific case of item creation.
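For context, an item creation through the action API boils down to a `wbeditentity` request with `new=item`. The following is a minimal sketch of the request parameters only; the helper name `build_item_creation_params` and the placeholder token are hypothetical, and a real request also needs a valid CSRF token and an authenticated session.

```python
import json


def build_item_creation_params(label, language, csrf_token):
    """Build the POST parameters for creating a new item with a single label."""
    data = {"labels": {language: {"language": language, "value": label}}}
    return {
        "action": "wbeditentity",
        "new": "item",
        # The entity data is sent as a JSON-encoded string.
        "data": json.dumps(data),
        "token": csrf_token,
        "format": "json",
    }


# Placeholder token for illustration; a real one comes from the API's token query.
params = build_item_creation_params("Example item", "en", "placeholder-token")
```

Every such request goes through the full edit path for a single entity, which is why bulk imports feel the cost of each step.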
There are a fair few areas that could be improved and optimized for a bulk import use case in the existing APIs and code. Some are actively being worked on today (T285987). Some are on the roadmap, such as the new REST APIs for Wikibase. And others are out there, waiting to be considered.
This post is written looking at Wikibase and MediaWiki 1.36, with links to GitHub for code references. Some areas may be glossed over or even slightly inaccurate, so take everything here with a pinch of salt.
Reach out to me on Twitter if you have questions or fancy another deep dive.
If you’re working with legacy code, chances are you’ve inherited some technical debt. In fact, if you’re working with code, chances are you’re already surrounded by technical debt of varying sizes, at least by some measures.
Some believe that technical debt is something to be avoided, and that technical debt that exists is a dirty secret that should be hidden. The reality is that technical debt is a fact of life when code iteratively changes to deliver product solutions.
Striving for programming perfection is great in principle, but ultimately code is meant to deliver features, and there is always a good, better and best approach, with many other variations in-between.
Over the last year at Wikimedia Deutschland we have worked on refining how we record, triage, prioritize and tackle technical debt within the Wikidata and Wikibase product family.
There are many thoughts out there about how to track, tackle, and prioritize technical debt. This post is meant to represent the current status of the Wikidata / Wikibase team. Hopefully you find this useful.
Browser tests came up as a hot topic. A deep dive and some central analysis seemingly correlated the failures with “memory compaction” on VMs. This is out of our control, so we increased timeouts in some key areas.
MediaWiki-Docker-Dev (or MWDD) is a development environment for MediaWiki, based on Docker and docker-compose. It was created back in 2017 at the Wikimedia Hackathon in Vienna, where it had a slightly different feature set and focus. (Original Slides).
Since inception the git repo now has 180 commits from 20 authors over the course of 4 years, of which 7 have been WMF employees and 11 have been WMDE employees, though the project has had no “official” support from either organization. Counting forks we have 12 WMF employees and 16 WMDE employees.
Due to the nature of the project (being set up from a git clone), it is quite hard to figure out how many users it has. We can infer that in the last year, thanks to a custom image that has been required, it has been set up roughly 1200 times, by checking the pull stats of silvanwmde/nginx-proxy.