It's a blog

Category: Posts (Page 1 of 15)

All Posts

What happens in Wikibase when you make a new Item?

A recent Wikibase email list post on the topic of Wikibase and bulk imports caused me to write up a mostly human readable version of what happens, in what order, and when, for Wikibase action API edits, for the specific case of item creation.

There are a fair few areas that could be improved and optimized for a bulk import use case in the existing APIs and code. Some of which are actively being worked on today (T285987). Some of which are on the roadmap, such as the new REST APIs for Wikibase. And others which are out there, waiting to be considered.

This post is is written looking at Wikibase and MediaWiki 1.36 with links to Github for code references. Same areas may be glossed over or even slightly inaccurate, so take everything here with a pinch of salt.

Reach out to me on Twitter if you have questions or fancy another deep dive.

Continue reading

A first look at Wikidata through Github Copilot

I was added to the Github Copilot preview in the past days, and the first thing I tested out was code suggestions around Wikidata.

Copilot allows you to start writing code, and have a model trained on public Github code suggest block of code that you might want to use.

For example, if you start defining a method called onePlusOne then it might suggest the body return 1+1;.

function onePlusOne() { return 1 + 1; }
Code language: JavaScript (javascript)

More targeted toward Wikidata, if you start defining a function called whatIsWikidata then it might suggest return "Wikidata";

function whatIsWikidata(){ return "Wikidata"; }
Code language: JavaScript (javascript)

In some cases copilot suggests multiple code blocks. Some of these seem useful, others less so.

Continue reading

Tackling Technical Debt, big and small, in Wikidata and Wikibase

If you’re working with legacy code, chances are you’ve inherited some technical debt. Infact, if you’re working with code, chances you’re already surrounded by technical debt of varying sizes, at least by some measures.

Some believe that technical debt is something to be avoided, and that technical debt that exists is a dirty secret that should be hidden. The reality is that technical debt is a fact of life when code iteratively changes to deliver product solutions.

Striving for programming perfection is great in principle, but ultimately code is meant to deliver features, and there is always a good, better and best approach, with many other variations in-between.

Over the last year at Wikimedia Deutschland we have worked on refining how we record, triage, prioritize and tackle technical debt within the Wikidata and Wikibase product family.

There are many thoughts out there about how to track, tackle, and prioritize technical debt. This post is meant to represent the current status of the Wikidata / Wikibase team. Hopefully you find this useful.

Continue reading

Tech Lead Digest – Q2 2021

This is the second installment of my tech lead digest digest with my tech lead hat on for the Wikidata Wikibase team.

This is a digest of my internal digest emails. These contain lots of links to reading, podcasts and general goings on that could be useful to a wider audience.

🧑‍🤝‍🧑Wikidata & Wikibase

Continue reading

Tech Lead Digest – Q1 2021

At some point last year I started sending a weekly internal digest to the Wikidata Wikibase team with my tech lead hat on.

The emails are internal only but contain lots of links to reading, podcasts and general goings on that could be useful to everyone.

So here is my first Wikidata Wikibase tech lead digest digest!

🧑‍🤝‍🧑Wikidata & Wikibase

Continue reading

mediawiki-docker-dev, a history

MediaWiki-Docker-Dev (or MWDD) is a development environment for MediaWiki, based on Docker and docker-compose. It was created back in 2017 at the Wikimedia Hackathon in Vienna where it had a slightly difference feature set and focus. (Original Slides).

Since inception the git repo now has 180 commits from 20 authors over the course of 4 years, of which 7 have been WMF employees and 11 have been WMDE employees, though the project has had no “official” support from either organization. Counting forks we have 12 WMF employees and 16 WMDE employees.

Due to the nature of the project (being setup from a git clone), it is quite hard to figure out how many users it has. We can infer that in the last year, thanks to a custom image that has been required, it has been set up roughly 1200 times, by checking the pull stats of silvanwmde/nginx-proxy.

Continue reading

Github repo settings sync, using the Github cli

The number of Github repositories that I end up maintaining in one way or another ends up growing week by week. And keeping all of the descriptions and settings up to date in sync can be painful todo by hand.

A little while ago I migrated my addwiki project to use a monorepo, and thus needed to bulk update all of the github repository descriptions. While doing so I made use of the github cli and created a single bash script to let me configure all of the repositories at once.

Assuming you already have the github cli install and configured getting started with this is easy.

The command

The below command is one of many in my bash script for repo configuration. This sets a description, homepage and various other flags that I want to be consistent across repositories.

gh api --method PATCH repos/addwiki/addwiki \ --field description='Monorepo containing all addwiki libraries, packages and applications'\ --field homepage=''\ --field has_issues='true'\ --field has_projects='false'\ --field has_wiki='false'
Code language: JavaScript (javascript)
Continue reading

Resizing a qemu image root disk partition

Recently I found myself altering some virtual images for loading onto a qemu machine. I wanted to increase the disk space on the root partition, but couldn’t find any straightforward guides. So here is a little guide for future me, and anyone else.

Install libguestfs-tools

libguestfs is a set of tools for accessing and modifying virtual machine (VM) disk images. You can find details in the docs.

apt-get install libguestfs-tools
Code language: JavaScript (javascript)

Create a new resized image

First you need to create a new empty image of the size that you want.

For me that is 20GB

truncate -s 20G ./out.img

Then use virt-resize to expand the existing disk to fill all of the space in the new image that we created. (virt-resize docs)

virt-resize --expand /dev/sda1 ./vm.img ./out.img

Verify it worked

libguestfs-tools also provides a way to view file system information by only using the image file. (virt-filesystems docs)

virt-filesystems --long --parts --blkdevs -h -a ./out.img

You’ll see something like this:

Name Type MBR Size Parent /dev/sda1 partition 83 20G /dev/sda /dev/sda device - 20G -
Code language: PHP (php)

WBStack setting changes, Federated properties, Wikidata entity mapping & more

During the first 3 months of 2021, some Wikimedia Deutschland engineers, from the Wikidata / Wikibase team, spent some time working on WBStack as part of an effort to explore the WBaaS (Wikibase as a service) topic during the year, as outlined by the development plan.

We want to make it easier for non-Wikimedia projects to set up Wikibase for the first time and to evaluate the viability of Wikibase as a Service.

Wikibase 2021 Development plan

This has lead to a few new Wikibase features being exposed through the WBStack dashboard for sites that run on the platform. These features are primarily features developed by the Wikibase team in 2020 and 2021. The work also brought some other quality of life improvements for the settings pages.

Here is a quick rundown of what’s new and improved.

Continue reading

WBStack Infrastructure

WBStack is a platform allowing shared scalable hosting of Wikibase and surrounding services.

A year ago I made an initial post covering the state of WBStack infrastructure. Since then some things have changed, and I have also had more time to create a clear diagram. So it is time for the 2021 edition.

WBStack currently runs on a single Google Cloud Kubernetes Engine cluster, now made up of 3 virtual machines, one e2-standard-2 and two e2-medium. This results in roughly 4 vCPUs and 12GB of allocatable memory.

Continue reading
« Older posts

© 2021 Addshore

Theme by Anders NorénUp ↑