2021 Year Review

This entry is part 5 of 7 in the series Year Reviews

I have been doing some sort of year review since 2017, covering projects I work on, this blog, GitHub and Twitter. You can find the past posts using the tag #year-review.

2021 has seen the continuation of COVID-19, with the Omicron variant making its first appearance in the last few months. I live in the UK, which currently has the highest reported numbers of this new variant, as well as high COVID levels overall. Who knows what 2022 will bring.

Blogging

At the time of writing, I have written 43 blog posts this year, which will likely leave my end-of-year total at ~47. That’s double what I wrote in 2020. In December I attempted to write a blog post a day, which turned out to be too much, but it looks like I should be able to hit a secondary target of a post every 2 days, so ~15 in December.

  • ~40,000 page views, down from 47,664 (~16% decrease, but trending up looking across more years)
  • ~30,000+ visitors, down from 32,197 (~7% decrease)
  • ~47 posts, beating my previous record of 25 in 2018
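The percentage changes in the list above are plain relative differences. A minimal sketch for checking them, assuming the rounded 2021 estimates quoted in the list (so the results are approximate, and the function name is just illustrative):

```python
def percent_change(previous: float, current: float) -> float:
    """Relative change from previous to current, as a percentage."""
    return (current - previous) / previous * 100


# 2020 totals and rounded 2021 estimates, as quoted in the list above.
page_views_change = percent_change(47_664, 40_000)
visitors_change = percent_change(32_197, 30_000)

print(f"Page views: {page_views_change:.1f}%")
print(f"Visitors: {visitors_change:.1f}%")
```

Negative values indicate a decrease. The final figures will shift a little once the end-of-year totals replace the estimates.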

It’s really easy to see how a single post can skew growth year by year. That post last year was Quickly clearing out your Facebook advert ‘interests’ which alone brought in 10k views, this year reducing to 3k. The peak post of 2020 only has around 6k views.

I quite like looking at most viewed posts so I can get some sort of gauge on what I should perhaps be writing more about, or projects that generally interest people. Development on Windows is the theme of the first 2 posts in my top 10. The majority of the rest of the top 10 are short posts covering problems I have encountered and solved over the years. And, also lasagnes… (more on that below)

Read more

Wikidata user and project talk page connection graph

Talk pages are a pretty key part of how wikis have worked over the years. Realtime chat apps and services are probably changing this dynamic somewhat, but they are still used, and also most of the history of these pages is still recorded.

I started up an IPython Notebook to try and take a look at some of the connections between different users on Wikidata over the years. Below you’ll find a few representations of these connections, as well as notable things I spotted along the way, the generating code, SQL query and more!

The data

MediaWiki maintains links tables for all pages, so getting all of the current links out of Wikidata is very easy. I made use of the Wikimedia Cloud Quarry service to run this query and host a CSV of the results.

SELECT
  SUBSTRING_INDEX(page_title, '/', 1) AS t1,
  pl_from_namespace AS t1ns,
  SUBSTRING_INDEX(pl_title, '/', 1) AS t2,
  pl_namespace AS t2ns
FROM pagelinks, page
WHERE pl_namespace IN (3,5) AND pl_from_namespace IN (3,5)
AND page_id = pl_from AND page_title != pl_title
GROUP BY t1, t2

I then loaded this data directly into an IPython Notebook and did some cleaning, such as removing all IP addresses. I then spent quite some time applying more filtering and twiddling knobs to try and get some graphics out that are easy to look at. The first attempts looked like solid blobs as you can see in this tweet.
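As a rough illustration of the IP address cleaning step (this is not the code from the Notebook), here is a minimal sketch using only the Python standard library. It assumes the CSV columns are named after the query aliases (t1, t1ns, t2, t2ns), and the sample rows and helper names are made up for the example:

```python
import csv
import io
import ipaddress

# Hypothetical sample rows in the shape the query returns (t1, t1ns, t2, t2ns).
SAMPLE_CSV = """t1,t1ns,t2,t2ns
Alice,3,Bob,3
192.0.2.10,3,Alice,3
Bob,3,Project_chat,5
2001:DB8:0:0:0:0:0:1,3,Bob,3
"""


def is_ip(title: str) -> bool:
    """Return True if a page title parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(title)
        return True
    except ValueError:
        return False


def clean_edges(rows):
    """Keep only links where neither end is an IP address talk page."""
    edges = []
    for row in rows:
        if is_ip(row["t1"]) or is_ip(row["t2"]):
            continue
        edges.append((row["t1"], row["t2"]))
    return edges


edges = clean_edges(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(edges)
```

The resulting list of (from, to) pairs is a plain edge list, which most graph libraries accept directly for further filtering and drawing.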

You can find a copy of the Notebook on notebooksharing.space.

Read more

Most liked Wikibase tweets

Wikidata is 9, and Wikibase, the software that powers it, is thus also about 9! Twitter has been around for the entire Wikibase lifespan. So let’s take a look back through time at some of the most liked Wikibase tweets (according to Twitter free search) since creation.

Want this list but for Wikidata? Check out my Wikidata focused post!

2021, @annechardo 113 💕s

My thesis on “Managing Archival Authority Data in the Data Web” is online! The first part (from #maintenance to #MetadataDebt via #RiC ) is completed by a case study using #Wikibase :

@annechardo, Twitter Translate
https://twitter.com/annechardo/status/1348650172268617731

Read more

Most liked Wikidata tweets

Wikidata is 9, and Twitter has been around for the entire Wikidata lifespan. So let’s take a look back through time at some of the most liked Wikidata tweets (according to Twitter free search) since creation.

Personally, I think it’s rather cool that half of the tweets are in languages other than English!

Want this list but for Wikibase (the software that runs Wikidata)? Check out my Wikibase focused post!

2021, @wikidata 412 💕s

Announcement of the new Wikidata Query Builder by @wikidata!

Read more

Finding the most liked tweets for a topic in a year

I’m nearly halfway through writing a month of daily blog posts. I wanted to write some posts covering the history of both Wikidata and Wikibase on Twitter. Being a developer, I looked for APIs, but it seems tweets are not as accessible as they once were.

This is a short write-up of my adventure, covering APIs, scraping thoughts, and finally, my working solution, albeit with a quirk or 2 that I can’t explain.

Read more

A PHP library for jsonstorage.net

I first heard about jsonstorage.net when searching around for a quick place to persist some data while writing a Twitter bot.

jsonstorage.net provides simple JSON storage, with a free tier that can be very useful for small personal projects. The REST API is super simple: GET, POST, PUT, DELETE etc. You can have either public or private JSON objects.

During my first use of this service, I remember writing some random code in PHP to interact with the service and deal with authentication. I remember thinking it would be nice if there were a small library that I could grab off the shelf for this, and it would have saved me some minutes… Today is the day I write that simple library!

Read more

SMWCon 2021, Development environments using containers

SMWCon 2021 is happening as I write this post. I was invited to give a short talk as part of a MediaWiki and Docker workshop organized by Cindy Cicalese on day 2. As I am writing a month of blog posts I’m going to turn my slides into a more digestible and searchable online blog post.

The original slides can still be found on Google Slides, and when the conference recording is up you should find it on the associated event page.

Disclaimer

Read more

WBStack in 2021 and the future

This entry is part 9 of 12 in the series WBStack

2021 is nearly over, WBStack is over 2 years old (initially announced back in 2019), and has continued to grow. The future is bright with wikibase.cloud looking to be launched by Wikimedia Deutschland in the new year (announced at WikidataCon 2021), and as a result, the code under the surface has had the most eyes on it since its inception.

Let’s take a closer look at some of the developments this year, and the progress that WBStack has made.

Current Usage

WBStack now has 148 individual user accounts registered on the platform that enabled wiki creation. These accounts have created 510 wikis with Wikibase installed since the platform was initially put online, and 335 of those wikis are still currently published (the other 175 have been deleted).

                   Nov 2019   April 2020   May 2020   Nov 2021      Dec 2021
Platform Users     38         70           76         139           148
Non deleted Wikis  -          -            145        306           335
All Wikis          65         178          226        476           510
Pages              -          -            -          1.4 million   1.9 million
Edits              -          200,000      295,000    4.1 million   4.6 million

Read more

Tech Lead Digest – Q3/4 2021

This entry is part 5 of 5 in the series Tech Lead Digest (wmde)

It’s time for the 5th instalment of my tech lead digest posts. I switched to monthly for 2 months, but decided to move back to roughly quarterly. You can find the other digests by checking out the series.

🧑‍🤝‍🧑Wikidata & Wikibase

The biggest event of note in the past months was WikidataCon 2021, which took place toward the end of October 2021. Spread over 3 days, the event celebrated Wikidata’s 9th birthday. We are still awaiting the report from the event to know how many folks participated, and recordings of talks will likely not be available until early 2022, at which point I’ll try to write another blog post.

Just before WikidataCon the updated strategy for Linked Open Data was published by Wikimedia Deutschland which includes sub-strategies for Wikidata and the Wikibase Ecosystem. This strategy is much easier to digest than the strategy papers published in 2019 and I highly recommend the read. Part of the Wikidata strategy talks about “sharing workload” which reminds me of some thoughts I recently had comparing Wikipedia and Wikidata editing. Wikibase has a focus on Ecosystem enablement, which I am looking forward to working on.

The Wikibase stakeholder group continues to grow and organize. A Twitter account (@wbstakeholders) now exists, tweeting relevant updates. The group has over 14 organizational members and 15 individual members, its budget is now public, and it is working on getting some desired features implemented. If you are an organization or individual working in the Wikibase space, be sure to check them out! The group recently published a prioritized list of institutional requirements, and I’m happy to say that some parts of the “Automatic maintenance processes and updating cascades should work out of the box” area that scored 4 have already been tackled by the Wikidata / Wikibase teams.

Read more