Addshore

It's a blog

Tag: Wikidata (page 1 of 4)

Wikidata is 6

It was Wikidata’s 6th birthday on the 30th of October 2018. WMUK celebrated this with a meetup on the 7th of November. They also made this great post-event video.

Video from WMUK hosted Wikidata birthday event

Wikidata Map October 2018

It has been another 6 months since my last post in the Wikidata Map series. In that time Wikidata has gained 4 million items, 1 property with the globe-coordinate data type (coordinates of geographic centre) and 1 million items with coordinates [1]. Each Wikidata item with a coordinate is represented on the map with a single dim pixel. Below you can see the areas of change between this new map and the one generated in March. To see the equivalent change in the previous 4 months, take a look at the previous post.

Comparison of March 26th and October 1st maps in 2018

Wikibase extensions on Wikidata.org

Wikidata.org runs on MediaWiki with the Wikibase extension, but there is more to it than just that. The Wikibase extension itself is split into 3 sections: Lib, Repo and Client. There are also 6 other extensions, all providing extra functionality to the site and its sisters. The extensions are loaded in different combinations on the Clients (such as Wikipedia) and on the Repo itself (wikidata.org).

A diagram of current dependencies between the various Wikibase extensions running on wikidata.org

Grafana, Graphite and maxDataPoints confusion for totals

The title is a little wordy, but I hope you get the gist. I just spent 10 minutes staring at some data on a Grafana dashboard, comparing it with some other data, and finding the numbers didn’t add up. Here is the story in case it catches you out.

The dashboard

The dashboard in question is the Wikidata Edits dashboard hosted on the Wikimedia Grafana instance that is public for all to see. The top of the dashboard features a panel that shows the total number of edits on Wikidata in the past 7 days. The rest of the dashboard breaks these edits down further, including another general edits panel on the left of the second row. 


Using Hue & Hive to quickly determine Wikidata API maxlag usage

Hue, or Hadoop User Experience, is described by its documentation pages as “a Web application that enables you to easily interact with an Hadoop cluster”.

The Wikimedia Foundation has a Hue frontend for their Hadoop cluster, which contains various datasets including web requests, API usage and the MediaWiki edit history for all hosted sites. The install can be accessed at https://hue.wikimedia.org/ using Wikimedia LDAP for authentication.

Once logged in, Hue can be used to write Hive queries with syntax highlighting, auto suggestions and formatting. It also allows users to save queries with names and descriptions, run queries from the browser and watch Hadoop job execution state.

The Wikidata & maxlag bit

MediaWiki has a maxlag API parameter that can be passed alongside API requests so that a request returns an error, and writes do not happen, when the DB servers are lagging behind the master. Within MediaWiki this lag value can also be raised when the JobQueue is very full. Recently Wikibase introduced the ability to raise it when the dispatching of changes to client projects is lagging behind as well. To see how effective this will be, we can take a look at previous API calls.
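As a rough illustration of how a client might use the parameter, here is a minimal Python sketch. The helper names (build_api_url, is_maxlag_error, retry_delay) and the default of 5 seconds are my own choices for the example rather than anything from MediaWiki itself. The sketch only builds the request URL and inspects an already decoded response, so it makes no network calls.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; any MediaWiki api.php URL is used the same way.
API_URL = "https://www.wikidata.org/w/api.php"


def build_api_url(params, maxlag=5):
    """Return an API URL with the maxlag parameter added to the request."""
    query = dict(params, format="json", maxlag=maxlag)
    return API_URL + "?" + urlencode(query)


def is_maxlag_error(response):
    """True if a decoded JSON API response is a maxlag rejection."""
    return response.get("error", {}).get("code") == "maxlag"


def retry_delay(headers, default=5):
    """Seconds to wait before retrying, read from Retry-After when present."""
    try:
        return int(headers.get("Retry-After", default))
    except (TypeError, ValueError):
        # Header missing or not a plain number of seconds: fall back to default.
        return default
```

A polite client would send its requests with a URL from build_api_url, and when is_maxlag_error is true, sleep for retry_delay seconds before trying again.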


The Wikimedia Server Admin Logs

The Wikimedia Server Admin Log, or SAL for short, is a timestamped log of actions performed on the Wikimedia cluster by users such as roots and deployers. The log is stored on the WikiTech Wikimedia project and can be found at the following URL: https://wikitech.wikimedia.org/wiki/Server_Admin_Log

An example entry in the log could be:

As well as the main cluster SAL, there are also logs for release engineering (Jenkins, Zuul, and other CI things) and individual logs for each project that uses Wikimedia Cloud VPS.

A tool has been created for easy SAL navigation which can be found at https://tools.wmflabs.org/sal

Each SAL can be selected at the top of the tool, with ‘Other’ providing you with a list of all Cloud VPS SALs.

The search and date filters can then be used to find entries throughout history.


Wikibase of Wikibases

The Wikibase registry was one of the outcomes of the first in a series of Federated Wikibase workshops organised in partnership with the European Research Council.

The aim of the registry is to act as a central point for details of public Wikibase installs hosted around the web. Data held about the installs currently includes the URL for the home page, Query frontend URL and SPARQL API endpoint URL (if a query service exists).

During the workshop an initial data set was added, and this can be easily seen using the timeline view of the query service and a query that is explained within this post.


Wikidata Map March 2018

It’s time for the first 2018 instalment of the Wikidata Map. It has been roughly 4 months since the last post, which compared July 2017 to November 2017. Here we will compare November 2017 to March 2018. For anyone new to this series of posts, you can check back at the progression of these maps by looking at the posts on the series page.

Each Wikidata Item with a Coordinate Location (P625) will have a single pixel dot. The more Items present, the more pixel dots and the more the map will glow in that area. The pixel dots are plotted on a totally black canvas, so any land mass outline simply comes from the mass of dots. You can find the raw data for these maps and all historical maps on Wikimedia Tool Labs.
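To make the plotting idea concrete, here is a small Python sketch of it. It is not the code that generates the real maps: the equirectangular projection, the canvas size and the brightness step of 32 are all assumptions picked for the example, and to_pixel and plot are hypothetical names.

```python
def to_pixel(lat, lon, width, height):
    """Map latitude/longitude in degrees to (x, y) on an equirectangular canvas."""
    x = int((lon + 180.0) / 360.0 * (width - 1))
    y = int((90.0 - lat) / 180.0 * (height - 1))
    return x, y


def plot(coords, width=360, height=180, step=32):
    """Return a height x width grid of brightness values (0 is black, 255 is max)."""
    # Start from a totally black canvas.
    canvas = [[0] * width for _ in range(height)]
    for lat, lon in coords:
        x, y = to_pixel(lat, lon, width, height)
        # Each item adds one dim dot; overlapping dots glow brighter, capped at 255.
        canvas[y][x] = min(255, canvas[y][x] + step)
    return canvas
```

Calling plot with a list of (latitude, longitude) pairs gives a grid where dense areas are brighter, which is why coastlines appear without ever being drawn.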

Looking at the two maps below (the more recent map being on the right) it is hard to see the differences by eye, which is why I’ll use ImageMagick to generate a comparison image. Previous comparisons have used Resemble.js.
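The idea behind such a comparison can be sketched in a few lines of Python. This is a plain per-pixel check written for illustration, not what ImageMagick or Resemble.js actually do internally. It assumes both maps are same-sized grids of brightness values where 0 means an unlit pixel, and diff_maps is a hypothetical name.

```python
def diff_maps(old, new):
    """Label each pixel 'added', 'removed' or 'same' by comparing lit state."""
    result = []
    for old_row, new_row in zip(old, new):
        row = []
        for before, after in zip(old_row, new_row):
            if after > 0 and before == 0:
                row.append("added")      # lit only in the newer map
            elif before > 0 and after == 0:
                row.append("removed")    # lit only in the older map
            else:
                row.append("same")       # lit in both or dark in both
        result.append(row)
    return result
```

A comparison image would then colour the "added" pixels one way and leave the rest muted, so new areas stand out.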


Wikidata Map November 2017

It has only been 4 months since my last Wikidata map update post, but the difference on the map in these 4 months is much greater than the diff shown in my last post covering 9 months. The whole map is covered with pink (additions to the map). The main areas include Norway, Germany, Malaysia, South Korea, Vietnam and New Zealand to name just a few.


Wikibase docker images

This is a belated post about the Wikibase Docker images that I recently created for the Wikidata 5th birthday. You can find the various images on Docker Hub and matching Dockerfiles on GitHub. Combined, these images allow you to quickly create Docker containers for Wikibase backed by MySQL, with a SPARQL query service running alongside and updating live from the Wikibase install.

A setup was demoed at the first WikidataCon event in Berlin on the 29th of October 2017. It appears at roughly 41:10 in the demo of presents video, which can be seen below.



© 2018 Addshore
