Asking Bing Chat AI to reference Wikidata

I previously compared ChatGPT and the Bing chat AI on the question of “What is Wikibase Cloud”.

This comparison and further discussion highlighted problems with using ChatGPT alone. It seems to like inventing URLs that look right but have never existed, as it primarily wants to have a good conversation. It also seems to do the same for things such as Wikidata identifiers when it references them.

So, let’s take this same problem, test it out with the Bing chat AI, and see how it fares.

Firstly, how do we make the Bing chat AI actually reference Wikidata?

If we simply ask questions like “What is the capital of Germany?” or “Who is the prime minister of the UK?” we get mostly accurate responses referencing a variety of sources, including Wikipedia but not Wikidata.

Note that I say mostly accurate here, as the answer about Rishi Sunak is wrong: he became prime minister after Liz Truss resigned, not Boris Johnson!

Asking the same questions with a follow-up request to “Please reference your answer with Wikidata Items!” seems to get us part of the way.
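Since invented identifiers are the worry here, one way to check an answer is to pull out anything shaped like a Wikidata Item ID and build the URL where that Item's data would live. This is only a minimal sketch: the function names are hypothetical, and it checks the shape of an ID only. Confirming that an Item exists and actually matches the claim would still need a real request to the resulting URL.

```python
import re

# Wikidata Item IDs are "Q" followed by a positive integer, e.g. Q183 (Germany).
ITEM_ID = re.compile(r"\bQ[1-9][0-9]*\b")


def extract_item_ids(answer: str) -> list[str]:
    """Return the distinct Item-like IDs found in a chat answer, in order."""
    seen: dict[str, None] = {}
    for match in ITEM_ID.findall(answer):
        seen.setdefault(match, None)
    return list(seen)


def entity_data_url(item_id: str) -> str:
    """Build the Special:EntityData URL that serves an Item as JSON."""
    if not ITEM_ID.fullmatch(item_id):
        raise ValueError(f"not a Wikidata Item ID: {item_id!r}")
    return f"https://www.wikidata.org/wiki/Special:EntityData/{item_id}.json"


# Hypothetical answer text, written for this example rather than copied from Bing.
answer = "The capital of Germany (Q183) is Berlin (Q64)."
item_urls = [entity_data_url(item_id) for item_id in extract_item_ids(answer)]
```

Fetching each URL in `item_urls` and comparing the returned label with the claim is the step that would catch an identifier that looks right but points at nothing, or at something else entirely.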

What is Wikibase Cloud (according to Bing AI)

A few days ago I wrote a post called What is Wikibase Cloud (according to ChatGPT). Now I also have access to the new Bing AI chatbot preview, so time to put it to the same test.

This is one of the first times I’m going to have purposefully used Bing since its launch!

As before, messages to Bing AI will be in bold, things that are wrong in red, and things that are right in green.

I’ll try to steer the conversation through the same questions that I used with ChatGPT.

Firstly, “What is Bing AI?”

Bing AI is a term that refers to various features and capabilities of Bing that are powered by artificial intelligence. For example, Bing AI can help you find answers, chat with you, generate content, and personalize your web experience²³.

Source: Conversation with Bing, 06/03/2023
(1) Reinventing search with a new AI-powered Microsoft Bing and Edge, your …. Accessed 06/03/2023.
(2) Factbox: How to get Microsoft’s new AI-enhanced Bing. Accessed 06/03/2023.
(3) Bing. Accessed 06/03/2023.
(4) Microsoft ignores Bing fails by planning to let AI loose on Windows 11 desktops. Accessed 06/03/2023.
(5) Bing AI chatbot now lets you fine-tune its personality. Accessed 06/03/2023.
(6) AI-powered Bing Chat gains three distinct personalities. Accessed 06/03/2023.

I’m really liking the fact that the responses come with references, whereas with ChatGPT you need to ask for the resources that were used to generate the answers. Also, the “Accessed” date for all of these resources is much more recent. Perhaps this means Bing will be better at conversations about recent events?

This is all also nicely displayed in the UI.

So, “What is Wikibase Cloud?”

Wikibase Cloud is a cloud-based platform that hosts instances of Wikibase, an open-source software suite for creating collaborative knowledge bases¹². It’s currently an invite-only beta service provided by Wikimedia Deutschland¹⁴.

Are you interested in applying for early access to Wikibase Cloud?

Source: Conversation with Bing, 06/03/2023
(1) Wikibase/Wikibase.cloud – MediaWiki. Accessed 06/03/2023.
(2) Wikibase/Wikibase.cloud – Meta – Wikimedia. Accessed 06/03/2023.
(3) wikibase.cloud. Accessed 06/03/2023.
(4) What is Wikibase Cloud (according to ChatGPT) – addshore. Accessed 06/03/2023.

So, now we are apparently reaching the point of self-referential AI, as Bing is referencing the blog post that I wrote a few days ago pointing out where ChatGPT was incorrect about various points regarding Wikibase Cloud.

There is also an interesting prompt by Bing to continue the conversation asking if I want to apply for early access, so let’s give it a go with a simple “Yes”.

What is Wikibase Cloud (according to ChatGPT)

In this post, I’ll be asking ChatGPT a bunch of questions about Wikibase Cloud, mainly to see how accurate it manages to be, and how helpful the information is.

Messages to ChatGPT will be in bold, things that are wrong in red, and things that are right in green.

Firstly, “What is ChatGPT?”

But I needed to follow up, asking “Can you answer again but in the third person?”

ChatGPT is a large language model developed by OpenAI that has been trained on a vast corpus of text data. The acronym GPT stands for “Generative Pretrained Transformer,” which refers to the model’s ability to generate natural language responses to text-based prompts. ChatGPT uses a transformer neural network architecture that enables it to process and comprehend long sequences of text. Its training data is sourced from a diverse range of materials, including books, articles, and websites, and covers a broad spectrum of topics. ChatGPT has been designed to assist users with natural language processing tasks such as answering questions, summarizing text, and generating human-like responses in conversation.

ChatGPT on 4 March 2023

But when asking about Wikibase Cloud it gets a little confused and answers incorrectly…

What is Wikibase Cloud?

Wikibase Cloud is a hosting service provided by Wikimedia Cloud Services that allows users to create and manage their own Wikibase instances in the cloud. …

ChatGPT on 4 March 2023
The full ChatGPT response for “What is Wikibase cloud”

Wikibase Cloud is a hosting service provided by Wikimedia Deutschland, so the reference to Wikimedia Cloud Services here is incorrect.

So naturally, I need to ask “Are you sure it is provided by Wikimedia Cloud Services?”

2022 Year Review

This entry is part 6 of 7 in the series Year Reviews

I’ve been doing year reviews since 2017 under the #year-review tag, and 2022 is no different. Except that I have been living aboard a sailboat traveling the world for the latter half of the year. So this year is probably going to look a little different in retrospect, including far less time coding and writing about technology, but far more nautical miles traveled.

(the GPS track below is mostly accurate, but also has some odd artifact in it…)

Blogging (and Boating)

The trip has resulted in some alternative blogging about sailboats, much of which has been on an entirely separate blog, https://sailinghannahpenn.co.uk.

In fact, here is a picture of Hannah Penn from last week after hauling out of the water in Guadeloupe for some extra painting.

There is always lots to be doing while sailing, and second to sailing comes enjoyment. I have also been working 10h per week for Wikimedia Deutschland, and altogether this leaves sparse gaps for other things on the side like blogs.

Digital Yacht WL510 & 4G Connect review

It’s been around 6 months since I set off on a digital nomad-style experience on a sailboat with the Digital Yacht 4G Connect and WL510, and I’m ready to give them some kind of initial review.

If you want to read some high-level details of this adventure, and other technical details of the boat, batteries, antennas and work, read the digital nomad boat experience post first.

Overall, they are 2 nice bits of kit, well-engineered and thought-through, but probably a bit expensive given the amount of use that I have managed to get out of them (less than I would have liked for a variety of reasons).

TL;DR: If I were to start this trip again, I probably wouldn’t buy them again. However, for a different trip or situation they might make more sense (cruising around the UK, for example).

Installation

A collection of antennas on the mizzen (back) mast

Both of the devices were easily wall mountable, and they come with all required cables and connectors, but I do wish that they both had switches included to turn them on and off, just in case you want to.

Due to the length of the antenna cables and the desire to put the antennas up our mizzen mast, the positioning of the devices was not ideal, but at least we got the antennas up with a bit of height.

The WiFi antenna cable is also pretty thick, which made using a rubber deck cable grommet harder than we would have liked, as the cables come with their ends already attached and these are even thicker than the cables. (Of course, we could have made our own cables…)

Usage of the 4G Connect

The UI within the 4G Connect is fine, fully featured and pretty easy to use. There isn’t much you need to do here if you are using the 4G Connect alone, but you’ll probably end up regularly changing things when using it in conjunction with the WL510.

It’s nice having a large and high 4G antenna (part of the 4G Connect) to increase the chances of getting a 4G connection from your SIM cards.

This has been most useful for me while heading away from land or toward land at 5-20 nautical miles. I always get a connection on the 4G Connect before getting one on my phone and other devices. But in the grand scheme of things, this is not when most of my internet usage happens. It was a novelty being able to make a video call while in the middle of the sea, with the land only in distant sight, but not amazingly useful.

Sailing: Months 2 & 3

It’s been a while since my first personal sailing post. Life is as busy as ever, just now on a boat: lots of sailing, exploring, relaxing, resting, working and boat jobs.

We have written lots of adventure content at sailinghannahpenn.co.uk, but now for my own summarised take on the past few months and general thinking of this boat life.

Exploration time

Looking back over the past 3 months I can now say for certain that you should spend more than 3 months adventuring down the western coast of Europe. In fact, just schedule 4x the amount of time you initially think for a sailing adventure. There are many gaps in our explorations and places I will need to try and see again.

Wikimedia Enterprise: A first look

Wikimedia Enterprise is a new (now 1-year-old) service offered by the Wikimedia Foundation, via Wikimedia, LLC.

This is a wholly-owned LLC that provides opt-in services for third-party content reuse, delivered via API services.

In essence, this means that Wikimedia Enterprise is an optional product that third parties can choose to use. It repackages data from within Wikimedia projects in a more useful, more reliable, and more stable format, presented primarily via data downloads and APIs, with profits going to the Wikimedia Foundation.

Want to find out more? Read the FAQ.

The project and APIs are well documented, and access can be requested for free, but I wanted to spend a little bit of time hands-on with the APIs to get a full understanding of what is offered, the formats, and how it differs from things I know are exposed elsewhere in Wikimedia projects.

Account Creation

Wikimedia Enterprise accounts are separate from any other Wikimedia-related accounts, so you’ll need a new one.

In order to get an account you need to fill out a pretty straightforward form (username, password, email, and accept terms). You then need to verify your email address. Tada, you are in!

Jetbrains Fleet & WSL: First impressions

It’s no secret that I develop using Windows and WSL. For the past few years, I have also primarily used VSCode as my go-to development environment.

Between 2012 and 2018 I mainly used Jetbrains IntelliJ IDEA, but the speed of VSCode (launched in 2015), along with its modern design and vibrant plugin ecosystem, won me over.

Every now and again I have found myself dipping back into the suite of Jetbrains IDEs, primarily for their high-quality code refactoring tools. Nothing that I have seen in the VSCode ecosystem has quite lived up to this functionality.

Jetbrains Fleet was announced in 2021, and was behind a waitlist until this past week. It’s now in public preview!

This is exciting, as it’s advertised as “lightweight”, with code processing engines running separately, similar to what is done in VSCode. It also draws on their “20 years of experience developing IDEs”, which I hope will maintain the high-quality refactoring tools. Not to mention the built-in “distributed” working modes for remote development, and thus built-in WSL project integration.

So here is a very first look at using Fleet with a project in WSL2 land.

Global “unlimited” data options: Wraptel, Keepgo & Solis

Ahead of this year of sailing, I wanted to ensure I had as many high-quality connectivity options as possible. As already mentioned, I bought a Digital Yacht WL510 and 4G Connect, the latter of which needs SIM cards in order to connect.

SIM card slot inside the 4G Connect

The world seems to be halfway between using physical SIM cards and eSIMs, and the digital yachting world hasn’t quite caught up with this trend just yet. So I wanted to find the best, hopefully mostly unlimited, mostly worldwide, SIM card options to put in the 4G Connect.

All options that I managed to find had some sort of capping system before data would slow down. Some even had an actual data cap despite advertising themselves as “unlimited”.

Below you can find a comparison of the options I tried.

A first Wikidata query service JNL file for public use

This entry is part 2 of 3 in the series Your own Wikidata Query Service

Back in 2019 I wrote a blog post called Your own Wikidata Query Service, with no limits, which documented loading a Wikidata TTL dump into your own Blazegraph instance running within Google Cloud, a nearly 2-week process.

I ended that post speculating that part 2 might be using a “pre-generated Blazegraph journal file to deploy a fully loaded Wikidata query service in a matter of minutes”. This post should take us a step closer to that eventuality.

Wikidata Production

There are many production Wikidata query service instances, all up to date with Wikidata and all powered by open source code that anyone can use, built on Blazegraph.

Per wikitech documentation there are currently at least 17 Wikidata query service backends:

  • public cluster, eqiad: wdqs1004, wdqs1005, wdqs1006, wdqs1007, wdqs1012, wdqs1013
  • public cluster, codfw: wdqs2001, wdqs2002, wdqs2003, wdqs2004, wdqs2007
  • internal cluster, eqiad: wdqs1003, wdqs1008, wdqs1011
  • internal cluster, codfw: wdqs2005, wdqs2006, wdqs2008
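As a quick sanity check on the “at least 17” figure, the hostnames in the list can be tallied per cluster. This is just arithmetic over the names as written above; the dictionary layout is my own choice for the example.

```python
# Hostnames copied from the list above, grouped by cluster and data centre.
backends = {
    ("public", "eqiad"): ["wdqs1004", "wdqs1005", "wdqs1006", "wdqs1007", "wdqs1012", "wdqs1013"],
    ("public", "codfw"): ["wdqs2001", "wdqs2002", "wdqs2003", "wdqs2004", "wdqs2007"],
    ("internal", "eqiad"): ["wdqs1003", "wdqs1008", "wdqs1011"],
    ("internal", "codfw"): ["wdqs2005", "wdqs2006", "wdqs2008"],
}

# Count every host, then only those serving the public clusters.
total = sum(len(hosts) for hosts in backends.values())
public = sum(len(hosts) for (cluster, _), hosts in backends.items() if cluster == "public")

print(f"{total} backends in total, {public} of them public")
# prints "17 backends in total, 11 of them public"
```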

These servers all have hardware specs that look something like dual Intel(R) Xeon(R) E5-2620 v3 CPUs, 1.6TB of raw RAIDed SSD space, and 128GB RAM.

When you run a query it may end up in any one of the backends powering the public clusters.
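From the outside you never pick a backend yourself: you send SPARQL to the public endpoint and it lands wherever it lands. Below is a minimal sketch of such a request using only the Python standard library. The helper names `build_request` and `run_query` and the User-Agent string are placeholders invented for this example, and nothing touches the network until `run_query` is called.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata query service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(sparql: str) -> urllib.request.Request:
    """Prepare a GET request asking the endpoint for JSON results."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": sparql})
    headers = {
        "Accept": "application/sparql-results+json",
        # Placeholder User-Agent: replace with something that identifies you.
        "User-Agent": "wdqs-example/0.1 (hypothetical contact)",
    }
    return urllib.request.Request(url, headers=headers)


def run_query(sparql: str) -> dict:
    """Send the query over the network and return the decoded JSON results."""
    with urllib.request.urlopen(build_request(sparql), timeout=60) as response:
        return json.load(response)


# wd:Q183 is Germany and wdt:P36 is the "capital" property.
CAPITAL_OF_GERMANY = "SELECT ?capital WHERE { wd:Q183 wdt:P36 ?capital }"
req = build_request(CAPITAL_OF_GERMANY)
```

Calling `run_query(CAPITAL_OF_GERMANY)` would send the request. Which of the public backends ends up answering it is not something the client chooses.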

All of these servers also have an up-to-date JNL file full of Wikidata data that anyone wanting to set up their own Blazegraph instance with Wikidata data could use. This is currently 1.1TB.
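To get a feel for what moving a file of that size involves, here is a rough back-of-the-envelope calculation. The link speeds are illustrative assumptions rather than measurements, the 1.1TB is taken as decimal terabytes, and it ignores compression and protocol overhead.

```python
JNL_SIZE_BYTES = 1.1e12  # 1.1TB from the post, taken as decimal terabytes


def transfer_hours(size_bytes: float, megabits_per_second: float) -> float:
    """Hours needed to move size_bytes over a link sustained at the given rate."""
    bytes_per_second = megabits_per_second * 1_000_000 / 8
    return size_bytes / bytes_per_second / 3600


# Link speeds are illustrative assumptions, not measurements.
for mbps in (100, 1_000, 10_000):
    print(f"{mbps:>6} Mbit/s: {transfer_hours(JNL_SIZE_BYTES, mbps):.1f} hours")
```

At a sustained 100 Mbit/s that works out to roughly a day of copying, which is still a long way short of the nearly 2-week load from a TTL dump.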

So let’s try and get that out of the cluster for folks to use, rather than having people rebuild their own JNL files.
