Editing Grokipedia, a first look

As a long-time editor and developer in the Wikipedia and Wikimedia space, I’m of course sceptical about what Grokipedia is trying to be, and whether it stands any chance of success. It may struggle to deliver the resilience, transparency, and community processes that keep projects like Wikipedia thriving, and in the early weeks the untouchable, AI-generated content was certainly not a sustainable approach going forward.

However, in the last week or so editing became an option, hidden behind Grok as a safeguard against abuse.

I thought I’d have a look at editing a few areas of content to see what the experience is like, and capture some of the good and bad points.

In no particular order…

Broken link formatting

A fix attempt

The Donald Trump article has some broken formatting, which looks like an incorrectly parsed or formatted Markdown link that is now showing as raw text in the HTML of the page. For posterity, I captured a copy of this version of the page on archive.ph, but here is a snapshot of how it appears.
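For readers who haven’t seen this kind of breakage before, here is a hypothetical illustration (not the actual text from the article) of how a malformed Markdown link ends up leaking into the rendered page:

```markdown
<!-- Well-formed: renders as a clickable link -->
[Donald Trump](https://example.com/donald-trump)

<!-- Malformed (missing closing parenthesis): many renderers fall back
     to passing the raw syntax straight through into the HTML -->
[Donald Trump](https://example.com/donald-trump
```

The link text and URL above are placeholders; the point is only the shape of the failure, where the bracketed source syntax appears verbatim to the reader instead of a link.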

Read more

Slop in, craft out?

Earlier today, I sent this absolutely perfectly crafted piece of slop into GitHub Copilot…

Right, but i want thje patche sot be / and /* always

And, as I already expected from using these LLM-based coding agents and assistants continually throughout their evolution, the resulting change was exactly what I wanted, despite the poor instructions.

Now, I’m sure there is actually some difference, and it likely depends on the relevance of the typoed areas, and how often such typos also appear in training data.

Why is this, you might ask?

Read more

AI Code assistant experience comparison (golang-kata-1)

If you’re reading this and thinking about trying an IDE-integrated coding agent, or thinking about switching, maybe stick around, have a read and watch some of the videos. There are at least 6 hours’ worth of experience wrapped up in this 20 minute read!

I’m watching a thread on the GitHub community forums, where people are discussing how GitHub Copilot has potentially gone slightly downhill. In some ways I agree, so I thought I’d spend a little more time looking at the alternatives, and how they behave.

This post compares 9 different setups, primarily looking at how each coding assistant presents itself within the VS Code IDE: how the default user interactions work, how tasks are broken down and presented to the user, and generally what the user experience is like between the different assistants.

I’ll try to flag up some other useful information along the way, such as time comparisons, amount of human interaction needed, and overall satisfaction with what the thing is doing, and if this all presents itself nicely in this post, I might find myself writing more in the future…

However, I will not be looking at cost, setup, resource usage or what’s happening with my data along the way…

Assistant and LLM combinations

| Assistant | Model | Main tasks | Tests | Second app |
| --- | --- | --- | --- | --- |
| Github Copilot | GPT 4o | ~ 5:00 | ~ 24:45 | ~ 32 |
| Github Copilot | GPT 4.1 | ~ 15:00 | ~ 17:40 | ~ 35 |
| Github Copilot | Claude Sonnet 4 | ~ 17:00 (inc tests) | ~ 17:00 | ~ 28 |
| Gemini Code Assistant | Gemini Something ? | ~ 11:20 | ~ 14:30 | ~ 25 |
| AmazonQ | Claude Sonnet 4 | ~ 7:20 | ~ 15:50 | ~ 28 |
| Roocode | GPT 4.1 (via Github Copilot) | ~ 5:30 | ~ 10:00 | ~ 18 |
| Roocode | Claude Sonnet 4 (via Anthropic) | ~ 15:30 | ~ 20:00 | ~ 37 |
| Claude Code | Claude Sonnet 4 | ~ 9:30 | ~ 17:40 | ~ 24 |
| Claude Code | Claude Opus 4 | ~ 10:00 | N/A | N/A |

I have set up this post, and the code problem, in such a way that I should be able to easily add more combinations and comparisons in the future, and directly compare their performance back to this post. Ideally, at some stage I’d try some other models via Ollama, and also some other pay-per-request LLM APIs…

Read more