
Month: September 2015

Downfall of Orain

So, from where I am sat right now, it looks like Orain is dead. It could just be me, but Orain has been struggling with issues for a while now, and the events that took place last week were basically the final nail in the coffin.

As final nails in coffins go, I don’t see this nail being removed any time soon.

Some brief history and facts

Orain was founded by Dusti and Kudu in July 2013, which means it is currently just over 2 years old. Orain has no paid staff but is instead kept online by a small team of volunteers. The number of volunteers trying to keep the sites up at any given time has varied, although in the past month or so that number dropped to 2. During those 2 years the Orain services have been hosted with a variety of different providers, including AWS, RamNode and most recently DigitalOcean.

Last week (16 Sept 2015)

Firstly I will say that I still do not know exactly what happened, or how, but it must have been one of the following things:

  • Someone did something stupid with a password. This could have been accidentally posting it somewhere, sharing it with someone or not keeping it in a secure location.
  • Someone with access to the accounts@orain.org email forwarder had their email account compromised.
  • Someone on the inside decided that it was time for Orain to die…

There are other options, but frankly the likelihood of those compared with the list above is slim.

Rough Timeline (UTC)

  • 01:44 the CloudFlare password was reset (we have an IP address relating to the reset of this password).
  • At some point the mail DNS records for orain.org were changed to point to an external server (not under Orain’s control).
  • At some point the password for DigitalOcean was reset, made easy by the fact that this person had control of the email accounts.
  • At some point 1 in 2 requests were redirected to a questionable device. You can find an image of the change that was made here.
  • 09:20 I woke up to see Orain in a mess and informed Dusti and others by email while trying to work out what on earth had happened.
  • 16:00 Confirmed that the ATT database was no longer on the server. A screenshot can be seen here.
  • Also confirmed that someone had root access to the servers using the DigitalOcean panel; a screenshot can be seen here. (It should be noted this shows the root user as idle for 9 hours at 16:00 UTC, meaning that, at least for prod5, the user was last active at roughly 07:00.)
  • At some point in the afternoon / evening all machines were powered down.

What I can say with 100% certainty

  • I have backups from 15th June 2015 @ 18:00 UTC for all wikis that existed at that time and I am more than happy to give these to people.
  • EDIT: Backups from August 2015 are available on archive.org
  • The ATT database was deleted, but I was not able to SSH to the primary database server, so those databases may not have been deleted.
  • As the user had root on all servers via the DigitalOcean control panel it should be assumed that ALL data was / could have been compromised. This includes usernames, email addresses, names and hashed & salted passwords. This also includes access logs meaning IP addresses, user agents and request data which could all be tied to users.
  • I do not have any backups of the uploads, although these had not been deleted before the machines were powered down.
  • Right now I have no idea if the machines were simply powered down or deleted (they are only VPSs after all).
  • At this time I believe Dusti is trying to gain access back to the DigitalOcean and Cloudflare accounts, until this happens it’s hard to really say or do anything more.

Possible conclusions to all of this

  1. Orain gets access back to DO, the servers are still there, it is powered up and the dbs & uploads are still there.
  2. Orain gets access back to DO, the servers are still there, it is powered up and a mixture of dbs and uploads are still there.
  3. Orain gets access back to DO, the servers are still there, it is powered up and all the dbs are gone & the uploads are gone.
  4. Orain gets access back to DO, the servers are gone…
  5. Orain does not get access back to DO…

EDIT (well, option 6 here happened.)

Finally

I am happy to answer any questions I can, although basically everything I can say is written above.

As I previously said, I would have expected the founders of Orain to inform the users of Orain of these events, but apparently they haven’t found the time to, or don’t want to, or a mixture of the two. I hope that they will soon.

Personally I want to try to help everyone that had a wiki with Orain. I have the backups and am of course willing to give them to the wiki owners so that they can move to new hosting, for example:

  • https://meta.miraheze.org/wiki/Miraheze
  • http://www.shoutwiki.com/wiki/Main_Page
  • etc….

Other Orain posts

I have a few other posts about Orain; you can find them below.

Un-deleting 500,000 Wikidata items

Since some time in January of this year I have been on a mission to un-delete all Wikidata items that were merged into other items before the redirect functionality of Wikidata existed. Finally I am done (well, nearly). This is the short story…

Reasoning

Earlier this year I pointed out the importance of redirects on Wikidata in a blog post. At the time I was amazed that the community came close to deciding not to create redirects for merged items… but thank the higher powers that the discussion swung in favour of redirects.

Redirects are needed to maintain the persistent identifiers that Wikidata provides. When two items relate to the same concept they are merged, and one of the identifiers must then be left pointing to the item that now holds the data for the concept.

Listing approach

Since Wikidata began there have been around 1,000,000 log entries deleting pages, which equates to roughly the same number of deleted items, although some deleted items may also have been restored. This was a great starting point. The basic query to get this result can be found below.
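
Something along these lines, run against the standard MediaWiki logging table (on wikidata.org items live in the main namespace, 0):

    -- Count deletions of items (main-namespace pages) in the deletion log.
    SELECT COUNT(*)
    FROM logging
    WHERE log_type = 'delete'
      AND log_action = 'delete'
      AND log_namespace = 0;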

I removed quite a few items from this initial list by looking at items that had already been restored and were already redirects. To do this I had to find all of the redirects!
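
Finding them is another straightforward query, this time against the page table (again, something along these lines; the page_is_redirect flag marks redirect pages):

    -- Find all existing item redirects.
    SELECT page_title
    FROM page
    WHERE page_namespace = 0
      AND page_is_redirect = 1;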

At this stage I could probably have tried to remove more items depending on whether they currently exist, but there was very little point. In fact it turned out that there was very little point in the above query either, as prior to my run very few items had been un-deleted in order to create redirects.

The next step was to determine which of the logged deletions were actually due to the item being merged into another item. This is fairly easy, as most merges used the merge gadget on wikidata.org, so if the deletion summary matched a suitable regular expression I assumed the item had been deleted because it was merged into / was a duplicate of another item.

And of course, in order to create a redirect, I would have to be able to identify a target, so I also matched the Q-id links in the summary.
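
Roughly speaking, the matching looked like this (the patterns here are illustrative approximations, not the exact expressions used):

    import re

    # Deletion summaries left by the merge gadget looked roughly like
    # "Deleted to merge with [[Q42]]"; these patterns are approximations.
    MERGE_SUMMARY = re.compile(r'merged?\s+(?:with|(?:in)?to)|duplicate', re.IGNORECASE)
    TARGET_QID = re.compile(r'\[\[(Q\d+)\]\]')

    def merge_target(summary):
        """Return the merge target's Q id, or None if this doesn't look like a merge."""
        if not MERGE_SUMMARY.search(summary):
            return None
        match = TARGET_QID.search(summary)
        return match.group(1) if match else None

    print(merge_target('Deleted to merge with [[Q42]]'))  # prints: Q42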

I then had a fairly nice list. It was still large, but it was time to actually start trying to create these redirects!

Editing approach

So firstly I should point out that such a task is only possible using an admin account, as you need to be able to see deleted revisions and un-delete items. Secondly, it is not possible to create a redirect over a deleted item, and it is also not possible to restore an item when that would create a conflict on the site, for example due to duplicate site links or duplicate label-and-description pairs across items.

I split the list up into 104 different sections, each containing exactly 10,000 item IDs. I could then fire up multiple processes trying to create these redirects, to make the task go as quickly as possible.
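
The splitting itself is trivial, something like the following (the file naming here is purely illustrative):

    # Write the item IDs out as one file of 10,000 IDs per worker process.
    def write_chunks(ids, size=10000):
        for n, start in enumerate(range(0, len(ids), size)):
            with open('chunk_%03d.txt' % n, 'w') as f:
                f.write('\n'.join(ids[start:start + size]))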

The process of touching a single ID was as follows (a code sketch follows the list):

  1. Make sure that the target of the merge exists. If it does not, log to a file; if it does, continue.
  2. Try to un-delete the item. If the un-deletion fails, log to a file; if it succeeds, continue.
  3. Try to clear the item (as you can only create redirects over empty items). This either results in an edit or no edit; it doesn’t really matter which.
  4. Try to create the redirect. This should never fail! If it does, log to a fail file that I can clean up afterwards.
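
In code, that loop looks something like the sketch below, using the MediaWiki web API (an already-authenticated admin session is assumed, and error handling is simplified):

    import requests

    API = 'https://www.wikidata.org/w/api.php'
    session = requests.Session()  # assumed: already logged in as an admin

    def csrf_token():
        r = session.get(API, params={'action': 'query', 'meta': 'tokens', 'format': 'json'})
        return r.json()['query']['tokens']['csrftoken']

    def create_redirect(item_id, target_id, fail_log):
        token = csrf_token()
        # 1. Make sure that the target of the merge exists.
        r = session.get(API, params={'action': 'wbgetentities', 'ids': target_id, 'format': 'json'})
        if 'missing' in r.json()['entities'].get(target_id, {}):
            fail_log.write('%s missing-target %s\n' % (item_id, target_id))
            return
        # 2. Try to un-delete the item (this is why admin rights are needed).
        r = session.post(API, data={'action': 'undelete', 'title': item_id,
                                    'token': token, 'format': 'json'})
        if 'error' in r.json():
            fail_log.write('%s undelete-failed\n' % item_id)
            return
        # 3. Clear the item; redirects can only be created over empty items.
        session.post(API, data={'action': 'wbeditentity', 'id': item_id, 'clear': 1,
                                'data': '{}', 'token': token, 'format': 'json'})
        # 4. Create the redirect; this should never fail.
        r = session.post(API, data={'action': 'wbcreateredirect', 'from': item_id,
                                    'to': target_id, 'token': token, 'format': 'json'})
        if 'error' in r.json():
            fail_log.write('%s redirect-failed\n' % item_id)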

The approach on the whole worked very well. As far as I know there were no incorrect un-deletions, and nothing failed in the middle.

The first of the 2 snags that I hit was that the rate at which I was editing caused the dispatch lag on Wikidata to increase. There was no real solution to this other than to keep an eye on the lag and stop editing whenever it rose above a certain level.

The second snag was that I caused multiple database deadlocks during the final day of running, although again this was not really a snag, as all of the transactions recovered. The deadlocks can be seen in the graph below:

The result

  • 500,000 more item IDs now point to the correct locations.
  • We have an accurate idea of how many items have actually been deleted due to not being notable / being test items.
  • The reasoning for redirects has been reinforced in the community.

Final note

One of the steps in the editing approach was to attempt to un-delete an item and, if un-deleting failed, to log the item ID to a file.

As a result I have now identified a list of roughly 6,000 items that should be redirects but cannot currently be un-deleted in order to create them.

See https://phabricator.wikimedia.org/T71166

It looks like there is still a bit of work to be done!

Again, sorry for the lack of images :/

Wikimedia Grafana graphs of Wikidata profiling information

I recently discovered the Wikimedia Grafana instance. After poking it for a little while, here are some slightly interesting graphs that I managed to extract.

