Browser extension showing blocked sites in searches

April 13, 2015, by addshore

So last night I took another look at the ever-increasing list of domains that are blocked, due to various court orders, by various ISPs in the UK.

A first look

My first port of call was Wikipedia, which has an article titled ‘List of websites blocked in the United Kingdom’. This list, although referenced, doesn’t contain every domain that is blocked. Luckily the article does include various other links.

From the Wikipedia article I then found the wiki of 451unavailable.org.uk, which lists all current UK blocking orders. There is a wiki page for each blocking order, for example UK/temp plixid, which lists 17 sites. Each of these pages contains lots of links, the main set being links to check which ISPs are currently blocking the given domain.

This brings us to blocked.org.uk.

Blocked.org.uk

Blocked is a free tool that was built by Open Rights Group (ORG) volunteers and has been generously funded by various sponsors and ORG supporters. The site contains information including:

  • The domain name in question
  • The ISP that the domain is blocked on
  • The date last blocked by the ISP
  • The date last checked if blocked

Judging by the ISP results page, they also track blocking per filter level for each domain. A screenshot of these statistics can be seen below. Something doesn’t look quite right, though: the Strict filter appears to have fewer sites blocked than the Moderate filter, although the percentages look accurate.

[Screenshot: BT blocked domains per package]

After a bit of digging I found a nightly data dump, linked from their FAQ, at https://api.blocked.org.uk/data/export.csv.gz. (It looks like this dump is now dead, and after a bit of searching I couldn’t find another copy…)

Then I noticed the API domain! A bit of searching led me to a GitHub repo containing the code for the project.

Unfortunately the API is not exactly documented, but I am awaiting an email reply to see if this could be used.

The Extension Basics

Here is a quick look at a basic implementation of the extension.

The extension needs to do one of three things: hold a list of blocked domains itself (the most basic implementation), retrieve a list of blocked domains from elsewhere, or query a service with a list of domains to see which are blocked.
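
As a rough sketch, those three options could all sit behind one small interface so the rest of the extension doesn’t care which is in use. Everything named here (bundledSource, retrievedSource, querySource, filterBlocked, fetchList, queryBlocked) is made up for illustration; nothing below is part of the blocked.org.uk API, which as noted isn’t documented.

```javascript
// Helper: keep only the candidates that appear in the blocked list.
function pickBlocked(candidates, blockedDomains) {
    return candidates.filter(function (domain) {
        return blockedDomains.indexOf(domain) !== -1;
    });
}

// Option 1: the list ships with the extension.
function bundledSource(blockedDomains) {
    return {
        filterBlocked: function (candidates) {
            return Promise.resolve(pickBlocked(candidates, blockedDomains));
        }
    };
}

// Option 2: the list is retrieved once and then reused.
// fetchList is a hypothetical caller-supplied function returning a
// Promise that resolves to an array of domain strings.
function retrievedSource(fetchList) {
    var pending = null; // Promise for the list, created on first use
    return {
        filterBlocked: function (candidates) {
            if (pending === null) {
                pending = fetchList();
            }
            return pending.then(function (blockedDomains) {
                return pickBlocked(candidates, blockedDomains);
            });
        }
    };
}

// Option 3: a remote service is asked about the candidates directly.
// queryBlocked is a hypothetical caller-supplied function that takes an
// array of domains and returns a Promise of the blocked subset.
function querySource(queryBlocked) {
    return {
        filterBlocked: function (candidates) {
            return queryBlocked(candidates);
        }
    };
}
```

Whichever source is picked, the calling code only ever needs filterBlocked(listOfDomains), so swapping a bundled list for a real API later should be a small change.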

The code to mark blocked sites in Google search results is very simple:

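// blockedDomains is assumed to be a list of domain strings defined elsewhere in the extension.
window.addEventListener(
    "message",
    function(e) {
        // Only react to messages whose type is 'sr'.
        if (e.data && e.data.type === 'sr') {
            // Each search result is an element with the class "g".
            var results = document.getElementsByClassName("g");
            [].forEach.call(results, function(result) {
                var link = result.getElementsByTagName("a")[0];
                if (!link) {
                    return; // Skip results that contain no link.
                }
                // Let the browser parse the URL so the hostname can be read.
                var resultUrl = document.createElement('a');
                resultUrl.href = link.href;
                var hostname = resultUrl.hostname;
                [].forEach.call(blockedDomains, function(blockedDomain) {
                    // Match the domain itself or any subdomain of it,
                    // but not an unrelated host that merely ends with the same text.
                    if (hostname === blockedDomain ||
                        hostname.slice(-(blockedDomain.length + 1)) === '.' + blockedDomain) {
                        result.style.textDecoration = "line-through";
                    }
                });
            });
        }
    },
    false
);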

This simply strikes through the whole search result for any blocked site.

[Screenshot: a Google search result that is struck through]

Moving forward

It would be nice to be able to pass a list of domains to the blocked.org.uk API in order to see whether they are blocked. Alternatively, the extension could let the user select which ISP they are with and then use that ISP’s specific list of blocked domains as the block list. The ISP could probably also be detected from the user’s IP address.
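
To make the per-ISP idea a bit more concrete, here is a minimal sketch of building a block list for one ISP. It assumes the data arrives as records shaped like the fields blocked.org.uk shows (domain, ISP, date last blocked, date last checked); the property names, the blockListForIsp function and the sample data are all made up for illustration.

```javascript
// Build the list of domains that one ISP has been seen blocking.
// Each record is assumed to look like:
//   { domain: string, isp: string, lastBlocked: string|null, lastChecked: string }
function blockListForIsp(records, isp) {
    var domains = [];
    records.forEach(function (record) {
        // Skip other ISPs and entries that were checked but never seen blocked.
        if (record.isp !== isp || !record.lastBlocked) {
            return;
        }
        // Hostnames are case-insensitive, so normalise before de-duplicating.
        var domain = record.domain.toLowerCase();
        if (domains.indexOf(domain) === -1) {
            domains.push(domain);
        }
    });
    return domains;
}

// Made-up sample data using reserved example domains and placeholder ISP names.
var sampleRecords = [
    { domain: 'example.com', isp: 'isp-a', lastBlocked: '2015-04-12', lastChecked: '2015-04-13' },
    { domain: 'example.org', isp: 'isp-a', lastBlocked: null, lastChecked: '2015-04-13' },
    { domain: 'Example.com', isp: 'isp-a', lastBlocked: '2015-04-10', lastChecked: '2015-04-11' },
    { domain: 'example.net', isp: 'isp-b', lastBlocked: '2015-04-12', lastChecked: '2015-04-13' }
];

// The user's chosen ISP selects which list the extension would use.
var blockedDomains = blockListForIsp(sampleRecords, 'isp-a');
```

The resulting array has the same shape as the blockedDomains list the search result code expects, so the ISP choice would only change which list gets loaded.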

The extension should of course be able to work with multiple search engines, rather than just Google.
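
One way this might be handled is a small table mapping each search engine to the selector that picks out a single result. In this sketch only the Google class "g" comes from the code in this post; the Bing and DuckDuckGo selectors are assumptions that would need checking against the live pages, and the names searchEngines, selectorForHost and findResults are made up.

```javascript
// Each entry pairs a hostname pattern with a CSS selector for one result.
// Only a couple of Google domains are listed, as examples.
var searchEngines = [
    { name: 'Google', host: /(^|\.)google\.(com|co\.uk)$/, selector: '.g' },
    { name: 'Bing', host: /(^|\.)bing\.com$/, selector: 'li.b_algo' },        // assumed selector
    { name: 'DuckDuckGo', host: /(^|\.)duckduckgo\.com$/, selector: '.result' } // assumed selector
];

// Return the result selector for a hostname, or null for an unknown site.
function selectorForHost(hostname) {
    for (var i = 0; i < searchEngines.length; i++) {
        if (searchEngines[i].host.test(hostname)) {
            return searchEngines[i].selector;
        }
    }
    return null;
}

// Collect the result elements on a page. The document and hostname are
// passed in so the function does not depend on browser globals.
function findResults(doc, hostname) {
    var selector = selectorForHost(hostname);
    return selector === null ? [] : [].slice.call(doc.querySelectorAll(selector));
}
```

In the extension this would be called as findResults(document, window.location.hostname), with the strike-through logic then running over whatever comes back.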

It might be nice to provide information about why the search result has been blocked alongside the struck-through entry.
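
What the “why” can say depends on what data is available. Assuming only the fields blocked.org.uk shows (ISP and dates), a minimal sketch of such a note might look like this; describeBlock, annotateResult and the record property names are hypothetical.

```javascript
// Build a short explanation from a record assumed to look like:
//   { domain: string, isp: string, lastBlocked: string, lastChecked: string }
function describeBlock(record) {
    var text = 'Blocked by ' + record.isp;
    if (record.lastBlocked) {
        text += ', last seen blocked ' + record.lastBlocked;
    }
    if (record.lastChecked) {
        text += ' (checked ' + record.lastChecked + ')';
    }
    return text;
}

// Append the explanation to a search result element.
// doc is the page's document, passed in rather than read from a global.
function annotateResult(doc, result, record) {
    var note = doc.createElement('span');
    note.textContent = ' ' + describeBlock(record);
    // An inline-block does not pick up the parent's strike-through,
    // so the note stays readable next to the struck-through result.
    note.style.display = 'inline-block';
    result.appendChild(note);
    return note;
}
```

If the blocking order itself could be looked up, for example from the 451unavailable.org.uk wiki, that would make a better “why” than the ISP name alone.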

Let’s see if this goes anywhere! If it does, the code will be on GitHub!