Making the most of OpenDataCommunities - part 2

Following yesterday's blog post, here's another quick example of retrieving our data directly into a Google Docs spread sheet.

I've developed a hopefully handy list of all Local Authorities, Unitary Councils, Counties, and London Boroughs.

The list also includes Fire, Police, Waste and Transport Authorities, plus National Parks.

It includes the URIs that we've established to define councils as organisations - i.e. as distinct from the geographic areas they serve, which are defined by Ordnance Survey and the Office for National Statistics.

The resulting Google sheet available for download here.

SPARQL enthusiasts can view and edit the query I'm using to populate the spreadsheet here.


Making the most of our data - combining and re-using multiple sources

This is the first of a series of articles outlining how you can get more out of OpenDataCommunities’ data sources.

In this article, I’ll focus on combining related datasets to create your own custom outputs. I’ll use the Help-to-Buy data to show you how to bring together DCLG’s statistics on completed Equity Loan sales, with HM Treasury’s on sales completed under the Mortgage Guarantee Scheme. Both sets are available at Local Authority and Postcode District levels.

The output will be this Google Spread Sheet, which you are free to download and re-use as you wish.

With help from Ric Roberts at Swirrl (thanks Ric!), I’ve developed a couple of queries (see below) to join together and retrieve data from the Equity Loan and Mortgage Guarantee datasets.

These queries use the SPARQL language. I won’t use this article to explain what SPARQL is and how it works: this will be covered in separate blogs from the OpenDataCommunities team. However, it is worth mentioning that although SPARQL is not for the faint hearted, it is similar to querying data using the more familiar SQL language and syntax. I’d encourage you to follow the links and experiment with these queries, and I would welcome feedback on how you get on.

If you're not comfortable working with SPARQL, why not try building your own datasets using our GeoSelector and Spreadsheet builder productivity tools

Importing OpenDataCommunities’ data into Google spreadsheets - how it works

Once I had developed my SPARQL queries, the next step is to embed them in a Google Docs spreadsheet. To do this, I used Google Spreadsheet’s IMPORTDATA() function, described here.

To get it working, we need the IMPORTDATA function to reference the URL to the SPARQL query which generates the CSV - i.e.

IMPORTDATA(“<URL to Sparql query results in CSV format>”)

You can see an example of this in Cell A6 of the "Postcode Districts" tab in the Google sheet.

To get the relevant URL, follow these steps:
1. When editing a SPARQL query - like this one, switch on "Expert mode"
2. Scroll down the page, to the "Alternative formats" section.
3. Copy the URL labeled "CSV".
4. Paste this URL into your IMPORTDATA() formula, remembering to surround it with double-quotes.

SPARQL queries - you can copy and paste these to http://opendatacommunities.org/sparql

Combining Equity Loan and Mortgage Guarantee data for Postcode Districts

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?pc_dis_lbl ?equity_loans ?mortgage_guarantees  
  # comment these two out if you dont want the obs URIs returned
  ?equity_loan_obs ?mtg_obs 

WHERE {

  # help to buy equity loans by district 
  # http://opendatacommunities.org/graph/housing-market/help-to-buy/num-loans/loan-type/equity-loans-postcode-dis
   OPTIONAL {
      GRAPH <http://opendatacommunities.org/graph/housing-market/help-to-buy/num-loans/loan-type/equity-loans-postcode-dis> {
      ?equity_loan_obs <http://opendatacommunities.org/def/ontology/housing-market/help-to-buy/numLoansObs> ?equity_loans .
      ?equity_loan_obs <http://opendatacommunities.org/def/ontology/geography/refArea> ?pc_dis .
      ?equity_loan_obs        <http://opendatacommunities.org/def/ontology/time/refPeriod> ?equity_loan_period .
    }
  }

# mortgage guarantees by district
# http://opendatacommunities.org/graph/housing-market/help-to-buy/num-loans/mortagegaurantees/completed-loans-postcode-dis    
  OPTIONAL {
    GRAPH <http://opendatacommunities.org/graph/housing-market/help-to-buy/num-loans/mortagegaurantees/completed-loans-postcode-dis> {
      ?mtg_obs <http://opendatacommunities.org/def/ontology/housing-market/help-to-buy/numLoansObs> ?mortgage_guarantees .
      ?mtg_obs <http://opendatacommunities.org/def/ontology/geography/refArea> ?pc_dis .
      ?mtg_obs <http://opendatacommunities.org/def/ontology/time/refPeriod> ?mtg_period .
   }
 }

# adds labels for regions, where we've got them.
  OPTIONAL {?pc_dis rdfs:label ?pc_dis_lbl .}

 #change this to get the periods you want. (note that the postcode district data are available for different periods to districts)
  VALUES ?equity_loan_period { <http://reference.data.gov.uk/id/gregorian-interval/2013-04-01T00:00:00/P16M> }
 VALUES ?mtg_period { <http://reference.data.gov.uk/id/gregorian-interval/2013-10-08T00:00:00/P8M23D> }      
}

ORDER BY ?pc_dis_lbl  

Combining Equity Loan and Mortgage Guarantee data for Local Authorities

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?district_lbl ?equity_loans ?mortgage_guarantees 
# comment these two out if you dont want the obs URIs returned
 ?equity_loan_obs ?mtg_obs 

  WHERE {

    # help to buy equity loans by district 
    # http://opendatacommunities.org/data/housing-market/help-to-buy/num-loans/loan-type/equity-loans-ons
     GRAPH <http://opendatacommunities.org/graph/housing-market/help-to-buy/num-loans/loan-type/equity-loans-ons> {
      ?equity_loan_obs <http://opendatacommunities.org/def/ontology/housing-market/help-to-buy/numLoansObs> ?equity_loans .
      ?equity_loan_obs <http://opendatacommunities.org/def/ontology/geography/refArea> ?district .
     ?equity_loan_obs <http://opendatacommunities.org/def/ontology/time/refPeriod> ?equity_loan_period .
    }

     # mortgage guarantees by district
     # http://opendatacommunities.org/data/housing-market/help-to-buy/num-loans/mortagegaurantees/completed-loans-ons      
     GRAPH <http://opendatacommunities.org/graph/housing-market/help-to-buy/num-loans/mortagegaurantees/completed-loans-ons> {
      ?mtg_obs <http://opendatacommunities.org/def/ontology/housing-market/help-to-buy/numLoansObs> ?mortgage_guarantees .
?mtg_obs <http://opendatacommunities.org/def/ontology/geography/refArea> ?district .
      ?mtg_obs <http://opendatacommunities.org/def/ontology/time/refPeriod> ?mtg_period .
  }

 # adds labels for Districts where available.
  OPTIONAL {?district rdfs:label ?district_lbl }

 #change this to get the periods you want.
  VALUES ?equity_loan_period {             <http://reference.data.gov.uk/id/gregorian-interval/2013-04-01T00:00:00/P15M> }
  VALUES ?mtg_period { <http://reference.data.gov.uk/id/gregorian-interval/2013-10-08T00:00:00/P8M23D> }      
 }
 ORDER BY ?district_lbl

Improving data content: Help-to-Buy statistics

We’re conscious that OpenDataCommunities has implications not just for you as users of our data, but also for us as publishers of official statistics. So in a pioneering move, from 1st August 2014, we’ve been publishing DCLG’s official statistics on the cumulative number of Help-to-Buy sales exclusively on OpenDataCommunities.

The data are available for Local Authorities, Parliamentary Constituencies, and Postcode Sectors. We have also developed a demonstration application for you to explore the data, and inspire others to build innovative new tools and insights.

Help-to-Buy, Parliamentary Constituency Map and chart Our Help-to-Buy demonstration application

We’re adopting this approach because we believe OpenDataCommunities offers you much more flexibility in accessing our data that a typical Excel table does. Non-technical users can browse to exactly the Local Authorities they’re interested in using the Geo Selector. Developers can query our data directly, over the web (via SPARQL endpoint) and re-use it in their own applications as soon as it’s updated. The data is available for reuse in a wide variety of formats, including CSV, JSON and RDF.

Crucially, such an approach is also compliant with the UK Statistics Authority Code of Practice governing the release of UK official statistics. Principle 8, Frankness and Accessibility, states that official statistics should be “disseminated in forms that enable and encourage analysis and re-use.” OpenDataCommunities addresses this by offering users DCLG data in a variety of forms, including programmatic access.

We realise there is more we can do to make our data more accessible and reveal insight more readily. We hope that forthcoming improvements to OpenDataCommunities, such automatic mapping visualisations, will help. But as ever, we welcome your views on how we can improve further. Please leave your thoughts and ideas in the comments below, or email us at ODC@communities.gsi.gov.uk


Site re-launched with a new look and new features

OpenDataCommunities is all about helping you, the user, get the DCLG data you want, how you want, when you want. So over the last couple of months we’ve been working hard with our contractors Swirrl to redesign the site and add new features to help you do that more easily. Today we’re relaunching the site with the first fruits of our labour: a new look, a blog, a commenting feature and to come, automatic geographic visualisations of our data.

Screenshot of home page

First is the home page. We’ve added tabs at the top so you can quickly browse to the latest Data, News or Apps. We’ve added a blog too, which we’ll be using to update you with the latest developments, set out user guides to different aspects of the site and highlight interesting uses of our data via OpenDataCommunities. It’s also possible to browse to particular topics using tags.

We want to encourage your feedback on OpenDataCommunities and how well it meets your needs, so, secondly, we’ve introduced a commenting feature for each of our blog posts. Try it out at the bottom of this post!

Screenshot of Deprivation mapper

And thirdly, we’ve completely rebuilt the deprivation and wellbeing mappers using open street map data. This sets the stage for a particularly exciting development to come: visualisations that automatically map data. We know from OpenDataCommunities usage data how popular these mappers are with you all, so our hope is that introducing this feature will make our data more accessible, interesting and insightful still.

Finally, our Linked Data platform, powered by Swirrl's PublishMyData, has been upgraded to version 2, which includes many new features for data users. More on this soon.

Take a look for yourself and tell us what you think, either via the Disqus commenting feature below or by emailing us at ODC@communities.gsi.gov.uk.

Author
Peter Sheaf
Tagged
sticky