Scotland’s Open Data, February 2019. An update.

Scotland’s provision of open data may be slowly improving, but it is a long way behind the rest of the UK. In my most recent trawl through websites and portals I found a few minor improvements, which are positive, but progress is too slow; some data providers are slipping backwards; and most others are still ignoring the issue altogether. Now is the time for the Scottish Government to act to fix this drag on the Scottish economy and society, and stop inhibiting innovation.

Latest review

Over the last week, I have conducted yet another trawl of Scottish Open Data websites and portals. I keep the results updated in this GitHub repo. I’ve carried out this research without assistance, in my own time. The review could be more comprehensive, frequent and robust if I were supported to do it.

This work builds on previous pieces of research I’ve carried out and articles that I have written. Recently, I’ve created an index of those blog posts here, as much for my own convenience in finding and linking to them as anything.

During this latest trawl, I’ve tried to better capture the wide spread of Scottish Government departments, agencies, non-departmental public bodies, health boards, local authorities, health and social care partnerships and academic institutions, and to assess each sector using quite conservative measures.

The output of that, as we will see below, does not paint a good picture of Scotland’s performance, although there are a few very good examples of people doing good work despite a clear policy gap.

Let us look at this sector by sector, following the list of findings here.

Local Authorities

Of Scotland’s 32 local authorities, only 19 produce open data of any kind.  This group uses a mixture of open data portals (10), web landing pages (7) and GIS systems (2). This leaves 13 who produce no open data whatsoever.

Those 19 councils (ignoring the other 13) produce a total of 731 datasets, giving a mean for the group of 38 and a median of 17 datasets. This total is only six more than I found three months ago, despite Dumfries and Galloway launching a new portal with 33 datasets!

Stagnation is also a real issue. For example, it is worth noting once again that while Edinburgh produces an impressive 234 open data sets, only five of those have been updated in the last six months, and 228 of them date from 2014-2017. While there is value in retaining historic data (allowing comparisons, trends etc. to be analysed), the value of data which is not being updated diminishes rapidly.

When I ran the OD programme for Aberdeen City Council (which, like all Scottish councils, is a unitary authority), I reckoned, based on some back-of-the-envelope calculations, that we could reasonably expect to have about 250 data sets. So, if each of the 32 did the same, as we would expect, then we’d have 8,000 datasets from local authorities alone. This puts the current figure of 731 into perspective.

Scottish Government

So far, I have found the following open data being produced:

  • 248 datasets on the excellent, and expanding, Statistics.Gov.Scot portal, covering a number of departments, agencies and NDPBs.
  • 54 datasets on the Scottish Natural Heritage portal, 53 of which are explicitly covered by OGL and one marked “free to use data.”
  • At least 43 OGL-licensed mapping layers on the Marine Scotland portal.
  • Just four geospatial datasets for download on the Spatial Hub.
  • Six linked open data sets, licensed under OGL, on the SEPA site.
  • Great interactive mapping of the Scottish Indices of Multiple Deprivation, for which the source data is included on the Statistics portal mentioned above.

That makes a total of 353 datasets. I’ve not tracked these numbers previously, so can’t say if they are rising, but there certainly appears to be good progress and some good quality work going on to make Scottish Government data available openly. This includes the four newly-opened sets of boundary data by the Spatial Hub, out of 33 data sets.

However, if we look at the breadth of agencies etc. that comprise the Scottish Government, it is clear that there are many gaps. In addition to the parent body of the Scottish Government there are a further 33 Directorates, 9 Agencies, and 92 Non-Departmental Public Bodies. That’s a total of 135 business units.

Let’s assume that they could each produce a conservative 80 data sets (and it is arguable that the figure should be considerably higher). We’d then expect 10,800 datasets to be released. Suddenly, 353 doesn’t seem that great.

Health

Scotland’s Health service is composed, in addition to the parent NHS Scotland body, of 14 Health Boards and 30 joint Health and Social Care Partnerships. That gives a total of 45 bodies.

Again, taking the same modest yardstick, of 80 open data sets for each, we would expect to see 3,600 data sets released.

What I found was 26 data sets on the new NHS Scotland open data portal. This is a great, high-quality resource, and I know from conversations with those behind it that they are strongly committed to expanding the range of data it provides.

However, given our yardstick above, we are still 3,574 data sets short on Scottish Health data.

Higher and Further education

Scotland’s HE / FE landscape comprises 35 universities and colleges.

Glasgow and Edinburgh Universities each have an open data publication mechanism for data arising out of a business operation, which contain interesting and useful data.

Despite that, there is no operational, statistical or other open data being created by any universities or colleges that I could identify. Again, using the same measure as above, that produces a deficit of (80 x 35) or 2,800 datasets.

Supply versus expectation

If we accept for the moment that the approximate number of data sets that we might expect in the Scottish public sector is as set out above, and that the current provision is, or is close to, what I have found in this trawl, then what is the overall picture?

Sector               Published  Expected  Deficit
Local Government           731     8,000    7,269
Scottish Government        353    10,800   10,447
Health                      26     3,600    3,574
FE / HE                      0     2,800    2,800
Totals                   1,110    25,200   24,090

Table 1: Supply versus expectation of Scottish public sector Open Data

As we can see from the table above, it appears that the Scottish public sector is currently publishing 1,110 of 25,200 expected open data sets. This is just 4.4%. So, by those calculations, more than 95% of data that we might reasonably expect to see published as Open Data is not being released.
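For anyone who wants to check or challenge the arithmetic, the figures in Table 1 can be reproduced with a few lines of Python. This is only a rough sketch: the counts of bodies, the per-body yardsticks and the published totals are the ones quoted in the sections above, while the names `SECTORS` and `supply_versus_expectation` are purely illustrative.

```python
# Figures quoted in the sections above:
# (number of bodies, datasets expected per body, datasets actually found).
SECTORS = {
    "Local Government": (32, 250, 731),
    "Scottish Government": (135, 80, 353),
    "Health": (45, 80, 26),
    "FE / HE": (35, 80, 0),
}


def supply_versus_expectation(sectors):
    """Return rows of (name, published, expected, deficit), ending with a totals row."""
    rows = []
    for name, (bodies, per_body, published) in sectors.items():
        expected = bodies * per_body
        rows.append((name, published, expected, expected - published))
    total_published = sum(row[1] for row in rows)
    total_expected = sum(row[2] for row in rows)
    rows.append(("Totals", total_published, total_expected,
                 total_expected - total_published))
    return rows


rows = supply_versus_expectation(SECTORS)
for name, published, expected, deficit in rows:
    print(f"{name:<20} {published:>9,} {expected:>9,} {deficit:>9,}")

# Share of the expected total that is actually published.
_, published, expected, _ = rows[-1]
share_published = 100 * published / expected
print(f"Published: {share_published:.1f}% of expected")
```

Changing the yardsticks (80 or 250 data sets per body) in `SECTORS` shows how sensitive the headline percentage is to those assumptions.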

Scotland is behind the UK generally

Whether you agree with the exact figures or not, and I am open to challenge and discussion, it is clear that we are failing to produce the data that is badly needed to stimulate innovation and deliver the economic and social benefits that we expected when we set out to deliver open data for Scotland.

I’ve long argued that in terms of the UK’s performance in Open Data league tables, such as the Open Data Barometer, Scotland is a drag on the UK’s performance, with Scotland’s meagre output falling well short of the rest of the UK’s Open Data. In addition to existing approaches, we should see Scotland’s OD assessed separately, using the same methodology, in order to be able to compare Scotland with the UK as a whole. That would allow us to measure Scotland’s performance on a like-for-like basis, identify shortfalls and target remedial action where needed.

Policy underpinning

I have argued previously that a significant issue which stops the Scottish public sector getting behind open data is the lack of public policy to make it happen, as well as an ignorance, or denial, of the potential economic and social benefits that it would bring. While I was part of the group who wrote the Scottish Government’s 2015 Open Data Strategy, it was, in its final form, toothless and not underpinned by policy.

We now have an Open Government Action Plan for Scotland 2018-2020 (PDF). This is a great step forward but unfortunately it is almost entirely silent on Open Data, as pointed out in my response to the draft in November 2018.

Even when Open Data does make an appearance, on page 19, it is in relation to a broader topic rather than forming actions on its own merits. The position is similar in the plan’s detailed commitments. This is not to denigrate the work that has gone into these, and the early positive engagement between Scottish Government and civic groups, but this is a huge missed opportunity – and we should not have to wait until 2020 to rectify it.

At this point, it is worth contrasting this with the Welsh Government’s Open Government plan 2016-2018 which was reviewed recently (PDF). In that plan, Open Data was the entire focus of the first two sections, and covered pages 4 to 6 of the plan. This was no afterthought: it was a significant driver and a central plank of their open government plan.

The broader community

Scotland still lacks a developed Open Data community. This will come in time as data is made more widely available, is more usable and useful – and also through the engagement with the Open Government process  – but we all need to work to develop that and accelerate the process. I set out suggestions for this in a previous post.

There are significant opportunities to grow the use of open data through the opening of private sector and community-generated and -curated data.

The universities and colleges in Scotland should be adopting open data in their curriculum, raising awareness among students, creating entrepreneurs who can establish businesses on the back of open data.

Schools should be using open data to get their classes involved: using it to explain their environment, climate, and transport system; to understand local demographics, the distribution of local government spending, or comparative attainment of schools.

Government should be  developing the curriculum to use open data to foster a better understanding of data and how it underpins modern society.

There are some positive things going on: the roadshows that the Scottish Government are doing, as well as other Data Fest Fringe events; the regular data hack weekends we’ve been doing in Aberdeen under the Code The City banner; and the major long-term project to build and deploy community-hosted air quality monitoring sensors which provide open data for the local community. These need to become the norm – and to be happening across the country.

Organisations such as The Data Lab, Censis and other innovation centres have a great opportunity here to advance their work, whether in education, community building or fostering innovation, and to support this to achieve their organisational missions.

Bringing people together

Having earlier created a Twitter account for a nascent Scottish Open Data Action Group (@Soda_group), I have reconsidered that. Instead of an action group to pressure, shame or coerce the Scottish Government into action, what we need is a common group that has the Scottish Government onside – and everyone works together. So I have renamed it @opendata_sco. It already has 179 followers and I hope that we can grow that quickly, and use that to generate more interest and engagement.

I have also launched a new open Slack channel for Open Data Scotland, so that a community can better communicate with one another.

Please join, using this form.

As I have said previously this isn’t a them-and-us, supply-and-demand relationship. We’re all in it together, and the better we collaborate as a community the better, and quicker, society as a whole benefits from it.

========================================

Header photo by Andrew Amistad on Unsplash

Response to Scotland’s Draft Action Plan on Open Government

The Scottish Government published its draft action plan on 14th November 2018. You can find it here. They are seeking feedback before the 27th November 2018.

Here is my feedback which I sent on 25th November.


Thank you for the chance to feed back on the drafts of the Scottish Open Government Action Plan and Commitments.

These documents are welcome and while they certainly set a path for moving Scotland further in the right direction in terms of openness and transparency, we should remember that those should not be our only aims. We need to ensure that we also address the need to use data and information to fuel innovation, and deliver societal and economic benefits for Scotland.

I have set out below my observations and suggestions in a number of areas which range from the general to the specific.

The public good

Data and information held by the Scottish Government and the public sector should be considered a Public Good. See https://www.nic.org.uk/wp-content/uploads/Data-for-the-Public-Good-NIC-Report.pdf and https://www.gov.uk/government/publications/data-for-the-public-good-government-response/government-response-to-data-for-the-public-good.

To deliver that public good requires freeing up information and data as a matter of course, rather than by exception.

There is one simple thing that could be done with immediate impact, and minimal effort, to free up large amounts of data and information for public re-use: adopt an Open Government Licence (OGL) for all published website information and data on the Scottish Government’s website(s), and other public sector sites, the only exception being where this cannot legally be done, as would be the case when personal data is involved.

The Scottish Information Commissioner’s own website (http://www.itspublicknowledge.info/home/TermsAndConditions.aspx) takes this approach: “Where the Commissioner is the copyright holder, information is available through the Open Government Licence. This means you have a worldwide, royalty-free, perpetual, non-exclusive licence to use the information, subject to important conditions set out in the licence.”

At present, websites operated by Scottish Government, local authorities, health boards etc. all appear to have blanket copyright statements. I certainly could find no exception to that. With OGL-licensed content, where data is not yet available as Open Data (OD), a page published as HTML could be legitimately scraped and transformed to open data by third parties, as the licence would permit that. Currently, pages such as this list of planning applications, https://publicaccess.aberdeencity.gov.uk/online-applications/simpleSearchResults.do?action=firstPage, contain valuable data but are caught by default, site-wide copyright statements.

Of course, in reality citizens, companies, universities and organisations do scrape website content, but it is done under the radar. This approach results in repeated scraping as the results are not published as open data, and there is consequently limited public benefit. Switching the licensing model to OGL by default, and copyright by exception,  would solve this and encourage both innovation and engagement: moving a supplier / consumer relationship to one where data and information are a shared public good.
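To illustrate how little effort that transformation takes once the licence permits it, here is a minimal sketch that turns an HTML table into CSV using only the Python standard library. The sample markup, the reference numbers and the names `TableParser` and `html_table_to_csv` are all invented for illustration; real pages will be messier and would need their own handling.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup standing in for a published list of planning applications.
SAMPLE_HTML = """
<table>
  <tr><th>Reference</th><th>Address</th><th>Status</th></tr>
  <tr><td>190001/DPP</td><td>1 Example Street, Aberdeen</td><td>Pending</td></tr>
  <tr><td>190002/DPP</td><td>2 Example Road, Aberdeen</td><td>Approved</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


csv_text = html_table_to_csv(SAMPLE_HTML)
print(csv_text)
```

With an OGL licence in place, output like this could be republished once for everyone, rather than each interested party scraping the same page separately.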

The Scottish Government should mandate this approach not just for the whole of the public sector but also for companies performing contracts on behalf of Government, or who are in receipt of public funding or subsidy.

Targets for publishing

The Scottish Government’s own Open Data Strategy 2015 commits it to publishing data openly but, despite my efforts and those of other contributors to it, the strategy mostly lacks hard targets, and sets overly-modest goals: “The ambition is for all data by 2017 to be published in a format of 3* or above.” One could ask if all of Scottish Government’s data was actually published to 3* standard by the end of 2017. If not, how much? Who knows – is this even measured, reported on or published?

Therefore, any new action plan should have harder, more specific targets. It is arguable that the lack of these, and of a clear Open Data Policy for Government, as I called for in 2015, allows overly-pressed civil servants to have much less focus on publishing open data than is needed, resulting in inadequate resources being applied to that. So, ideally this action plan should be underpinned by policy for the whole of the Scottish public sector to ensure that effort and resource can be targeted on publication.
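Measuring progress against a target like this need not be onerous. As a rough sketch, assuming a catalogue that records the format of each data set, the share at 3* or above could be calculated as follows. The catalogue entries are invented, and the format-to-star mapping is my own simplification of the five-star scheme, which presumes every data set is already openly licensed.

```python
# Assumed mapping from file format to star rating, loosely following the
# five-star open data scheme; it presumes each data set is openly licensed.
FORMAT_STARS = {"pdf": 1, "xls": 2, "csv": 3}

# Hypothetical catalogue entries, invented for illustration.
CATALOGUE = [
    {"title": "Committee minutes", "format": "pdf"},
    {"title": "Supplier payments", "format": "xls"},
    {"title": "School rolls", "format": "csv"},
    {"title": "Air quality readings", "format": "csv"},
]


def share_at_or_above(catalogue, threshold=3):
    """Return the percentage of data sets rated at `threshold` stars or higher."""
    if not catalogue:
        return 0.0
    meeting = sum(
        1 for dataset in catalogue
        # Unknown formats count as zero stars rather than raising an error.
        if FORMAT_STARS.get(dataset["format"].lower(), 0) >= threshold
    )
    return 100 * meeting / len(catalogue)


print(f"{share_at_or_above(CATALOGUE):.0f}% of data sets are 3* or above")
```

A figure like this, reported regularly, would make it possible to say whether a target had been met.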

To support this, the public benefits of open data publishing, both in social and economic terms, should be made clear to all data publishers.

Every FOI request should be assessed on receipt, identifying whether it is for data or whether data publishing would satisfy that and future similar requests. If so, the data set should be set for publication as OD with regular periodic updates.

Statutory obligations

I looked for, but could not see, in the action plan and other documents, an acknowledgement of the current statutory obligations on the Scottish Government in this area. Recognising, noting and commenting on these in the document would be a useful reminder of specific existing obligations but would also strengthen broader arguments for OD. The following list is not exhaustive.

There are obligations under the G8 Charter on Open Data https://www.gov.uk/government/publications/open-data-charter.

Further, there are existing clear obligations under The Re-use of Public Sector Information Regulations (2015) https://www.legislation.gov.uk/uksi/2015/1415/contents. There is a handy guide here:

http://www.nationalarchives.gov.uk/documents/information-management/psi–guidance-for-public-sector-bodies.pdf (see pages 22 onwards in particular).

Where specific legislation mandates open publication then this should be made clear, as is the case, for example, under the Public Services Reform (Scotland) Act 2010, if only to avoid this type of headline: https://www.heraldscotland.com/news/17238918.snp-ministers-missing-their-own-transparency-target/

Another example is the OECD’s “Compendium of good practices on the publication and reuse of open data for Anti-corruption across G20 countries: Towards data-driven public sector integrity and civic auditing”.

https://www.oecd.org/gov/digital-government/g20-oecd-compendium.pdf

Recommendations and best practice

There are many resources available online which demonstrate best practices which Scotland’s public sector should adopt in order to deliver the aims of the action plan. Again, these should be mandated for adoption in the action plan. Some examples follow.

Discoverability

A key part of publishing information and data openly is discoverability. To do this well means understanding and applying best practices. Having standard identifiers, descriptors, taxonomies etc. will aid discoverability.  So, all information and data publishing should use best practice, using the correct metadata and appropriate standards such as DCAT / DCAT-AP / DCAT.json.
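As an illustration of what such metadata can look like, the sketch below builds a DCAT-style description of a data set as JSON-LD. The property names come from the DCAT, Dublin Core Terms and FOAF vocabularies, and the licence URL is that of OGL version 3. The function name `dcat_dataset`, the example data set and the example.org URL are hypothetical, and a real publisher would follow whichever DCAT profile they have adopted.

```python
import json


def dcat_dataset(title, description, publisher, licence_url,
                 download_url, media_type_url, keywords, modified):
    """Build a minimal DCAT-style data set description as a JSON-LD dictionary."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
            "foaf": "http://xmlns.com/foaf/0.1/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dct:publisher": {"@type": "foaf:Agent", "foaf:name": publisher},
        "dct:license": {"@id": licence_url},
        "dct:modified": modified,
        "dcat:keyword": keywords,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:downloadURL": {"@id": download_url},
                "dcat:mediaType": {"@id": media_type_url},
            }
        ],
    }


# Hypothetical data set; the download URL is a placeholder.
record = dcat_dataset(
    title="Planning applications",
    description="Planning applications received, updated weekly.",
    publisher="Example Council",
    licence_url="http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/",
    download_url="https://example.org/data/planning-applications.csv",
    media_type_url="https://www.iana.org/assignments/media-types/text/csv",
    keywords=["planning", "applications"],
    modified="2018-11-25",
)
print(json.dumps(record, indent=2))
```

Publishing a record like this alongside each data set is what allows portals and search engines to find, describe and aggregate it consistently.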

There are some useful resources to assist in this such as

The Scottish Government has an internal expert on this, who sits on the international standards board. It is imperative that his input is sought, and implemented rigorously, in terms of this application of standards.

Data as infrastructure

We should acknowledge the concept of data as infrastructure. See https://www.nic.org.uk/wp-content/uploads/Data-As-Infrastructure.pdf and https://theodi.org/topic/data-infrastructure/. Publishing to the best of our ability, based on standards and best practice, will allow new products and services to be developed for societal and economic benefit, and support innovation.

Reference Data

By using standard identifiers for things, such as UPRNs for properties, USRNs for roads and so on, data from multiple government sources can be aggregated about that object, and we can link items with certainty. If the identifiers are then made public, external data, such as that from the private sector, can be amalgamated. There must be a concerted effort to make these identifiers public and re-usable. Instead of what appears to be a starting position of “we can’t do this because of x” we must shift to “how can we do this and how can we sweep away barriers?” Where no identifiers exist for a specific domain, but it is identified that there would be benefit from having them, these should be created.
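The linking itself is straightforward once a shared identifier exists. The sketch below joins records from two sources on a common UPRN. The records, the UPRN values and the name `link_by_identifier` are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical records from two separate sources; the UPRN values are invented.
COUNCIL_TAX = [
    {"uprn": "900000000001", "band": "C"},
    {"uprn": "900000000002", "band": "E"},
]
ENERGY_CERTIFICATES = [
    {"uprn": "900000000001", "rating": "D"},
]


def link_by_identifier(key, **sources):
    """Group records from each named source under their shared identifier."""
    linked = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            # Keep every field except the identifier itself.
            attributes = {k: v for k, v in record.items() if k != key}
            linked[record[key]][source_name] = attributes
    return dict(linked)


linked = link_by_identifier(
    "uprn", council_tax=COUNCIL_TAX, energy=ENERGY_CERTIFICATES
)
for uprn, details in linked.items():
    print(uprn, details)
```

Without a public identifier, the same join would depend on matching free-text addresses, which is exactly where certainty is lost.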

General approach to open data

Open Data is not a separate thing or process. The curation, management and publication of data is a continuum starting with the internal processes of the organisation. OD should be seen as the natural end point for all data where it is appropriate to publish openly. By adopting an open data by default approach, as outlined at https://en.wikipedia.org/wiki/Open_by_default, effort is expended on publishing, not on finding a reason or way to publish: data will be published as OD unless there are specific legal reasons why it can’t be. There are additional benefits to this, including improvements in data quality, de-duplication and re-use of data internally by other departments or services.

Further, the draft action plan focuses on statistical data. While publishing statistical data openly is welcome, it needs to be recognised that the scope must be much wider: encompassing all branches of the Scottish Government, its directorates, its NDPBs, and other agencies. SG also needs to act as a leader to health boards, local authorities, and to joint health and social care partnerships, and work with others such as the Scottish Cities Alliance where work is ongoing.

We need to open up reference data, geographical boundaries, transactional data, financial data, in fact anything that need not be closed by default.

National portal

Scotland lacks a national open data portal. While this is not a necessity, in order to aid discovery, it would be an advantage, particularly when we have a growing number of existing places where data is being published across Scotland. Many other countries have national portals (https://www.opendatasoft.com/a-comprehensive-list-of-all-open-data-portals-around-the-world/) and some, such as Austria, have had a federation model of publishing at various levels of government in place for many years. If we get discoverability right, and tools such as Google’s data search engine (https://www.google.com/publicdata/directory) begin to mature, this may be less of an issue.

Geospatial commission

Both the recently-formed Geospatial Commission and the rapidly changing stance of Ordnance Survey are going to have an impact on what we can publish, with barriers being removed. This increased liberalism will mean that data which we could not publish three months ago will suddenly be publishable. The Scottish Government needs to be on top of that, acting on it to push out data as soon as it can. Beyond that, it should be routinely pushing OS on issues such as derived data to ensure that barriers to publishing are actively removed. Similarly, if reference data is opened up at a UK level, then the Scottish portion of that data needs to be highlighted by the Scottish Government.

Community Building

The action plan must include commitments to work with the Open Data community in Scotland. It is smaller than it should be since there has been relatively little data of value to work with up to now. Contrast that with the position of Transport for London, one single organisation, whose open data as far back as 2013 was reported to be responsible for 5,000 developer jobs and 500 apps. The Scottish Government needs to grow the OD community and develop it by being an active part of it: to actively seek input on what data sets would be most useful; to use the community as a sounding board; and to gain the trust and support of the community by empowering its members to be infomediaries who will build and develop products and services which enable citizens to use the data produced, and make sense of it.

Supporting education

Finally, the publication of open data needs to be seen as an educational resource too. Data should be available for use by schools, colleges and universities. Curricular development should encompass the use of open data. Outreach should work with teachers and lecturers so that children can understand their locality by using data pertinent to them. Honours-year and post-grad students in computing sciences should use open data in their projects. Innovation and entrepreneurship courses should encourage the use of public data. Journalism courses should teach data journalism, and so on.

Ian Watt
