Health Innovation Lab provides Tom Clarke opportunities to see Leeds open data in action

by Tom Clarke

I recently attended a ‘Health Innovation Lab’ run by Student Data Labs, a Leeds-based not-for-profit organisation that seeks to teach students data science skills by working on projects that have social value. As an organisation it is well placed geographically: Leeds City Council has a committed platform for hosting open data, which it hopes will generate value for local businesses and communities, and the Leeds-based Health and Social Care Information Centre serves as a repository for a wide range of health-related open data.

The day itself was divided into two parts: a ‘learning lab’ in the morning, followed by an ‘innovation lab’ in the afternoon. There was a great emphasis upon how to ‘do’ things, and I was impressed with how much information was conveyed, and how effectively, in a short space of time. In the later session, participants formed teams and put into practice the skills they had learned in the first half. The broad aim was to develop a tool to understand inequalities in type 2 diabetes prevalence in Leeds, and both sessions used open data towards this end.

I came away with several thoughts regarding the use of big and open data. As Lupton (2015) notes in her discussion of big data, there is much room for hubris, and the life-changing aspect of big data, in an everyday sense, is not immediately apparent to me. What is apparent, though, is that data science skills are in demand in the job market, and I was again impressed by how the event organisers signposted various job vacancies in which the skills we had learned on the day are sought.

In terms of expertise itself, it was interesting to reflect on the political nature of the event. A set of people who were neither clinicians nor epidemiologists, acting outside of a professional capacity and in their own time, were pooling expertise to tackle a wider social problem. The emphasis on openness, in terms of data but also software, draws comparisons to other forms of citizen science in which ‘laypersons’ conduct scientific research. As such, this use of open data, which could in principle occur anywhere, can be seen as a challenge to forms of expertise that operate in a physical location behind a paywall.

From another point of view, however, the use of open data can reinforce rather than disrupt existing spheres of expertise. This is especially the case when it comes to questions of ill-health: any diagnostic ambiguity in the lab or clinic is lost, to be replaced by a present/absent conception of illness. Nor is this limited to biomedical knowledge. Indices of deprivation, for example, are similarly the product of processes undertaken by professionals with particular forms of expertise. As such, though the idea of citizens analysing and exploring data generated in their local area could be considered prima facie a threat to traditional forms of knowledge, in practice such analysis is reliant upon them.

As Lupton (2015) points out, the narrative of open data as liberating and creating value is open to the charge of being overly simplistic. Openness does not necessarily mean transparency, and issues such as a lack of standardisation, missing data and mess can obscure the picture that the data presents. On the other hand, too much data can also be a problem; suspicious accounting on row 134,219 of 200,000, column M, is easily lost amongst the sheer mass of figures.

I have raised a few issues that I reflected upon after the event. Though they might seem critical, I must say that I enjoyed the event and I’m very much interested in learning more about the use of locally sourced big, open data. As an STS researcher, I’m interested in how research works in practice, and by directly observing how data scientists make sense of this data we can consider how these wider debates play out at a local level. On another level, it would be great to see what contribution sociologists can make to how this open data is published and used, and whether we can use available open data to contextualise our research in the local area.



