Under scrutiny: questions hang over how data are shared in the name of science and by whom

When Julie Parker was invited to join the advisory board of Insight, an organisation that oversees a trove of NHS eye scans and other images for research, she accepted without hesitation. The retired insurance loss adjuster is living with macular degeneration, which robs sufferers of their central vision — giving her a special stake in the kind of breakthroughs these data might yield.

Parker’s unpaid role points to a new model of data stewardship that is designed to allay privacy concerns. These have heightened in recent years as the role of tech giants in aggregating and profiting from personal data has come under increasing scrutiny.

Parker is part of a team that reviews applications for the eye-scan data set from NHS, academic and industry researchers. The team acts as an intermediary, judging requests against published criteria, including whether each application poses a significant risk to individuals’ privacy and if the outcome strikes a “balance between public good, scientific discovery and value generation”.

In this way, Insight acts as a “data trust” — and the use of such trusts has been singled out by The Lancet and Financial Times Commission on Governing Health Futures 2030 as a promising model for the future. There are, however, questions about how, and by whom, key decisions on data sharing are made — and whether a model that originated in the wealthy west is appropriate for the developing world.


Insight is one of nine data hubs funded through the UK Health Data Research Alliance (HDR UK), an independent grouping of healthcare and research organisations that is seeking to establish best practice for the ethical use of UK health data. David Seymour, executive director, says the positives and the risks around data use have been “brought into sharper focus and amplified” by Covid-19. The pandemic has not only expanded the amount of information available but has also led to quicker and larger-scale sharing of data under the exigencies of the public health emergency.

The chief risk, Seymour suggests, relates to “public perception and understanding . . . because when decisions are made quickly, perhaps they’re not always communicated as transparently and/or the sort of public involvement in those decision-making processes isn’t always as strong as it should be”.

The data trust model, he says, can offer a necessary layer of reassurance to the public, “not only about who gets access but under what terms that access is granted”. Seymour sees data trusts as connected to a wider approach known as trusted research environments. HDR UK recently defined principles and best practices for these “data safe havens”, which give researchers a single location to access valuable data sets, “similar to a secure reference library”.

Parker says she and her colleagues “ask a lot of questions” about each request they receive from research teams. While they have yet to turn down a request outright, a couple have been returned for further information. “We don’t reject out of hand, because that’s wrong. But we do want to be reassured [that the data will be safe]”, she says.

Jack Hardinges, programme lead for data institutions at the Open Data Institute (ODI), a non-profit that advocates an “open, trustworthy data ecosystem”, says it is important not to delineate the data trust concept too narrowly. Hardinges, who is responsible for the ODI’s work on data stewardship, suggests it has come to be defined in a way that is “specific and niche, in that it’s about creating a particular type of trustee relationship and trustees manage data on behalf of a group of individuals”.


He notes that other approaches to data stewardship are also emerging, such as the one espoused by Open Humans, a US-based organisation that allows individuals to donate data from wearables such as Fitbits, or medical records, “and . . . ensure that it’s used for research into a particular condition or cause. It’s about bottom-up empowerment of individuals to exert control over data about themselves rather than deferring that control to someone else.”

Hardinges adds that for data trusts: “Who is doing the trusteeship around the data is important. We shouldn’t inherently trust it just because it’s called a data trust.”

Such caveats may be even more pertinent in the developing world. Amandeep Gill, one of the moving spirits behind the International Digital Health & AI Research Collaborative (I-Dair) — which is developing a global platform “to enable inclusive, impactful and responsible research into digital health and AI for health” — says the main question is: “How is the thing being governed and on whose behalf?”

In Africa and Asia, there are concerns that data may be handed to western researchers with no clear route for the people who generated the information to benefit. Gill has seen such sensitivities heighten over the past few years. “There’s a risk that this whole conversation about data trusts turns into: ‘Give us your data and we will solve your problems for you,’” he says. “And there might be a sort of neocolonial tinge to it.”

The resulting backlash risks fuelling “a form of data localisation [or] data nationalism”, Gill adds. To avoid this, I-Dair is pursuing a “distributed, decentralised approach — almost like confederating data assets”.

An example of I-Dair’s work involves nationally held data sets covering antimicrobial resistance. Authorities in Singapore and in India, for example, have retained sovereignty over their data but have agreed to share them for research after mutually defining the problem the data are intended to solve and jointly working on an algorithm to analyse them. “The AI that’s developed is also done collaboratively, so that you can trust it,” Gill says.

Like Hardinges, Gill highlights a model in which citizens come together to generate the data needed for a goal to which they themselves subscribe. An example of this approach in Europe is Midata, based in Switzerland.

Dominik Steiger, a member of Midata’s management committee, describes his organisation as “a data trust organised as a co-operative”. The idea that citizens or patients should have a say in how their data are used is rooted in the idea that personal data are a resource or an asset. “And that the expectation, or the rights, of people to . . . decide what happens with their data is something that has to be built into the data ecosystem”, Steiger says.

Some 20,000 people have shared their data with Midata, although not all will choose to participate in every project. In one example, people were given an app to record pollen allergy symptoms. “This is citizen science and the data that people generate will belong to them . . . and then they consent to this being used in an anonymous fashion at an allergy study,” Steiger says. He suggests this model can offer a distinctively European approach to data stewardship as an alternative to the behaviour of the US tech giants.

In Europe, he adds, “there is a strong move towards seeking solutions that are more trustworthy, which better represent, or enable participation by, individuals. We have one answer, which fulfils these criteria, and we hope it gives inspiration for such models.”

Copyright The Financial Times Limited 2022. All rights reserved.