Can data save us from coronavirus?
Will data science save us from the pandemic?
Big data and machine learning — the twin engines behind the recent boom in artificial intelligence — have been touted in the tech world as technologies capable of delivering huge social benefit. Watching them being applied to a global health crisis unfolding in real time shows both their promise and their shortcomings.
Machine-learning systems employ a form of pattern recognition that is of broad use at a time like this, according to Fei-Fei Li, co-director of Stanford University’s Institute for Human-Centered AI. She was speaking at a hastily arranged online conference this week to consider the many ways that AI is being brought to bear.
Speeding the search for new drugs, anticipating where scarce medical supplies will be most needed, understanding the spread of the virus and predicting the effectiveness of different actions to slow it: these are the kinds of problems today’s deep-learning algorithms are meant to be able to handle.
What this crisis has shown is that making use of the technology is, at its heart, a data problem. The algorithms need to be trained on large amounts of information before they can be put to work. A novel coronavirus, by definition, is not something that has been seen before, which hampers their use. But there are also many institutional and technical barriers that make it hard to collect and apply information in the population-wide, real-time way that is needed.
The UK’s National Health Service, for instance, has just launched a project to break down the barriers between its various data silos so that it can track things like how many critical care beds it has available and the wait times in its accident and emergency departments. If a single national system has lacked this kind of insight into its own resources, the problem is magnified across a Balkanised, market-based health system like the American one.
It has also been difficult to generate a complete, real-time picture of the health of a population at large, and the factors likely to affect it. The US Centers for Disease Control only came up with a working model like this in the past two weeks, aggregating all the data it is able to collect down to the individual county and hospital level. Adding in other sources of information like travel patterns and local demographics helps to highlight where the epidemic might surface or hit hardest.
A natural institutional resistance, founded on protecting data privacy and security, has slowed initiatives like this in the past. But according to Ryan Tibshirani at Carnegie Mellon University, who monitors the incidence of flu-like symptoms across the US to predict when outbreaks will occur, the crisis has prompted data-sharing arrangements in the past two weeks that normally take years to agree. He singled out a Google survey, yielding 1m responses in a matter of days, as one such breakthrough.
It is too soon to tell whether these and many other attempts to break down the barriers and make existing bodies of information more useful are coming soon enough to have a marked impact on the fight against the threat from Covid-19. But they at least point to one silver lining from this pandemic. The new forms of data sharing they are forcing should provide a template for when the next health crisis hits, as well as better co-ordination inside and between healthcare systems to improve the quality of care even in normal times.
But they also highlight two big issues that reach far beyond the technology itself.
One is how far to push the data sharing. Once the peaks of the spreading infection have passed, societies around the world will be faced with the problem of spotting and quelling renewed outbreaks. That will require unprecedented forms of mass surveillance, as potential carriers of the virus are monitored and their personal contacts are traced to limit the spread. In the west, the debate around how to do this has barely begun.
The other issue concerns the uses to which the insights from data science are being put. AI alone is not enough: its effectiveness depends on how humans respond to the insights and recommendations that are produced by today’s probabilistic systems. Ultimately, it is down to today’s political leaders to summon the wisdom and moral determination to act in the best interests of their whole populations.