Better medicine through machine learning?
Data can be a dump or a gold mine
Internet entrepreneur Daniel Faggella argues that Big Data and machine learning could turn the vast data dump of current pharmacy and medicine into a gold mine generating $100 billion annually, through “better decision-making, optimized innovation, improved efficiency of research/clinical trials, and new tool creation for physicians, consumers, insurers, and regulators.”
According to a 2015 report issued by the Pharmaceutical Research and Manufacturers of America, more than 800 medicines and vaccines to treat cancer were in trial. In an interview with Bloomberg Technology, Knight Institute researcher Jeff Tyner stated that while this is exciting, it also presents the challenge of finding ways to work with all the resulting data. “That is where the idea of a biologist working with information scientists and computationalists is so important,” said Tyner.
In principle, sure. By the time an oncologist has located a clinical trial in which a challenging patient might participate, the patient’s clinical profile might have worsened, rendering her unsuitable.
But in today’s environment, there should be privacy concerns. Consider:
Google’s strength is data. So, maybe this is relevant: Last year DeepMind, an English artificial intelligence company that’s also part of Alphabet (so, Google) cut a deal with the National Health Service to share data in return for an app and AI brainpower to treat acute kidney injury. But the privacy details weren’t handled well. The company didn’t get full consent for data sharing, and it wasn’t clear how else the company might use the patient data. “At the moment it’s just, get in there, gain the first mover advantage, and build networks and knowledge about illness and disease,” says Julia Powles, a researcher at Cornell Tech who has written about DeepMind and NHS. “It’s Alphabet, the most powerful company in the world. They can afford to do that for a while, and that looks indistinguishable from the public interest.”
Adam Rogers, “That Google Spinoff’s Scary, Important, Invasive, Deep New Health Study” at Wired
There’s another issue as well. Faggella believes that,
Machine learning has several useful potential applications in helping shape and direct clinical trial research. Applying advanced predictive analytics in identifying candidates for clinical trials could draw on a much wider range of data than at present, including social media and doctor visits, for example, as well as genetic information when looking to target specific populations; this would result in smaller, quicker, and less expensive trials overall.
Daniel Faggella, “7 Applications of Machine Learning in Pharma and Medicine” at techemergence
But many problems with clinical trials fall outside the purview of machine learning. Two big ones are:
1) Replication studies (can we make it work a second time?) are in short supply. Many promising treatments are one-shot wonders. But replication studies, with whatever result, are hard to get funding for (risky but not glamorous) and difficult to publish.
2) Some research areas may be degenerating paradigms in science. Human nutrition is a good candidate. Studies rely mainly on an unreliable source, self-report. Although the monitoring proposed by Google for AI-assisted studies might help reduce that problem, the chance that study subjects will find ways to cheat the monitors when they are tempted to stray is pretty high. (Of course, nutrition studies involving caged lab animals don’t suffer from this problem, but their findings have not, by definition, been tested on humans.)
Both of these problems stem from moral choices—in the one case, the choice to avoid the consequences of bad news by avoiding the news and in the other case, to avoid the need for behaviour change by telling researchers what they want to hear.
In short, the biggest problem today isn’t the sheer mass of data so much as the difficulty of determining what it is worth. The answer lies, unfortunately, in the undone studies and the unreported events. Machine learning will be a much greater help when those problems are addressed.