For the early part of its existence, IBM’s Watson supercomputer was a bit of a carnival act. It could perform feats of computational magic, win on Jeopardy, and whip up crazy burrito recipes at SXSW. But Watson is designed to become IBM’s money-making Big Data platform, earning its keep across a variety of industries. In New York, the company announced that a Watson-enabled group of researchers was able to speed up the process of discovery and uncover new targets for cancer research.
“We’re moving from a time where Watson helps answer questions to one where it tackles the questions that don’t have answers,” says IBM vice president John Gordon, Watson’s boss.
A Baylor College of Medicine team has identified six new proteins to target for cancer research using KnIT (Knowledge Integration Toolkit), a Watson app developed with Baylor that reads and analyzes millions of scientific papers and suggests to researchers where to look and what to look for. How hard is that? Very. In the last 30 years, scientists have uncovered 28 protein targets, according to IBM, or roughly one a year. The Baylor team found half a dozen in a month.
More than 50 million research papers have been published, and that number is doubling every three years. “Not only are our databases growing; they are growing faster than we can interpret all the data that they contain,” says Dr. Olivier Lichtarge, a computational biologist and professor of molecular and human genetics at Baylor Med who is one of KnIT’s developers.
Lichtarge and colleagues used KnIT to read 23 million MedLine papers, including 70,000 studies on a protein called p53, which is a tumor suppressor. The p53 protein is associated with half of all cancers. They also looked at other proteins called kinases—there are more than 500 of them in humans—that act as switches in turning p53 off and on. In cancer, mutations cause the switching function to go haywire, which lets cancer cells run amok. Using the KnIT analytics, the team was able to identify six previously unknown kinases that affect the p53 protein.
It sounds like Google for scientists—which already exists—but Watson’s calling card is its natural language and cognitive abilities. The program doesn’t just sift through the literature and spit out the search matches—it interprets the papers, looking for previously unseen connections involving proteins, drugs and molecular mechanics. Then it builds a graphic analysis to help the researchers see those connections. “You are not looking for an answer,” says Gordon. “You are looking for a chain across the papers. If we were playing pool: you would see all the direct shots. What would be less obvious are the combinations.”
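The "chain across the papers" idea can be illustrated with a toy example. What follows is only a minimal sketch, not IBM's or Baylor's actual method: it assumes each paper has already been reduced to a list of the entities it mentions, and every name in it (`build_cooccurrence_graph`, `indirect_links`, `kinase_A`, `drug_X`) is hypothetical. It links entities that appear in the same paper, then looks for entities that no paper ties directly to p53 but that are reachable through an intermediate, the "combination shots" in Gordon's pool analogy.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(papers):
    """Link every pair of entities mentioned together in the same paper."""
    graph = defaultdict(set)
    for entities in papers:
        for a, b in combinations(sorted(set(entities)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def indirect_links(graph, target):
    """Find entities two hops from `target` with no direct link to it.

    Returns a dict mapping each such entity to the sorted list of
    intermediates that bridge it to the target.
    """
    direct = graph.get(target, set())
    bridges = defaultdict(set)
    for mid in direct:
        for other in graph[mid]:
            if other != target and other not in direct:
                bridges[other].add(mid)
    return {entity: sorted(mids) for entity, mids in bridges.items()}


# Hypothetical toy corpus: each inner list is one paper's entity mentions.
papers = [
    ["p53", "kinase_A"],
    ["kinase_A", "kinase_B"],
    ["p53", "drug_X"],
    ["drug_X", "kinase_B"],
    ["kinase_C", "kinase_D"],
]

graph = build_cooccurrence_graph(papers)
candidates = indirect_links(graph, "p53")
print(candidates)  # {'kinase_B': ['drug_X', 'kinase_A']}
```

In this toy corpus no single paper mentions p53 and `kinase_B` together, yet two separate chains connect them, which is the kind of connection a plain keyword search would not surface. A real system would also have to extract the entities from natural-language text and weigh the strength of each link, which this sketch leaves out.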
At the end of the data-mining and analyses, Watson generates hypotheses for the scientists to consider, along with the probabilities that it has picked the right targets.
To test the process, researchers cut Watson’s reading material off at 2003, and then asked it to suggest protein targets to investigate. It came up with nine. Over the next decade, seven of them were actually discovered.
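The retrospective test described above amounts to a simple hit-rate calculation. The sketch below is illustrative only: the article does not name the nine proteins, so the candidate names are placeholders, and `retrospective_hit_rate` is a hypothetical helper rather than anything from KnIT.

```python
def retrospective_hit_rate(predicted, confirmed_later):
    """Compare pre-cutoff predictions with discoveries made after the cutoff.

    Returns (hits, total_predictions, fraction_confirmed).
    """
    predicted = set(predicted)
    hits = predicted & set(confirmed_later)
    return len(hits), len(predicted), len(hits) / len(predicted)


# Placeholder names: nine suggestions made from literature up to 2003,
# seven of which were confirmed over the following decade.
predicted = [f"candidate_{i}" for i in range(1, 10)]
confirmed_later = predicted[:7]

hits, total, rate = retrospective_hit_rate(predicted, confirmed_later)
print(f"{hits} of {total} confirmed ({rate:.0%})")  # 7 of 9 confirmed (78%)
```

Seven confirmed out of nine suggested works out to about 78 percent. The figure says nothing about targets the system failed to suggest, so it measures how often the suggestions were right, not how complete they were.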
For IBM, it’s a kind of road test of Watson Discovery Advisor, a cloud-based service that the company is launching. The target: the roughly $600 billion that large corporations spend annually on research and development. IBM sees thousands of applications in everything from finance, engineering and science to law enforcement, basically any place where data is piling up faster than humans can absorb it. Other companies are doing likewise, of course, as Big Data has the potential to set off another wave of expansion in cloud services.
Watson has its limits. It isn’t going to do the scientists’ homework, the nuts and bolts of research; nor is it going to replace scientific intuition. “Let me be clear that nothing replaces good critical reading, in depth, by a specialist of a research paper,” says Lichtarge. “It doesn’t tell the scientist what to do: it suggests possibility,” he adds. Watson may be recommending bank shots in the game of medical research, but the scientists are still going to have to make them.