It’s still unclear exactly how the National Security Agency (NSA) is carrying out digital surveillance on us. But we know one thing for sure: the government is collecting a whole lot of data. And privacy concerns aside, the real challenge for the NSA is not so much collecting that information but figuring out how to use it to help keep the country safe–and doing it against the clock while minimizing mistakes. “This is a big-data challenge,” says Viktor Mayer-Schönberger, the Oxford Internet Institute’s professor of Internet governance and regulation. “You have lots and lots of noise with a potential signal buried inside, but it’s hard to differentiate the two.”
With the national-security establishment still a black box–albeit a leaking one–it’s worth looking at how private companies are grappling with those big-data challenges. From logistics firms trying to keep millions of packages moving on schedule to airlines trying to predict delays, businesses are working to make sense of their own vast data sets. And they’re turning to specialized data-consulting firms that have the expertise to pull the signal from the noise through something called complex event processing. “There’s so much data flowing around now and a huge need to analyze it,” says Matt Quinn, the CTO of Silicon Valley enterprise-software firm the Information Bus Co. (TIBCO). “That’s led to companies like us developing the technology to take advantage of it.”
How does it work? Say you’re a financial firm looking to detect fraud. You may have as many as 300,000 transactions per second, each of which can be considered an event. And each of those events has countless data points that go along with it–the size of the transaction, its type, its location. Out of that overflowing stream of data, complex event processing tries to pull out recognizable patterns that can alert you to aberrations. And it has to happen in near real time–a fraud alert that goes off days after the theft will do little to prevent loss. “What you try to do is correlate those events into something larger,” says Quinn.
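To make the idea concrete, here is a minimal sketch of the kind of pattern matching described above, written in Python. It is an illustration only, not TIBCO's product or any firm's actual method. The `Transaction` fields, the `FraudDetector` class, the 60-second window and the rule itself (flag an account seen in more than one location within the window) are all assumptions chosen for the example, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One event in the stream (hypothetical fields for illustration)."""
    account: str
    amount: float
    location: str
    timestamp: float  # seconds since some reference point


class FraudDetector:
    """Correlates single events into a larger pattern over a sliding window.

    Assumed rule: alert when one account transacts in more than
    `max_locations` distinct locations within `window_seconds`.
    """

    def __init__(self, window_seconds=60.0, max_locations=1):
        self.window_seconds = window_seconds
        self.max_locations = max_locations
        # Recent events per account, oldest first.
        self._recent = defaultdict(deque)

    def process(self, tx):
        """Handle one event as it arrives; return an alert string or None."""
        window = self._recent[tx.account]
        window.append(tx)
        # Drop events that have fallen out of the time window.
        while tx.timestamp - window[0].timestamp > self.window_seconds:
            window.popleft()
        locations = {t.location for t in window}
        if len(locations) > self.max_locations:
            return (f"ALERT {tx.account}: {len(locations)} locations "
                    f"within {self.window_seconds:.0f}s")
        return None


detector = FraudDetector(window_seconds=60.0)
stream = [
    Transaction("A", 20.0, "Boston", 0.0),
    Transaction("A", 35.0, "Boston", 10.0),
    Transaction("A", 900.0, "Lagos", 30.0),  # second location within 60s
]
for tx in stream:
    alert = detector.process(tx)
    if alert:
        print(alert)
```

Each event is checked the moment it arrives and only a short window of recent events is kept per account, which is what lets an alert fire in near real time rather than days later. A production system would apply many such rules at once and at far higher volume.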
Sifting for patterns and trying to predict the future isn’t new for businesses or governments. What’s different is the sheer scale of the data that’s being collected in a connected world–and the computing power available to mine that information. In 2008, engineers at the NSA began developing Accumulo, a program that allows the agency to store and analyze vast amounts of data across thousands of computer servers. Exactly how much data is involved isn’t known, though the NSA is building a sprawling $2 billion computer center in Utah. Accumulo could help the NSA make real-time connections among the data points it collects–say, linking phone calls in the U.S. to terrorist chatter overseas. “There is definitely the technology in place that allows you to search that information very quickly and draw correlations about it,” says Joseph Turian, an analyst for GigaOM Research and president of the consulting firm MetaOptimize.
In fact, some of that NSA-derived technology has already made its way to the private sector. Sqrrl, an enterprise-software start-up in Cambridge, Mass., was founded by some of the same engineers who developed Accumulo for the NSA. The company is not connected to the NSA beyond those origins, but it’s now bringing that database power to civilians, including the ability to create massive indexes that can be searched in much the same way that the Web is. “The whole purpose we have is to bring more data sets together and to make it less expensive to build applications out of them,” says Ely Kahn, a co-founder of Sqrrl.
As databases continue to grow, big data will only get bigger. “The availability of the data leads to more tools to analyze it, and the availability of the tools leads to more collection of data,” says Chris Soghoian of the American Civil Liberties Union. “It’s an unpleasant circle.” And that creates one more big-data challenge, one that can’t be answered with algorithms: how all that data should be used.