As part of its efforts to make better use of the overwhelming flow of information it collects, the National Security Agency has reportedly been supporting the development of new technology and data management techniques by funding grants awarded by the Advanced Research and Development Activity (ARDA). ARDA is an intelligence community organization whose mission is described as “to sponsor high-risk, high-payoff research designed to leverage leading edge technology to solve some of the most critical problems facing the Intelligence Community (IC).” ARDA’s research support is organized into various technology “thrusts” representing the most critical areas of development. One of ARDA’s current research thrusts is the Novel Intelligence from Massive Data (NIMD) program.
- Novel intelligence refers to “actionable information not previously known.”
- Massive data refers to data whose characteristics make it especially challenging for common data analysis tools and methods. These characteristics can include unusual volume, breadth (heterogeneity), and complexity.
Data sets that are one petabyte (one quadrillion bytes) or larger are considered to be “massive.” Smaller data sets that contain items in a wide variety of formats, or are very heterogeneous (e.g., unstructured text, spoken text, audio, video, graphs, diagrams, images, maps, equations, chemical formulas, and tables), can also be considered “massive.”
According to ARDA’s website (no longer available), “some intelligence data sources grow at a rate of four petabytes per month now, and the rate of growth is increasing.” With the continued proliferation of both the means and volume of electronic communications, it is expected that the need for more sophisticated tools will intensify. Whereas some observers once predicted that the NSA was in danger of becoming proverbially deaf due to the spreading use of encrypted communications, it appears that the NSA may now be at greater risk of being “drowned” in information.