Solving the mystery of PRISM

(Image credit: REUTERS/Pawel Kopczynski)

last updated 11 January 2015

What exactly is PRISM? How does it work? Who uses it?

Let's assume that the companies whose data is sucked in by a National Security Agency tool called PRISM are denying their knowledge of the word and its associations in good faith. And let us also accept their denials that they've given someone at the NSA "direct access" to their servers.

There are many types of nicknames and special words that the NSA uses.

Some refer to collection tools. Some refer to data processing tools.

Each data processing tool, collection platform, mission and source for raw intelligence is given a specific alphanumeric signals activity/address designator, or SIGAD. The NSA listening post at Osan in Korea has the SIGAD USA-31. Clark Air Force Base is USA-57.

PRISM is US-984XN.

Each SIGAD is basically a collection site, physical or virtual; the SIGAD alphanumerics are used to indicate the source of intelligence FOR a particular report.

The NSA often assigns classified code names to the product of SIGADs. These can be confused with the nicknames or proper names of the collection platforms themselves, which may or may not be classified. What PRISM does is classified; the fact that there is a "PRISM" tool that does something is not.

Analysts working on a problem can request that a particular collection site be tasked, or used. The form they fill out is known as an SP0200. Additionally, when they wish to discontinue using a SIGAD for a mission, they send in another SP0200.

To make things even more complicated, the NSA assigns every administrative and technical operation, location and cell its own alphanumeric designation. The NSA office that tasks and troubleshoots the SENIOR SPAN platform, attached to U-2 spy planes, is known as G112. The agency's Special Collection Service, which operates out of embassies, is F6.

Other NSA nicknames refer to databases. "Marina" is a database for metadata collected from telephone records. Most database names are not classified, but their association with a particular technology or a dataset is classified.

That is, Marina=telephone metadata — classified. Marina by itself ... unclassified.

I think, but don't know, that the Verizon metadata contained in the FISC order we saw goes into the Marina database.

On top of this, for especially sensitive programs, like those involving analysis and collection of domestic telephone or email metadata, or those involving offensive cyberwarfare, the NSA creates "special access programs" that are identified by a code word, an unclassified nickname, and a digraph. Both the existence of these SAPs and their code words are classified TOP SECRET. Sometimes, small NSA collection cells access particularly sensitive or advanced collection platforms, like, say, tiny flying bugs. These technologies are not shared with every NSA collection cell; the technologies themselves are classified. (I don't know if the NSA actually uses tiny flying bugs.)

So: An analyst sits down at a desk. She uses a tool, like PRISM, to analyze information collected and deposited in a database, like CONTRAOCTAVE. Then she uses another tool, perhaps CPE (Content Preparation Environment), to write a report based on the analysis. That report is stored in ANOTHER database, like MAUI. MAUI is a database for finished NSA intelligence products. Anchory is an intelligence community-wide database for intelligence reports.

If the analyst were analyzing the content of telephone traffic, he or she would access the desired traffic stream through the use of a "selector," which is the NSA's term for production lines. The stuff inside a selector comes from one or more SIGADs. A selector is kind of like an RSS feed that fills itself with content from several sources.

A system called XKEYSCORE processes most of the SIGINT traffic that comes into the NSA by way of various SIGADs, and compartmentalizes it by selector. A selector might be "RUSFOR," which would stand for Russian foreign ministry intercepts. Or something like that. Recorded signals intercepts are stored in a database called PINWALE.
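For readers who think in code, the RSS-feed analogy can be sketched in a few lines of Python. This is only a toy illustration of "one named feed that fills itself from several sources"; it says nothing about how any real NSA system is built. The names `Item`, `selector_feed`, `SOURCE-A` and `SOURCE-B` are all made up for the example.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True, order=True)
class Item:
    """One entry in a feed. Hypothetical structure, for illustration only."""
    timestamp: int  # first field, so items sort by time
    source: str     # made-up label for where the item came from
    body: str


def selector_feed(*sources: Iterable[Item]) -> Iterator[Item]:
    """Merge several already time-sorted sources into one time-ordered feed."""
    return heapq.merge(*sources)


# Two made-up sources, each already sorted by timestamp.
source_a = [Item(1, "SOURCE-A", "first"), Item(4, "SOURCE-A", "fourth")]
source_b = [Item(2, "SOURCE-B", "second"), Item(3, "SOURCE-B", "third")]

# The "selector" is just the merged view over both sources.
feed = list(selector_feed(source_a, source_b))
print([item.body for item in feed])
```

The point of the sketch is only that whoever reads the feed sees a single stream and does not need to know which source each entry came from.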

This is all very complicated, and that is on purpose. But this brief tutorial is important. PRISM is a kick-ass GUI that allows an analyst to look at, collate, monitor, and cross-check different data types provided to the NSA from internet companies located inside the United States.

The programs that use PRISM are focused, as the government said yesterday, on foreign intelligence. A lot of foreign intelligence runs through American companies and American servers.

The chain of action works like this.

Under the FISA Amendments Act of 2008, the NSA and the attorney general apply for an order allowing them to access a slice of the stuff that a company like Facebook keeps on its servers. Maybe this order is for all Facebook accounts opened up in Abbottabad, Pakistan. Maybe there are 50 of them. Facebook gets this order.

Now, these accounts are being updated in real time. So Facebook somehow creates a mirror of that slice of stuff, a mirror that only the NSA can access. The selected/court-ordered accounts are updated in real time on both the Facebook server and the mirrored server. PRISM is the tool that puts this all together. Facebook has no idea what the NSA is doing with the data, and the NSA doesn't tell them.

The companies came online at different points, according to the documents we've seen, maybe because some of them were reluctant to provide their data and others had to find a way to standardize their data in a way that PRISM could understand. Alternatively, perhaps PRISM updates itself regularly and is able to accept more and more types of inputs.

What makes PRISM interesting to us is that it seems to be the ONLY system that the NSA uses to collect/analyze non-telephonic non-analog data stored on American servers but updated and controlled and "owned" by users overseas. It is a domestic collection platform USED for foreign intelligence collection. It is of course hard to view a Facebook account in isolation and not incidentally come into contact with an account that is owned by an American. I assume that a bunch of us have Pakistani Facebook friends. If the NSA is collecting on that account, and I were to initiate a Facebook chat, the NSA would suck up my chat. Supposedly, the PRISM system would flag this as an incidental overcollect and delete it from the analyst's workspace. Because the internet is a really complicated series of tubes, though, this doesn't always happen. And so the analyst must sometimes "physically" segregate the U.S. person's data.

What happens if I, in America, tell my Pakistani friend via Facebook chat that I am going to bomb a bridge? We don't know precisely what happens when, in the course of a foreign intelligence intercept, a U.S. person creates evidence of their complicity with terrorism. The analyst must be able to distinguish between relevant and non-relevant communication. If the analyst catches my threat, then he or she will immediately initiate a procedure that sends the information to the FBI, which begins its own investigation of me. The NSA does not continue to collect on me. The FBI does — and probably uses the NSA tip as probable cause to obtain a FISA order to start collecting data using a PRISM-type tool of its own.

What if the location of the other person is unknown? The NSA has a tool called AIRHANDLER that helps them geolocate the origin of these special signals.

Here is an important thing to know: Everything the NSA analyst does leaves an audit trail. And the NSA has a staff of auditors who do nothing but sample the target folders for over-collects.

There are many unknowns, of course, and many places where the system could break down. We do not know the minimization rules. They are highly classified. We do not know how long minimized data sits in storage. We don't know how many NSA analysts are trained to handle U.S. persons' data, or HOW they are trained. We don't know the thresholds to determine what the NSA finds to be relevant enough. We don't know how long the NSA can collect on a target without getting a FISA order, though we do know that they can start collecting without one if the circumstances demand it.

Marc Ambinder is TheWeek.com's editor-at-large. He is the author, with D.B. Grady, of The Command and Deep State: Inside the Government Secrecy Industry. Marc is also a contributing editor for The Atlantic and GQ. Formerly, he served as White House correspondent for National Journal, chief political consultant for CBS News, and politics editor at The Atlantic. Marc is a 2001 graduate of Harvard. He is married to Michael Park, a corporate strategy consultant, and lives in Los Angeles.