Why The Washington Post's big NSA revelation is a bust
While the world awaits Glenn Greenwald's long-promised final scoop, in which actual targets of the National Security Agency's post-9/11 surveillance programs might be identified, The Washington Post pushed out a dramatic story of its own on Sunday. Based on 160,000 transcripts of intercepted conversations, The Post found that "nine of 10 account holders found in a large cache of intercepted communications ... were not the intended surveillance target but were caught up in a net that the agency had cast for someone else."
And here's the piece's jaw-dropper:
Many of them were Americans. Nearly half of the surveillance files, a strikingly high proportion, contained names, e-mail addresses, or other details that the NSA marked as belonging to U.S. citizens or residents. NSA analysts masked, or "minimized," more than 65,000 such references to protect Americans' privacy, but The Post found nearly 900 additional e-mail addresses, unmasked in the files, that could be strongly linked to U.S. citizens or U.S. residents. [The Washington Post]
Some of the most trenchant critics of the NSA were giddy. It's now been proven, Conor Friedersdorf at The Atlantic wrote, that:
The NSA collects and stores the full content of extremely sensitive photographs, emails, chat transcripts, and other documents belong to Americans, itself a violation of the Constitution—but even if you disagree that it's illegal, there's no disputing the fact that the NSA has been proven incapable of safeguarding that data. There is not the chance the data could leak at sometime in the future. It has already been taken and given to reporters. The necessary reform is clear. Unable to safeguard this sensitive data, the NSA shouldn't be allowed to collect and store it. [The Atlantic]
Except — not so fast.
Here's a little bit of neutral context:
The intercepts leaked by Snowden contain information derived from two sources. The first is the PRISM program, which collects ephemera directly from service providers and is based on FISA court orders for specific foreign targets. The second is the NSA's cable interception program, which taps directly into the internet and pulls ephemera through a very complicated filtering process that gets rid of as much non-target communication as the technology is capable of filtering, then sends what remains back to the NSA, where analysts can figure out what's going on.
The NSA's "Turbulence" architecture filters for selectors at the point of interception, and indeed, even before: The NSA tries to infect computers and networks, "tag" certain communications of interest from targets, and then have its hardware "read" those tags as the internet traffic they're bundled with streams through a node the NSA happens to be collecting from.
To stay neutral: the trouble, and much of the controversy, stems from the NSA's inability to determine which communications are "foreign." Its compliance problems have highlighted a real and troubling gap between what its technology was actually capable of and what the NSA assumed it was doing. Each NSA product line has an IP geolocation cell that works solely to determine the foreignness and origin of communications.
Now, I'm going to abandon my neutrality. Here's the scenario portrayed in the article: The NSA begins to collect content from a target, either because of a FISA order or a "702" certification on a class of targets, like "Russian government officials living in Utah." The NSA tries to automatically eliminate as many of the targets' e-mails and chats to people inside the United States as it can, unless it's looking for a spy, in which case the FBI could be handling the case under separate authority. Ok, so: having run the data through an automatic minimization system of some sort, the NSA requires its analysts to minimize every U.S.-person communication that they see. Minimize does not mean "to get rid of." It means to anonymize the U.S.-based non-target source. A minimized NSA intercept from me to a Russian government official in Utah might look like this:
>To: Sergey Federov (Sfederov@juniper.com)
I unfortunately won't be attending the Aspen Ideas festival to talk about espionage and surveillance this year, or about XXXXXXXX XXXXXXX.
Minimized: My name and that of the U.S. president.
Maybe I could be a customer service representative from the pizza place that got his order wrong, and I'm e-mailing him to apologize for it. The NSA and the FBI are required by statute to minimize the communication if they determine it has no intelligence value. (And why would the NSA waste time reading a conversation about pizza anyway?)
The Post article says that nine out of 10 account holders in the cache were not targets, and that half of the files contained U.S. person communication, most of which had been minimized. The nine out of 10 figure shouldn't surprise anyone. Look at your own inbox. If the FBI obtained your e-mail because you were involved in a plot to rob banks, at least nine out of 10 account holders would not be targets. A warrant doesn't authorize the FBI to read only the e-mails you might have sent to your associates; it authorizes the FBI to read the content of your e-mails, period, to investigate. That's intrusive, but that's why the FBI has to swear out an affidavit and get a warrant. (Here's a better explanation). When it comes to foreign targets, the NSA needs either a FISA order or certification to begin the process. We treat U.S. citizens differently than we do foreign targets.
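To make the inbox arithmetic concrete, here is a toy sketch. The mailbox contents and the function name `non_target_share` are invented for illustration; nothing here comes from the Post's data.

```python
def non_target_share(account_holders, targets):
    """Return the fraction of distinct account holders who are not targets."""
    holders = set(account_holders)
    if not holders:
        raise ValueError("no account holders to count")
    return len(holders - set(targets)) / len(holders)


# Hypothetical mailbox: the one person named in the warrant plus
# nine correspondents who are not under investigation.
inbox = [
    "you", "mom", "dentist", "bank", "landlord",
    "airline", "college_friend", "pizza_place", "newsletter", "coworker",
]

share = non_target_share(inbox, targets={"you"})
print(f"{share:.0%} of account holders are not targets")
```

With one target and nine ordinary correspondents, the share works out to nine in 10 without any dragnet being involved.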
Half of the Snowden cache contained U.S. person communication of some sort. Again, not really surprising, especially if the intercepts came from the PRISM program. People around the world often communicate with companies here in the U.S. (a company DOES count as a U.S. person), or with friends from college who live here, and it is not even remotely remarkable that a foreign target's stored e-mails would include a "selector" that might be based in the U.S. or a U.S. person's name in the conversation itself.
The Post article says that MOST of those communications were minimized, according to law. But 900 or so identifiable U.S. person terms/selectors/account holders were not. This means that, in the above example, maybe the NSA minimized my e-mail address but not my telephone number. Why might the NSA not have minimized this? Because doing so would require an analyst to look at the content of every communication and run every selector (telephone number and e-mail address) through a database to figure out if it should be minimized. This would be inordinately time-intensive, so the NSA relies on automation, and THEN, when the analyst IS looking at specific communication, the analyst is required to minimize any un-minimized selector that makes it through.
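The two-pass process described above can be sketched in a few lines. This is a toy illustration, not the NSA's actual system: the known-selector list, the mask string, the patterns, and both function names are assumptions made up for the example.

```python
import re

# Hypothetical list of selectors already known to belong to U.S. persons.
KNOWN_US_SELECTORS = {"writer@example.com"}
MASK = "XXXXXXXX"

# Very rough patterns for e-mail addresses and U.S.-style phone numbers.
SELECTOR_PATTERN = re.compile(
    r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"  # e-mail address
    r"|\d{3}-\d{3}-\d{4}"            # phone number such as 202-555-0143
)


def auto_minimize(text, known=KNOWN_US_SELECTORS):
    """Automated pass: mask selectors on the known list.

    Returns the masked text and the selectors that were left unmasked.
    """
    unmasked = []

    def replace(match):
        selector = match.group(0)
        if selector.lower() in known:
            return MASK
        unmasked.append(selector)
        return selector

    return SELECTOR_PATTERN.sub(replace, text), unmasked


def analyst_minimize(text, selectors_judged_us):
    """Manual pass: mask whatever a human reviewer judges to be U.S.-person."""
    for selector in selectors_judged_us:
        text = text.replace(selector, MASK)
    return text


message = "From: writer@example.com, call me at 202-555-0143. To: official@example.org"
masked, leftover = auto_minimize(message)
print(masked)    # the e-mail address is masked, the phone number is not
print(leftover)  # selectors that slipped through the automated pass
print(analyst_minimize(masked, ["202-555-0143"]))
```

The phone number survives the automated pass because it is not on the known list. It gets masked only if a person later opens the message and recognizes it, which is the gap the 900 unmasked selectors fall into.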
The analyst's judgment can be subjective. In the first instance, the analyst has to figure out whether the communication is relevant to a foreign intelligence purpose. Then, they must figure out whether the selector terms, names, and identities are foreign or domestic. If the specific communication IS relevant to a foreign intelligence purpose, the analyst will spend time analyzing it.
How might U.S. persons flow through the NSA systems without being minimized? Easily. The communication simply wasn't looked at. No human being saw it. The Post's reporters looked at every single line of 160,000 intercepts. The NSA analysts don't do that/can't do that because the SIGINT system would not function for a second if they did.
Is it chilling that The Washington Post now has these intercepts? Yes. Does it represent a huge failure by the NSA? Debatable. The person who obtained them originally, Edward Snowden, spent more than a year, with very high clearances, trying to figure out how to steal them without triggering alarms. To say that they weren't protected by the NSA is to blame your grandmother for keeping her purse in a simple combination lock safe in the kitchen, and not the thief who broke into the house to steal it. (In this case, the thief cased the house for a long time and had to figure out the combo to the lock.)