From Pine View Farm

Data Mining 3

A number of folks think that the reason for the current Federal Administration’s aversion to legitimatizing its eavesdropping with warrants is that it is not doing targeted eavesdropping at all; rather, it is “data mining.”

This post gives an extremely clear description of data mining:

Most Americans don’t understand what is meant by “data mining,” a digital “fishing expedition.” To clarify what’s happening, I’ve created a hypothetical scenario: Suppose that my daughter is in India visiting her in laws. I call her to wish her a happy birthday. Instead of dialing the country code for India, 91, I dial the country code for Pakistan, 92; and instead of dialing the city code for Bombay, 22, I dial the city code for Karachi, 21. I make a transposition error. As a result, instead of getting my daughter’s extended family, I reach the office of the “Jihad R Us” madrasah. When I am greeted in Urdu, I realize my mistake and hang up. Meanwhile, the National Security Agency computers monitoring international calls note that I have contacted “an organization that is affiliated with al-Qaida.”

In addition to phone traffic, the NSA computers have access to all my personal and financial data. Alerted by my call, their software scans my digital records and finds that I recently made a contribution, by means of my Visa card, to an organization in Pakistan – they don’t care that it was for humanitarian assistance to earthquake refugees. At this point, the NSA computers flag me as a “possible terrorist sympathizer;” their software decision logic, their algorithm looks at my data and computes my “threat score” – much like the FICO score assigned to determine credit worthiness. Because I have a threat score above a certain threshold, the NSA software makes an algorithmic decision to monitor my phone calls, read my email, and check my financial transactions. The NSA computer system goes “fishing” in my personal data. This is “data mining;” looking for patterns in massive amounts of data. In this case the NSA software is looking for actions – phone calls, money transfers – that indicate an al-Qaida supporter.
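For the code-minded, the quoted scenario boils down to something like the rough sketch below. To be clear, this is only an illustration of the hypothetical: the event names, the weights, and the threshold are all made up here, and nothing in it is drawn from any real system.

```python
from dataclasses import dataclass

# Hypothetical weights and cutoff, invented purely for illustration.
EVENT_WEIGHTS = {
    "call_to_flagged_number": 50,
    "payment_to_flagged_region": 30,
}
THRESHOLD = 70


@dataclass
class Event:
    kind: str    # category the software matches on
    detail: str  # human context; the scoring below never reads it


def threat_score(events):
    """Add up the weight of every event whose kind matches a known pattern."""
    return sum(EVENT_WEIGHTS.get(e.kind, 0) for e in events)


def should_monitor(events, threshold=THRESHOLD):
    """Flag for monitoring when the score is above the threshold."""
    return threat_score(events) > threshold


events = [
    Event("call_to_flagged_number", "misdialed country and city code, hung up"),
    Event("payment_to_flagged_region", "donation for earthquake relief"),
]
print(threat_score(events), should_monitor(events))
```

Notice that the `detail` field never enters the calculation. A wrong number and a charitable donation add up to a flag all the same, which is the point of the scenario.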

In other words, they are looking for anything.

And they haven’t found it.

As we say in the computer biz, it’s led to a flood of data, but no information. But it gives the current Federal Administration the masturbatory idea that it’s doing something useful.

Face it, these are folks who don’t read.

They sacrifice our privacy to their incompetence.

And the sad truth is that, when they take our privacy, it does not reduce their incompetence.



  1. Opie

    February 7, 2006 at 6:17 pm

    I have a funny feeling that this guy really does not know diddly-squat about what goes on inside the NSA. The two most honest words he used in the whole column were “hypothetical scenario.” He tries to give a tone of being some kind of armchair expert on these “data mining” activities, but he doesn’t even mention the likely NSA use of voice-print recognition to reduce false positives. Methinks he’s more bitter than knowledgeable.

  2. Frank

    February 7, 2006 at 7:42 pm

    Perhaps. I’m not an expert on this. I do think it lays out the possibilities, but I could be quite wrong. I’ve been wrong before. It was back in aught nine, as I recall . . .

    Nevertheless, I don’t see how false positives would come into play here. His example did not refer to false identification–rather, to accurate identification of mistaken entries.

    Also, how would voice-print affect mining emails and internet searches? I recall our conversation about trying to read the Koran:

    But, as far as I am concerned, fallibility or infallibility of the technology is not the issue.

    The issue is that the Current Federal Administration has no respect for the rights of citizens of the United States of America nor for the Constitution of the United States of America.

    They are lawbreakers.

  3. Opie

    February 7, 2006 at 10:29 pm

    Well, it seems to me that he is talking about how the technology will at times flag someone as a terrorist when they aren’t. This is true, but my point is that if you design a system like this and it starts sending you on wild goose chases in pursuit of honest, harmless citizens, you’re going to want to refine the system to minimize that kind of inefficiency.

    As far as the voice analysis issue goes, I took him to at least hint that the e-mails, online transactions and tapped phone conversations were all going to be cross-referenced in some way to come up with a total surveillance package on a suspect, in which case it seems to me they’d give extra weight to the voice-print technology in screening out bad flags.