Friday, August 10, 2012

#ALGORITHMS: "Sparse Inference Uncovers 9/11 Perpetrators"

In hindsight it's sometimes easy to see how the dots could have been connected, if only we had known which dots were the important ones. Now a new algorithm called Sparse Inference claims to have succeeded in predicting the source-location of terrorists, spammers, malware, biological epidemics and even the most important blogs for a marketing campaign: R. Colin Johnson

The ringleader of the hijackers involved in the September 11, 2001 attack, Mohamed Atta, could have been identified from two wiretaps (green) using the Sparse Inference algorithm, which identified three possible sources (red), one of which was Atta. SOURCE: EPFL

The complex webs of interactions that distinguish the propagation of malware, spam, biological epidemics and even terrorism plots are extremely difficult to analyze. However, the inventors of a new algorithm at the École Polytechnique Fédérale de Lausanne (EPFL) claim to be able to track down perpetrators by monitoring just a few points in the network--even claiming that the mastermind of 9/11 could have been identified from just two wiretaps.

"The mastermind of 9/11 was Mohammed Atta, which was already known," said inventor of the algorithm, EPFL researcher Pedro Pinto. "What we have shown is that monitoring the communications of just a few terrorists and applying our source-location algorithm could have led to the same result. And in other scenarios, it can be used for prevention as well."

The algorithm, called "Sparse Inference" (SparseInf), makes accurate source-location predictions using a multi-dimensional version of the algorithm that cell phone carriers use to triangulate the location of mobile handsets.

"It was inspired by how localization works in wireless networks, where three or more base-stations measure the distance to your cell-phone, and use triangulation to pinpoint it's location. We just do something similar, but on a graph," said Pinto.

Sparse Inference used historical data to trace the source of a cholera outbreak that occurred in the KwaZulu-Natal province, South Africa, in 2000.

Using historical databases, the researchers have shown how the algorithm could have been used to quickly find the source-location of a whole variety of hard-to-trace examples--from epidemic outbreaks in Africa to the source-location of the sarin nerve gas that killed 13 and injured nearly 1,000 in Tokyo in 1995. They claim that the algorithm could also be used to trace the source-location of malware and spammers online, and speculate that in the future, even businesses could use the algorithm to identify the Internet blogs that are most influential for their target audience.

"The algorithm relies on the principle of 'maximum likelihood hypothesis testing', adapted to arbitrary graphs, so it's pretty general in applicability," said Pinto.

Currently, the team is attempting to adapt SparseInf to be used preemptively, to make important predictions before they materialize--from epidemic outbreaks to terrorism plots to finding the sources of Internet rumors to identifying blogs key to a successful marketing campaign.