The Sopranos may be over, Tony might be dead, but my love for it will never die.
It’s been 10 years and I still cannot get over the ending of HBO’s show, The Sopranos. DELAYED BY 10 YEARS SPOILER ALERT, but when everything went black on my TV, so did a little piece of my heart. The leading theory about this ending is of course that Tony Soprano, the head of the DiMeo crime family, was killed at the conclusion of the series. Now as anyone can tell you, that doesn’t mean the end for the DiMeo family. The mob does not die with one man. This leaves my little data science brain wondering… who’s next to take over?
If we want our computer to take a stab at answering that question we really should turn to social network analysis. It’s obvious from watching the show that the FBI has collected an obscene amount of data about Tony and the organization. They had informants, they tapped phones, they had surveillance… there is no shortage of data here. Odds are, they created some sort of visual representation of Tony’s network. On TV you often see this represented on some bulletin board in a detective’s briefing room with strings connecting the pictures like this:
In real life, these diagrams are created by computers using programs like IBM i2 Analyst’s Notebook. This amazing program takes data from spreadsheet form to make sense of the connections between people and visualize their network like this:
These visualizations are only the beginning. Social network analysis would let the FBI not only see the organization and understand it better, but also automatically calculate things like direction (i.e., who is initiating the contact between two players), bridges (people who serve as a vital connection between clusters, such as families working together), and centrality… that critical measure of who has a great deal of influence over the organization, which may help the FBI determine who is next to take over.
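To make those three ideas a little more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the call log is invented (it is not from the show or any real dataset), the function names are made up, and "bridge" is read as a person whose removal splits the network (graph people call that an articulation point), which is only one possible reading. Centrality here is simple degree centrality, the most basic of several centrality measures.

```python
from collections import Counter, defaultdict

# Hypothetical call log of (caller, receiver) pairs. Invented for illustration.
CALLS = [
    ("Silvio", "Tony"),
    ("Paulie", "Tony"),
    ("Christopher", "Tony"),
    ("Tony", "Silvio"),
    ("Silvio", "Paulie"),
    ("Christopher", "Paulie"),
    ("Tony", "Johnny"),   # Johnny stands in for a contact in another family
    ("Johnny", "Phil"),
    ("Phil", "Johnny"),
]


def initiations(calls):
    """Direction: count how many contacts each person starts."""
    return Counter(caller for caller, _ in calls)


def neighbors(calls):
    """Undirected view of the log: who has been in contact with whom."""
    adj = defaultdict(set)
    for a, b in calls:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def degree_centrality(adj):
    """Fraction of the other people in the network each person is tied to."""
    n = len(adj)
    return {person: len(nbrs) / (n - 1) for person, nbrs in adj.items()}


def bridge_people(adj):
    """People whose removal disconnects the rest of the network.

    Brute force, fine for a small graph. Assumes the network starts out
    connected.
    """
    people = sorted(adj)
    found = []
    for person in people:
        rest = [p for p in people if p != person]
        if len(rest) < 2:
            continue
        # Depth-first search over everyone except the removed person.
        seen = {rest[0]}
        stack = [rest[0]]
        while stack:
            node = stack.pop()
            for nxt in adj[node]:
                if nxt != person and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if len(seen) < len(rest):
            found.append(person)
    return found


adj = neighbors(CALLS)
print("Initiated contacts:", dict(initiations(CALLS)))
print("Degree centrality:", degree_centrality(adj))
print("Bridge people:", bridge_people(adj))
```

In this toy log, the person tied to the most others scores highest on centrality, and anyone who is the only link to another cluster shows up as a bridge. A real analysis would use far more data and probably a graph library rather than hand-rolled loops.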
Nowadays the FBI would have even more to work with than they did back in ’07 if they wanted to take down Tony Soprano. His daughter’s Twitter account alone would probably be a goldmine. I would guess that i2 would still be used today. Open source algorithms such as PageRank would possibly be put to use by a clever FBI analyst as well. (PS – wanna try PageRank? Check it out here: https://github.com/IBMPredictiveAnalytics/MLlib_Pagerank . If you REALLY want to try, tweet me at @ericareuter to get some free services…)
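If you are curious what PageRank actually does under the hood, here is a rough sketch of the basic power-iteration version in plain Python. To be clear about the assumptions: this is not the code from the repository linked above, the "who contacts whom" edges are invented for illustration, and the damping factor of 0.85 is just the commonly used default. The idea is that a person ranks highly when highly ranked people reach out to them.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Basic PageRank by power iteration over directed (source, target) edges.

    Repeated edges count as repeated links. A node with no outgoing links
    spreads its rank evenly across every node.
    """
    nodes = sorted({name for edge in edges for name in edge})
    out_links = {name: [] for name in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start with rank spread evenly across everyone.
    rank = {name: 1.0 / len(nodes) for name in nodes}
    for _ in range(iterations):
        # Everyone gets a small baseline share, then inherits rank from
        # whoever points at them.
        new_rank = {name: (1.0 - damping) / len(nodes) for name in nodes}
        for src in nodes:
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical "who reaches out to whom" edges. Invented for illustration.
CONTACTS = [
    ("Silvio", "Tony"),
    ("Paulie", "Tony"),
    ("Christopher", "Tony"),
    ("Bobby", "Tony"),
    ("Tony", "Silvio"),
    ("Christopher", "Paulie"),
]

ranks = pagerank(CONTACTS)
for name, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The scores add up to 1, so you can read each one as that person's share of the network's overall "importance". On a toy graph this small the answer is not surprising, but the same loop scales to the kind of contact data an analyst would really have.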
Who knows what the Sopranos investigation would bring in current times, but if someone out there has created it – I’d certainly be interested in taking a look! Send me your algorithms. I can’t get enough of that stuff!
Erica Reuter is the co-host of Unsupervised Binge Watching, a podcast about the (data) science of good TV. Tune in to episode one to hear more about social network analysis and its application to the Sopranos.