Thursday, April 09, 2015

Galaxy-Scope: Finding your virtual tribe (for example near the PayPal Mafia)

Whether it’s sitting in the same restaurant as George Clooney, or being in a picture with Warren Buffett, we are defined by whom we know, and we derive great satisfaction from being close to celebrities. Thanks to the Web and social media, the six degrees of separation between any two people are shrinking rapidly. We can use the same insight to define who we are by looking at whom we are close to.

The following example illustrates how this idea can be applied to measure how close any aspiring Internet entrepreneur is to the “PayPal Mafia”. In an article on April 1, 2015, the NYT describes the far-reaching influence of PayPal alums in Silicon Valley. I was curious to measure the influence of the names listed in the article:
Chad Hurley 
David Sacks 
Elon Musk 
Jawed Karim 
Jeremy Stoppelman 
Keith Rabois 
Max Levchin 
Peter Thiel 
Reid Hoffman 
Roelof Botha 
Russel Simmons 
Scott Banister 
Steve Chen 

I plugged them into our network data collectors for Wikipedia, the Web, and Twitter, running a Condor Coolhunting on the three infospheres with the names of the people as search terms.

The picture below shows their Wikipedia network:

Peter Thiel and Elon Musk lead the group; the rest are clearly recognizable, but none of them stands out. Thiel is close to Facebook, Musk to SpaceX. Chen and Hurley are close to YouTube, Levchin to Yahoo, Stoppelman and Simmons to Yelp.

The next picture shows their importance on the Web:

The degree-of-separation search uses the Google CSE API to collect the top 20 search results for each member of the PayPal Mafia, and then the top 20 links pointing back to each of the search results. Measuring the betweenness of the search results in the resulting bi-modal graph gives a proxy for the importance of each PayPal Mafia member in the Blogosphere, and also identifies the most important Websites. Reid Hoffman, Chad Hurley, Peter Thiel and Elon Musk are all similarly central, while and are the most central sites.
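To make the betweenness measurement concrete, here is a minimal sketch in plain Python. It is not the Condor implementation: the page names, the `search_results` and `backlinks` dictionaries, and the `betweenness` helper are all made up for illustration, and the real search collects 20 results and 20 backlinks per item. The helper uses Brandes' algorithm for unweighted, undirected graphs and returns non-normalized scores.

```python
from collections import defaultdict, deque

# Hypothetical toy data: person -> search results, search result -> backlinking pages.
search_results = {
    "Peter Thiel": ["page-a.example", "page-b.example"],
    "Elon Musk": ["page-b.example", "page-c.example"],
    "Max Levchin": ["page-c.example"],
}
backlinks = {"page-b.example": ["blog-x.example"]}

# Build the undirected bi-modal graph (people on one side, pages on the other).
graph = defaultdict(set)
for source in (search_results, backlinks):
    for node, targets in source.items():
        for target in targets:
            graph[node].add(target)
            graph[target].add(node)
adj = dict(graph)


def betweenness(adj):
    """Non-normalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                      # nodes in order of non-decreasing distance from s
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:         # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}   # dependency of s on each node
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was counted from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in score.items()}


scores = betweenness(adj)
for node, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {value}")
```

In this toy graph the page shared by two people and linked from a blog ends up with the highest score, which is the same effect that lets well-connected Websites surface next to the people in the picture.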

The Web Co-occurrence network shown above is constructed from the 3134 pages collected with the degree-of-separation search described above. Using named entity recognition and natural language processing, all person names are extracted from the collected pages. A link between two names is drawn if the two people are on the same page, literally speaking. In this network, people like Barack Obama, Hillary Clinton, Steve Jobs, and Jon Stewart are more central than the members of the PayPal Mafia who were used to construct the network.
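The link rule can be sketched in a few lines of Python. This assumes the named entity recognition step has already produced a list of names per page; the `pages` dictionary and the `cooccurrence_edges` helper are hypothetical and only illustrate the "same page" rule, not the actual Condor pipeline.

```python
from collections import Counter
from itertools import combinations

# Hypothetical output of the name-extraction step: page id -> names found on it.
pages = {
    "page-1": ["Peter Thiel", "Elon Musk", "Barack Obama"],
    "page-2": ["Elon Musk", "Barack Obama"],
    "page-3": ["Peter Thiel", "Peter Thiel", "Steve Jobs"],
}


def cooccurrence_edges(pages):
    """Count, for every pair of names, on how many pages both appear."""
    edges = Counter()
    for names in pages.values():
        # Deduplicate per page and sort so each pair has one canonical order.
        for a, b in combinations(sorted(set(names)), 2):
            edges[(a, b)] += 1
    return edges


edges = cooccurrence_edges(pages)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```

Names that appear on many pages alongside many different people accumulate the most links, which is why widely covered public figures can end up more central than the people used as search terms.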

For the Twitter network, we combined the tweets made by the members of the PayPal Mafia with all the tweets about them. An actor is a person tweeting; a link between two actors is drawn if a tweet is retweeted. Some people are very active tweeters but are not much tweeted about. Others, like Peter Thiel, have tweeted only once but still have 90,000 followers and are much tweeted about. And then there is Elon Musk, who does not tweet that much, but increased the value of his company Tesla by one billion with a single tweet. By combining the two input sources, we get a Twitter network reflecting the real importance of the PayPal Mafia in the Twittersphere. It turns out that Peter Thiel and Elon Musk again rule the roost.
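As a rough illustration of the actor and link definitions, the following Python sketch builds retweet links from a list of tweets. The record layout (`author`, `retweet_of`) and the user names are assumptions made for this example and do not reflect the format of the actual collector.

```python
from collections import Counter

# Hypothetical tweet records: who tweeted, and whose tweet was retweeted (if any).
tweets = [
    {"author": "user_a", "retweet_of": "founder_1"},
    {"author": "user_b", "retweet_of": "founder_1"},
    {"author": "user_c", "retweet_of": "founder_2"},
    {"author": "founder_1", "retweet_of": None},  # an original tweet, no link
]


def retweet_edges(tweets):
    """Directed links from the retweeting actor to the original author."""
    edges = Counter()
    for tweet in tweets:
        if tweet["retweet_of"] is not None:
            edges[(tweet["author"], tweet["retweet_of"])] += 1
    return edges


edges = retweet_edges(tweets)

# How often each actor is retweeted, independent of how much they tweet themselves.
retweeted = Counter()
for (_, original), count in edges.items():
    retweeted[original] += count

print(retweeted.most_common())
```

Counting incoming retweets separately from a person's own activity is what lets somebody who rarely tweets still come out as important.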

In the final picture we combined all of these networks (Wikipedia, Weblinks, Web Co-occurrence, Twitter). Peter Thiel and Elon Musk are the most important, taking their betweenness centrality as a proxy of importance. Compared to these two, all other members have considerably lower betweenness centrality.

As a second step, I was curious to see how ordinary people would fit in. Knowing full well that nobody is “ordinary” and everybody is “special”, this should tell us the “specialty” of each person in the context of big data and social networking entrepreneurship, while also giving a metric of how important they are and how close they are to the luminaries of the PayPal Mafia.

Eating my own dogfood, I started with myself. The picture below shows my personal network, cooked by the same recipe, combining my Wikipedia, Weblinks, Web Co-occurrence, and Twitter networks.

As I do not have a Wikipedia entry, the pages on “Collaborative Innovation Networks” and “Coolhunting”, where I am mentioned, are the most central in the Wikipedia network. Also, as a passive tweeter, my tweet network is very small, so it is mostly the Web and Web content networks that define my presence. That Barack Obama shows up does not mean that I have a personal relationship with him (I do not), but that we occasionally show up in the same texts.

The next picture combines my network with the network of the PayPal Mafia.

I am not very close to any luminary. The happiness magazine is a surprising link (I do some research on human happiness, but was not aware of the link). Zooming in by eliminating the nodes with non-normalized betweenness lower than 100,000 leads to the following network.
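The zooming step amounts to a simple threshold filter. Here is a minimal Python sketch, with made-up node names and betweenness values and a hypothetical `prune` helper; only the cut-off of 100,000 comes from the text.

```python
# Hypothetical non-normalized betweenness scores and links of a combined network.
score = {
    "person_x": 250_000.0,
    "site_1": 180_000.0,
    "site_2": 40_000.0,
    "person_y": 120_000.0,
}
edges = [
    ("person_x", "site_1"),
    ("person_x", "site_2"),
    ("site_1", "person_y"),
    ("site_2", "person_y"),
]


def prune(edges, score, threshold):
    """Drop nodes scoring below the threshold, plus every link touching them."""
    keep = {node for node, value in score.items() if value >= threshold}
    kept_edges = [(a, b) for a, b in edges if a in keep and b in keep]
    return keep, kept_edges


nodes, pruned = prune(edges, score, threshold=100_000)
print(sorted(nodes))
print(pruned)
```

Because the scores are computed on the full network before filtering, the surviving nodes keep the importance they had in the complete picture.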

This is now much clearer, illustrating that my main presence on the Internet is the Collaborative Innovation Network entry, Google Scholar, and arXiv, among others, plus a few links shared with Peter Thiel through tweeters, LinkedIn, and YouTube, i.e. the same people mentioning him and me.

To compare it with some more prominent people, I repeated the same process for Hansjoerg Wyss, a prominent Swiss/American billionaire and philanthropist.

As a prominent member of the club of billionaires, Hansjoerg Wyss is much closer to his fellow billionaires Peter Thiel and Elon Musk, and also has some prominent links.

Not surprisingly, Forbes, which does the billionaire ranking, now becomes prominent, as do some YouTube videos from the World Economic Forum, where all of the people shown prominently on the map have appeared at one time or another.

This is a very short overview of a novel way of understanding somebody’s “tribe”: the context of how and where a person fits into the global social network that the Internet has become.
Ideas and feedback are most welcome!