Tuesday, February 18, 2014

Wikihistory – Finding the World’s Leaders through the Ages through Wikipedia Social Networks


All software development has been done by Patrick de Boer

The goal of this project at the MIT Center for Collective Intelligence is to create an interactive history book of the most important people of all time, based on Wikipedia. As a first step towards that goal, we focus on the English Wikipedia and extract its 800,000 people pages. In future work we intend to repeat this process with Wikipedias in other languages, to understand who the key influencers were over time in different cultures.

In this first prototype created from the English Wikipedia, every people page is dated by extracting the dates of birth and death of the individual. In addition, the links originating from and pointing to each person's Wikipedia page are gathered. Using this information, 4900 networks through history, from 3000 BC to 1900 CE, are calculated, as shown in figure 1. Of all the links originating from and pointing to a particular people page, only the links to and from people living at the same time as the person discussed on that page are included.
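The filtering step can be sketched in a few lines of Python. This is only an illustration, not the project's code: it assumes that "living at the same time" means overlapping lifespans, and the `Person` class, the function names, and the sample people are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    born: int  # year of birth, negative for BC (assumed convention)
    died: int  # year of death, negative for BC


def lifetimes_overlap(a: Person, b: Person) -> bool:
    """True if the two people were alive during at least one common year."""
    return a.born <= b.died and b.born <= a.died


def contemporary_links(people, links):
    """Keep only the links whose source and target lived at the same time.

    people: dict mapping page name -> Person
    links:  iterable of (source_page, target_page) pairs
    """
    return [
        (src, dst)
        for src, dst in links
        if src in people and dst in people
        and lifetimes_overlap(people[src], people[dst])
    ]


# Hypothetical sample data, made up for this sketch.
people = {
    "P1": Person("P1", -50, 20),
    "P2": Person("P2", 10, 70),
    "P3": Person("P3", 200, 260),
}
links = [("P1", "P2"), ("P2", "P1"), ("P1", "P3"), ("P3", "P2")]
kept = contemporary_links(people, links)
```

Here only the two links between P1 and P2 survive, because P3 was born long after both of them had died.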


For instance, in the graph shown in figure 1 above, of all the links to and from the page about Plutarch, only those involving Hadrian, Caesar, and Nero are kept. The links to Pyrrhus, who died well before Plutarch was born, and to the medieval historian Syncellus and the modern historian Pisani are ignored. Repeating this process leads to 4900 unique networks. For each of these networks, the most central people are determined using the popular PageRank algorithm. As a second selection criterion among the influencers, we take their indegree, i.e. the number of other people pages pointing to them. The following list shows the top 50 most influential people of all time as ranked by the Wikipedians, ordered by PageRank and then by indegree.


rank  name                               indegree  PageRank
1     George_W._Bush                     4721      1
2     William_Shakespeare                3914      1
3     Sidney_Lee                         3093      1
4     Jesus                              2176      1
5     Charles_II_of_England              1519      1
6     Aristotle                          1400      1
7     Napoleon                           1361      1
8     Muhammad                           1123      1
9     Charlemagne                        949       1
10    Plutarch                           925       1
11    Julius_Caesar                      890       1
12    William_III_of_England             890       1
13    Homer                              820       1
14    Bede                               799       1
15    Athanasius_of_Alexandria           775       1
16    Dante_Alighieri                    755       1
17    Gautama_Buddha                     747       1
18    Tiberius                           697       1
19    Cyril_of_Alexandria                684       1
20    Bernard_of_Clairvaux               655       1
21    Moses                              645       1
22    Tacitus                            610       1
23    Edward_III_of_England              582       1
24    Justinian_I                        532       1
25    David                              522       1
26    Ashoka                             486       1
27    Origen                             337       1
28    Septimius_Severus                  334       1
29    Polybius                           307       1
30    Confucius                          302       1
31    Alexander_Severus                  278       1
32    Patriarch_Eutychius_of_Alexandria  276       1
33    Tutankhamun                        253       1
34    Akhenaten                          238       1
35    Ramesses_II                        228       1
36    Pope_Benjamin_I_of_Alexandria      172       1
37    Teti                               151       1
38    Amenemhat_II                       146       1
39    Pepi_II_Neferkare                  145       1
40    Merneith                           144       1
41    Terence                            142       1
42    Cato_the_Elder                     141       1
43    Charles_Martel                     116       1
44    Gilgamesh                          101       1
45    Deborah                            89        1
46    Lugalbanda                         68        1
47    Kubaba                             65        1
48    Fu_Xi                              12        1
49    Henry_I_of_England                 417       0.986383431
50    Petrarch                           254       0.981669694
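The ranking in the list above combines PageRank and indegree. The project itself computed its networks with Condor, so the following is only a generic, minimal sketch of the idea in plain Python using textbook power iteration. The function names, the damping factor of 0.85, the fixed iteration count, and the sample links are all assumptions made for illustration. Dividing each score by the network's maximum, so that the top person gets exactly 1, is also an assumption; it is suggested by the many values of 1 in the list but is not stated in the text.

```python
def pagerank(nodes, links, damping=0.85, iterations=100):
    """Plain power-iteration PageRank on a directed link list."""
    nodes = list(nodes)
    n = len(nodes)
    out_links = {v: [] for v in nodes}
    for src, dst in links:
        out_links[src].append(dst)

    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def indegree(nodes, links):
    """Number of links pointing to each page."""
    counts = {v: 0 for v in nodes}
    for _, dst in links:
        counts[dst] += 1
    return counts


def rank_people(nodes, links):
    """Sort by normalized PageRank, then by indegree, both descending."""
    pr = pagerank(nodes, links)
    top = max(pr.values())  # assumed normalization: best score becomes 1
    deg = indegree(nodes, links)
    rows = [(v, pr[v] / top, deg[v]) for v in nodes]
    return sorted(rows, key=lambda row: (-row[1], -row[2]))


# Hypothetical four-page network: B, C and D link to A, and A links to B.
nodes = ["A", "B", "C", "D"]
links = [("B", "A"), ("C", "A"), ("D", "A"), ("A", "B")]
ranking = rank_people(nodes, links)
```

In this toy network A comes first, with three incoming links and a normalized score of 1, followed by B, which receives all of A's outgoing rank.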


These influencers consist primarily of politicians (kings and generals, in red), followed by religious leaders (black), and then poets and historians (blue). It seems it pays to be a historian and write one's own place in history. This is clearly shown by Sidney Lee, a relatively minor Victorian professor of English and history, who wrote 800 biographies.

These networks can now be used to construct snapshots of social networks of the key leaders through the ages. The following picture shows the Wikipedia link network of 3000 BC to 2000 BC.

As we can see, the Egyptian Pharaohs dominate history in that age, complemented by a tight cluster of Sumerian kings, and a cluster of Chinese kings and princes.

Skipping 1000 years ahead and looking at 1000 BC to 0 BC, Alexander the Great is the dominant figure, surrounded by a tight cluster of patricians of the Roman Republic. The Chinese emperors form a group at the top, while the Indian emperor Ashoka is surrounded by other influencers from the Indian subcontinent.


Making another huge leap to 1800 CE and looking at the 19th century, the US takes center stage: Abraham Lincoln is the most influential person, surrounded by a roster of US poets and scientists. Queen Victoria and other European politicians and scientists form their own smaller and less tight-knit cluster, while Chinese and Southeast Asian kings occupy comparatively peripheral positions.



We can also combine these networks into a movie spanning centuries. Below are the world’s leaders from year 0 to year 500, calculated with Condor. As the movie shows, in that age there are two dominant clusters, with the Roman and Chinese emperors at their centers. From 200 to 300 CE, the Chinese Golden Age of the Han dynasty, the Chinese cluster clearly surpasses the Roman cluster.



If there is one lesson from this preliminary experiment, it is the disproportionately large role of historians. Not only is a minor 19th century biographer among the top 10 influencers of all time (which is of course more an artifact of our collection method), but classical historians like Polybius, Tacitus, and Plutarch also rank very high. That it pays to treat biographers and historians well, so that they write positively about world leaders, is of course no new insight. For instance, Roman emperor Vespasian paid the historians Tacitus, Suetonius, Josephus and Pliny the Elder; in return they speak suspiciously well of him, shaping his image in history. Caesar and Winston Churchill took this concept one step further and wrote their history themselves. As today's history is written in Wikipedia, the conclusion seems obvious: treat Wikipedians well!

1 comment:

  1. That's really interesting! :) For those who are interested, I'd like to point towards http://pantheon.media.mit.edu that has a similar methodology.
