Wikihistory – Finding the World’s Leaders through the Ages through Wikipedia Social Networks
All software development has been done by Patrick de Boer.
The goal of this project at the MIT Center for Collective
Intelligence is to create an interactive history book of the most important
people of all time, drawn from Wikipedia. As a first step towards that goal, we focus on the English Wikipedia and extract its 800,000
people pages. In future work we intend
to repeat this process with other language Wikipedias, to understand
who the key influencers were over time in different cultures.
In this first prototype, built from the English Wikipedia, every people page is dated
by extracting the dates of birth and death of the individual. In addition, the links originating from and pointing to each person's Wikipedia page are gathered. From this information, 4900 networks
spanning history from 3000 BC to 1900 CE are calculated, as shown in figure 1.
From all the links originating and pointing back to a particular people page,
only the links to and from people living at the same time as the person
discussed on that page are included.
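The filtering rule above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: it assumes that "living at the same time" means the two lifespans share at least one year, and the `Person` class, the function names, and the approximate years in the sample data are all made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    born: int  # year of birth; negative values stand for BC
    died: int  # year of death


def overlaps(a: Person, b: Person) -> bool:
    """True if the two lifespans share at least one year (assumed rule)."""
    return a.born <= b.died and b.born <= a.died


def contemporary_links(people, links):
    """Keep only the directed (source, target) links between contemporaries."""
    by_name = {p.name: p for p in people}
    return [
        (s, t)
        for s, t in links
        if s in by_name and t in by_name and overlaps(by_name[s], by_name[t])
    ]


# Approximate years, for illustration only.
people = [
    Person("Plutarch", 46, 120),
    Person("Hadrian", 76, 138),
    Person("Nero", 37, 68),
    Person("Pyrrhus", -319, -272),
]
links = [
    ("Plutarch", "Hadrian"),
    ("Nero", "Plutarch"),
    ("Plutarch", "Pyrrhus"),
]

kept = contemporary_links(people, links)
print(kept)
```

With this sample data the link between Plutarch and Pyrrhus is dropped, because Pyrrhus died centuries before Plutarch was born.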
For instance, in the graph shown in figure 1 above, of
all the links to the page about Plutarch, only the links from and to Hadrian,
Caesar, and Nero are kept. The links to Pyrrhus, who died well before
Plutarch was born, are ignored, as are the links to the pages of the medieval historian Syncellus and the modern
historian Pisani. Repeating this process leads to 4900 unique networks. For each of these networks, the most central people are determined using the popular PageRank algorithm. As a second selection criterion among all the influencers, their
indegree, i.e. the number of other people pages pointing back to them, is used. The following list
shows the top 50 most influential people of all time, as ranked by the
Wikipedians, ordered first by PageRank and then by indegree.
| rank | name | indegree | PageRank |
|---|---|---|---|
| 1 | George_W._Bush | 4721 | 1 |
| 2 | William_Shakespeare | 3914 | 1 |
| 3 | Sidney_Lee | 3093 | 1 |
| 4 | Jesus | 2176 | 1 |
| 5 | Charles_II_of_England | 1519 | 1 |
| 6 | Aristotle | 1400 | 1 |
| 7 | Napoleon | 1361 | 1 |
| 8 | Muhammad | 1123 | 1 |
| 9 | Charlemagne | 949 | 1 |
| 10 | Plutarch | 925 | 1 |
| 11 | Julius_Caesar | 890 | 1 |
| 12 | William_III_of_England | 890 | 1 |
| 13 | Homer | 820 | 1 |
| 14 | Bede | 799 | 1 |
| 15 | Athanasius_of_Alexandria | 775 | 1 |
| 16 | Dante_Alighieri | 755 | 1 |
| 17 | Gautama_Buddha | 747 | 1 |
| 18 | Tiberius | 697 | 1 |
| 19 | Cyril_of_Alexandria | 684 | 1 |
| 20 | Bernard_of_Clairvaux | 655 | 1 |
| 21 | Moses | 645 | 1 |
| 22 | Tacitus | 610 | 1 |
| 23 | Edward_III_of_England | 582 | 1 |
| 24 | Justinian_I | 532 | 1 |
| 25 | David | 522 | 1 |
| 26 | Ashoka | 486 | 1 |
| 27 | Origen | 337 | 1 |
| 28 | Septimius_Severus | 334 | 1 |
| 29 | Polybius | 307 | 1 |
| 30 | Confucius | 302 | 1 |
| 31 | Alexander_Severus | 278 | 1 |
| 32 | Patriarch_Eutychius_of_Alexandria | 276 | 1 |
| 33 | Tutankhamun | 253 | 1 |
| 34 | Akhenaten | 238 | 1 |
| 35 | Ramesses_II | 228 | 1 |
| 36 | Pope_Benjamin_I_of_Alexandria | 172 | 1 |
| 37 | Teti | 151 | 1 |
| 38 | Amenemhat_II | 146 | 1 |
| 39 | Pepi_II_Neferkare | 145 | 1 |
| 40 | Merneith | 144 | 1 |
| 41 | Terence | 142 | 1 |
| 42 | Cato_the_Elder | 141 | 1 |
| 43 | Charles_Martel | 116 | 1 |
| 44 | Gilgamesh | 101 | 1 |
| 45 | Deborah | 89 | 1 |
| 46 | Lugalbanda | 68 | 1 |
| 47 | Kubaba | 65 | 1 |
| 48 | Fu_Xi | 12 | 1 |
| 49 | Henry_I_of_England | 417 | 0.986383431 |
| 50 | Petrarch | 254 | 0.981669694 |
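To make the two ranking criteria behind the list above concrete, here is a minimal Python sketch of PageRank by power iteration, combined with indegree as a tie-breaker. It is not the code used in the project. The damping factor of 0.85, the fixed number of iterations, the function names, and the four-node toy graph are assumptions for illustration. Dividing each score by the largest score in the network is likewise only our guess at why so many entries in the list have a PageRank of exactly 1.

```python
def pagerank(nodes, links, damping=0.85, iterations=100):
    """Power-iteration PageRank on directed (source, target) links."""
    nodes = list(nodes)
    n = len(nodes)
    out = {v: [] for v in nodes}
    for s, t in links:
        out[s].append(t)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for s in nodes:
            if out[s]:
                share = damping * rank[s] / len(out[s])
                for t in out[s]:
                    new[t] += share
        rank = new
    return rank


def indegree(nodes, links):
    """Number of incoming links per node."""
    deg = {v: 0 for v in nodes}
    for _, t in links:
        deg[t] += 1
    return deg


def rank_people(nodes, links):
    """Sort by normalised PageRank, then by indegree, both descending."""
    pr = pagerank(nodes, links)
    top = max(pr.values())  # assumed normalisation: best score becomes 1
    deg = indegree(nodes, links)
    rows = [(v, deg[v], pr[v] / top) for v in nodes]
    return sorted(rows, key=lambda r: (-r[2], -r[1]))


# Toy network with hypothetical people A to D.
nodes = ["A", "B", "C", "D"]
links = [("B", "A"), ("C", "A"), ("D", "A"), ("A", "B")]

ranked = rank_people(nodes, links)
for name, deg, score in ranked:
    print(name, deg, round(score, 3))
```

In the toy network, A is linked from all three other pages and therefore ends up on top with a normalised score of 1, followed by B, which receives A's only outgoing link.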
These influencers are primarily politicians (kings and
generals, in red), followed by religious leaders (black), and then poets and
historians (blue). It seems it pays to be a historian and write one's own place in history. This is clearly shown by Sidney Lee, a relatively minor Victorian professor of English and history, who wrote 800 biographies.
These networks can now be used to construct snapshots
of social networks of the key leaders through the ages. The following picture
shows the Wikipedia link network of 3000 BC to 2000 BC.
As we can see, the Egyptian Pharaohs dominate history in
that age, complemented by a tight cluster of Sumerian kings, and a cluster of
Chinese kings and princes.
Skipping 1000 years ahead, looking at 1000 BC to 0 BC, Alexander
the Great is the dominant figure, surrounded by a tight cluster of patricians of the Roman Republic. The Chinese emperors form a group at the top,
while the Indian emperor Ashoka is surrounded by other influencers from the
Indian subcontinent.
Making another huge leap to 1800 CE and looking at the 19th
century, the US takes center stage: Abraham Lincoln is the most influential
person, surrounded by a roster of US poets and scientists. Queen Victoria and
other European politicians and scientists form their own smaller and less tight-knit cluster, while
Chinese and Southeast Asian kings occupy comparatively peripheral positions.
We can also combine these networks into a movie spanning centuries. Below are the world’s leaders from year 0 to year 500, calculated with Condor. As the movie shows, in that age there are two dominant clusters, with the Roman and Chinese emperors at their centers. From 200 to 300 CE, the Chinese Golden Age of the Han dynasty, the Chinese cluster clearly surpasses the Roman cluster.
If there is one lesson from this preliminary experiment, it is the disproportionately large role of the historians. Not only is a minor 19th century biographer among the top 10 influencers of all time (which is of course more an artifact of our collection method), but classical historians like Polybius, Tacitus, and Plutarch also get very high ranks. Treating biographers and historians well so that they write positively about world leaders is of course no new insight. For instance, the Roman emperor Vespasian paid the historians Tacitus, Suetonius, Josephus and Pliny the Elder, and in return they speak suspiciously well of him, shaping his image in history. Caesar and Winston Churchill took this concept one step further and wrote their history themselves. As today's history is written in Wikipedia, the conclusion seems obvious: treat Wikipedians well!
That's really interesting! :) For those who are interested, I'd like to point towards http://pantheon.media.mit.edu that has a similar methodology.