Fact-checking Fake News
"It's easy to lie with statistics; it is easier to lie without them."
What is fact? And what is fiction? What might be seen as a
fact by one person is seen as fake news by somebody else. Depending on their
political orientation and cultural background, people quickly categorize news as
fake or fact.
When right-wing fanatics constructed “pizzagate” at the beginning of
November 2016, they claimed that the owners and customers of a
popular pizza restaurant in Washington were running a covert pedophile
operation, directed by a group of people around Hillary Clinton. The mainstream
press agreed that this was a fake news smear campaign constructed to damage
Hillary Clinton’s reputation and the liberal agenda. Nonetheless, a significant
group of the US population took the rumor at face value (see my previous blog post).
Even “facts” published in highly respected newspapers such
as the New York Times can be seen as fiction by other news media. For instance,
in a recent article in the New York Times, whistleblower
Edward Snowden was depicted as a puppet of Russian spy agencies in a report
produced by US government agencies. The
report listed various claims by US intelligence agencies as “facts”, which,
according to other journalists, were not true.
In God we trust. All others must bring data (W. Edwards Deming)
To make sense of emerging news and to decide whether to
categorize it as fact or fiction, it would be useful to track its origin
and identify the main promoters of a particular news item. Harvard statistician
Gary King and his colleagues have done just that, tracking the flow of fake
news in China. According to a Chinese urban myth, there are up to 2 million
microbloggers in China who are paid “50cent” per post by the Chinese government
to drown out critical voices on social media and spread news favorable to the
government. In a research paper, King and his team identified the
“50cent” microbloggers spreading news supporting the Chinese communist party on
Sina Weibo and other Chinese blogs. They grouped the posts into five
categories: (1) taunting of foreign countries, (2) argumentative praise, (3)
non-argumentative praise, (4) factual reporting, (5) cheerleading. Using sophisticated
statistical and machine learning methods to mine an e-mail archive leaked from
the Internet Propaganda Office of Zhanggong district, they showed that these
“50cent” bloggers primarily engage in a massive amount of positive cheerleading
with little to no central oversight, to some extent debunking the urban myth of
a vast shadow army of bloggers at the beck and call of the Chinese government.
However, the key problem with the analysis of Gary King and
his team is that the analysis tools they used are so complex that only somebody
with a graduate degree in statistics has a chance of understanding them, and nobody
except the team doing the analysis has full insight into the results. As Winston
Churchill reputedly said, “Do not trust any statistics you did not fake
yourself.” The average reader thus has close
to zero chance of actually understanding why the statisticians came to their
conclusion. It therefore boils down to trust: does the reader trust the conclusions
of the analyst/statistician/journalist?
Faith-based and Science-based Belief Systems
As has been repeatedly shown, humans are much more likely to
trust and accept as true news that is close to their own beliefs and values. This
means that whether a particular news item is accepted as fact or as fiction
depends very much on the belief system of the individual. Each individual has to
decide for herself or himself what is fact and what is fiction.
At least for the Western world, I therefore group the major
belief systems into two opposite stereotypes:
- Faith-focused: Believing in God, nationalistic, supporting the military, less formal academic education.
- Science-focused: Believing in science, political correctness, with advanced academic (college) education.
In the US electorate, there is high overlap between the
faith-focused segment and Republicans, while the science-focused demographic
leans more Democratic. As conservative radio show host Rush Limbaugh said, “…fake news is the everyday news”. According to Limbaugh, … mainstream media “… they just make it up.”
Tracing the Source of Rumors – Turning it into Fake or Fact
To make up one’s own mind about a new rumor, it is therefore
extremely helpful to see who is supporting a particular claim, and to find out
where it originates. For example, the talk pages of Wikipedia articles are
an excellent starting point for drilling down on fake news. For instance, a piece of fake news about the Berggruen Institute - a perfectly legitimate institution - posted right
on the Institute’s Wikipedia talk page claims that the Berggruen Institute is a “shill for US intelligence/related
functions”.
The following example using Condor Coolhunting illustrates
how to find the influencers behind a rumor, in this example about “fake news”
itself, and shows how to identify their belief system:
To gain a quick overview of the most
influential people tweeting about “fake news” in the sense of Rush Limbaugh, I
collected 18,000 tweets on December 27, 2016 with the hashtag #fake2016facts.
The picture below shows the retweet network. Note the connected component in
the core, with just three people being highly central, and the “asteroid belt”
in the periphery, made up of people whose tweets are ignored and vanish into
the void.
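The structure described above can be illustrated with a minimal sketch. It is not Condor's implementation; the toy data, the user names, and the helper functions `build_retweet_graph` and `connected_components` are assumptions made purely for illustration. The sketch treats each retweet as a link between two users, takes the largest connected component as the "core", and treats everybody else as the "asteroid belt". Centrality is approximated here simply by the number of retweets a user receives.

```python
from collections import deque

# Hypothetical toy data: (author, retweeted_author), with None for an original tweet.
TWEETS = [
    ("ann", None), ("bob", "ann"), ("cat", "ann"), ("dan", "bob"),
    ("eve", "ann"), ("fay", None), ("gus", None), ("hal", "gus"),
]


def build_retweet_graph(tweets):
    """Return (neighbors, retweets_received) for an undirected retweet network."""
    neighbors = {}
    retweets_received = {}
    for author, source in tweets:
        neighbors.setdefault(author, set())  # isolated authors are nodes too
        if source is not None and source != author:
            neighbors.setdefault(source, set())
            neighbors[author].add(source)
            neighbors[source].add(author)
            retweets_received[source] = retweets_received.get(source, 0) + 1
    return neighbors, retweets_received


def connected_components(neighbors):
    """Return the connected components as sets, largest first (breadth-first search)."""
    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        components.append(component)
    return sorted(components, key=len, reverse=True)


neighbors, retweets_received = build_retweet_graph(TWEETS)
components = connected_components(neighbors)
core = components[0]                         # largest connected component
periphery = set().union(*components[1:])     # the "asteroid belt"
# Rank core members by retweets received; ties are broken alphabetically.
most_retweeted = sorted(
    core, key=lambda user: (-retweets_received.get(user, 0), user)
)[:3]

print("core:", sorted(core))
print("periphery:", sorted(periphery))
print("most retweeted in core:", most_retweeted)
```

On real data, a more refined centrality measure (for example betweenness) would likely replace the simple retweet count used in this sketch.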
When running Condor’s influence determination algorithm,
which looks at who injects new words into the discussion first, and how quickly
these words are picked up by others, we find that the most influential people
are not the same as those identified in the previous picture. Rather, a new group of
influencers emerges, which is also part of the connected component in the
center, but somewhat more peripheral in the network. Their tweets are picked up
by more prominent and popular bloggers, who then spread them to the rest of the
twittersphere.
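The idea behind this kind of influence measure can be sketched in a few lines. This is only a toy illustration of the principle stated above (who uses a word first, and how quickly others pick it up), not Condor's actual algorithm. The data, the function name `word_influence`, and the 60-minute `window` parameter are all assumptions.

```python
from collections import defaultdict

# Hypothetical toy data: (timestamp_in_minutes, user, text).
TWEETS = [
    (0, "ann", "media hoax exposed"),
    (5, "bob", "hoax again"),
    (7, "cat", "total hoax media"),
    (20, "bob", "rigged polls"),
    (200, "ann", "rigged"),
]


def word_influence(tweets, window=60):
    """Score each user by how many others adopt the words they introduced.

    A user "introduces" a word by being the first to tweet it. Another user
    counts as an adopter only if they use that word within `window` minutes
    of its first appearance, so slow pick-up earns no credit.
    """
    first_use = {}               # word -> (time, user) of first appearance
    adopters = defaultdict(set)  # word -> users who picked it up in time
    for time, user, text in sorted(tweets, key=lambda tweet: tweet[0]):
        for word in set(text.lower().split()):
            if word not in first_use:
                first_use[word] = (time, user)
                continue
            first_time, originator = first_use[word]
            if user != originator and time - first_time <= window:
                adopters[word].add(user)

    scores = defaultdict(int)
    for word, (_, originator) in first_use.items():
        scores[originator] += len(adopters[word])
    return dict(scores)


scores = word_influence(TWEETS)
for user, score in sorted(scores.items(), key=lambda item: (-item[1], item[0])):
    print(user, score)
```

In this toy example "ann" scores highest because two other users reuse her words within the window, even though she need not be the most retweeted user. This mirrors the observation that the most influential people can differ from the most central ones.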
Looking at the content of the tweets about #fake2016facts, we
find that the tweeters like Trump, Obama, and Jesus (shown in green), and loathe
Hillary Clinton, the election, Russia, Russians, and (some) Americans (shown in red), but not America. Black words
are neutral.
Next I analyzed the contents of the self-descriptions of the
people tweeting about #fake2016facts. Words like Trump, America, Christian, God,
Family, and Mom appear in a positive context (shown in green), while words like
conservative, politics, and lists are also popular, but used in a negative
context (shown in red).
To sum up, it seems that the people tweeting about #fake2016facts, who show a high distrust of mainstream media, are predominantly part of the faith-based belief system.
GalaxyScope - Our Web Tool to Find Influencers
We have created an early prototype of a tool that allows anybody to enter a few keywords describing a “fake news candidate” and see who has been speaking about it on Twitter, where it was mentioned on Wikipedia, and on which blogs and Web sites it prominently appears. The screenshot below shows the search results for “pizzagate”.
Green nodes are Wikipedia pages, orange nodes are Twitter
users, and blue nodes are Web sites and people mentioned on Blogs and Web
sites.
The picture below shows another fake news candidate, looking
at the social media network emerging from the search for “DNC hack”, the
suspected break-in by the Russian secret service into the e-mail server of the
Democratic National Committee right before the 2016 US presidential election.
You can try it out for yourself by visiting
“scope.galaxyadvisors.com” and clicking on “people scope”. Let me know if you find some interesting fake news networks.
Excellent use of mapping. It is also possible to use a similar approach to mapping the knowledge itself. Here for example, the relatively disconnected ideas in Trump's economic plan show that the plan is unlikely to succeed: https://kumu.io/Steve/scoring-the-score-trumps-economic-policy#scoring-trumps-economic-policy