Empirical Evaluation of Common Assumptions in Building Political Bias Datasets


Ganguly, Soumen
International Max Planck Research School, MPI for Informatics, Max Planck Society


Ganguly, S. (2019). Empirical Evaluation of Common Assumptions in Building Political Bias Datasets. Master Thesis, Universität des Saarlandes, Saarbrücken.

Cite as: https://hdl.handle.net/21.11116/0000-0002-B37F-6
In today’s world, bias and polarization are some of the biggest problems plaguing our
society. In such volatile environments, news media play a crucial role as gatekeepers
of information. Given the huge impact they can have on societal evolution, they have
long been studied by researchers. Researchers and practitioners often build political bias
datasets for a variety of tasks ranging from examining bias of news outlets and articles
to studying and designing algorithmic news retrieval systems for online platforms. Often,
researchers make certain simplifying assumptions in building such datasets.

In this thesis, we empirically validate three such common assumptions, given the
importance of such datasets. The three assumptions are: (i) raters’ political leaning does
not affect their ratings of political articles, (ii) news articles follow the leaning of their
source outlet, and (iii) the political leaning of a news outlet is stable across reporting on
different topics. We constructed a manually annotated ground-truth dataset of news
articles on “Gun policy” and “Immigration”, published by several popular news media
outlets in the U.S., with political leaning labels collected using Amazon Mechanical
Turk, and used it to validate these assumptions.

Our findings suggest that (i) in certain cases, liberal and conservative raters label
the leanings of news articles differently, (ii) in many cases, news articles do not follow the
political leaning of their source outlet, and (iii) for certain outlets, the political leaning
of the outlet does not remain stable across reporting on different topics/issues. We
believe our work offers important guidelines for future attempts at building political
bias datasets, which in turn will help researchers and practitioners build better
algorithmic news retrieval systems for online platforms.