English
 
Help Privacy Policy Disclaimer
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT
 
 
DownloadE-Mail
  Data Science Methods for the Analysis of Controversial Social Media Discussions

Guimarães, A. (2022). Data Science Methods for the Analysis of Controversial Social Media Discussions. PhD Thesis, Universität des Saarlandes, Saarbrücken. doi:10.22028/D291-36502.

Item is

Files

show Files

Locators

show
hide
Description:
-
OA-Status:
Green

Creators

show
hide
 Creators:
Guimarães, Anna1, 2, Author           
Weikum, Gerhard1, Advisor           
de Melo, Gerard1, Referee           
Yates, Andrew1, Referee           
Affiliations:
1Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018              
2International Max Planck Research School, MPI for Informatics, Max Planck Society, Campus E1 4, 66123 Saarbrücken, DE, ou_1116551              

Content

show
hide
Free keywords: -
 Abstract: Social media communities like Reddit and Twitter allow users to express their views on
topics of their interest, and to engage with other users who may share or oppose these views.
This can lead to productive discussions towards a consensus, or to contended debates, where
disagreements frequently arise.
Prior work on such settings has primarily focused on identifying notable instances of antisocial
behavior such as hate-speech and “trolling”, which represent possible threats to the health of
a community. These, however, are exceptionally severe phenomena, and do not encompass
controversies stemming from user debates, differences of opinions, and off-topic content, all
of which can naturally come up in a discussion without going so far as to compromise its
development.
This dissertation proposes a framework for the systematic analysis of social media discussions
that take place in the presence of controversial themes, disagreements, and mixed opinions from
participating users. For this, we develop a feature-based model to describe key elements of a
discussion, such as its salient topics, the level of activity from users, the sentiments it expresses,
and the user feedback it receives.
Initially, we build our feature model to characterize adversarial discussions surrounding
political campaigns on Twitter, with a focus on the factual and sentimental nature of their
topics and the role played by different users involved. We then extend our approach to Reddit
discussions, leveraging community feedback signals to define a new notion of controversy
and to highlight conversational archetypes that arise from frequent and interesting interaction
patterns. We use our feature model to build logistic regression classifiers that can predict future
instances of controversy in Reddit communities centered on politics, world news, sports, and
personal relationships. Finally, our model also provides the basis for a comparison of different
communities in the health domain, where topics and activity vary considerably despite their
shared overall focus. In each of these cases, our framework provides insight into how user
behavior can shape a community’s individual definition of controversy and its overall identity.

Details

show
hide
Language(s): eng - English
 Dates: 2022-06-1020222022
 Publication Status: Issued
 Pages: 94 p.
 Publishing info: Saarbrücken : Universität des Saarlandes
 Table of Contents: -
 Rev. Type: -
 Identifiers: BibTex Citekey: Decarvalhophd2021
DOI: 10.22028/D291-36502
URN: nbn:de:bsz:291--ds-365021
Other: hdl:20.500.11880/33161
 Degree: PhD

Event

show

Legal Case

show

Project information

show

Source

show