Data Science Methods for the Analysis of Controversial Social Media Discussions

Guimarães, Anna

doi:10.22028/D291-36502

Guimarães, A. (2022). Data Science Methods for the Analysis of Controversial Social Media Discussions. PhD Thesis, Universität des Saarlandes, Saarbrücken. doi:10.22028/D291-36502.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-000A-CDF7-9 Version Permalink: https://hdl.handle.net/21.11116/0000-000C-6E64-9

Genre: Thesis

Locators

Locator:
https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/33161 (Any fulltext) Open Access Green

Description:
-

OA-Status:
Green

Creators

Creators:
Guimarães, Anna¹,², Author
Weikum, Gerhard¹, Advisor
de Melo, Gerard¹, Referee
Yates, Andrew¹, Referee

Affiliations:
¹ Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018
² International Max Planck Research School, MPI for Informatics, Max Planck Society, Campus E1 4, 66123 Saarbrücken, DE, ou_1116551

Content

Free keywords: -

Abstract: Social media communities like Reddit and Twitter allow users to express their views on topics of their interest, and to engage with other users who may share or oppose these views. This can lead to productive discussions towards a consensus, or to contended debates, where disagreements frequently arise.
Prior work on such settings has primarily focused on identifying notable instances of antisocial behavior such as hate-speech and “trolling”, which represent possible threats to the health of a community. These, however, are exceptionally severe phenomena, and do not encompass controversies stemming from user debates, differences of opinions, and off-topic content, all of which can naturally come up in a discussion without going so far as to compromise its development.
This dissertation proposes a framework for the systematic analysis of social media discussions that take place in the presence of controversial themes, disagreements, and mixed opinions from participating users. For this, we develop a feature-based model to describe key elements of a discussion, such as its salient topics, the level of activity from users, the sentiments it expresses, and the user feedback it receives.
Initially, we build our feature model to characterize adversarial discussions surrounding political campaigns on Twitter, with a focus on the factual and sentimental nature of their topics and the role played by different users involved. We then extend our approach to Reddit discussions, leveraging community feedback signals to define a new notion of controversy and to highlight conversational archetypes that arise from frequent and interesting interaction patterns. We use our feature model to build logistic regression classifiers that can predict future instances of controversy in Reddit communities centered on politics, world news, sports, and personal relationships. Finally, our model also provides the basis for a comparison of different communities in the health domain, where topics and activity vary considerably despite their shared overall focus. In each of these cases, our framework provides insight into how user behavior can shape a community’s individual definition of controversy and its overall identity.

Details

Language(s): eng - English

Dates: Accepted: 2022-06-10; Published Online: 2022; Date issued: 2022

Publication Status: Issued

Pages: 94 p.

Publishing info: Saarbrücken : Universität des Saarlandes

Table of Contents: -

Rev. Type: -

Identifiers: BibTex Citekey: Decarvalhophd2021
DOI: 10.22028/D291-36502
URN: nbn:de:bsz:291--ds-365021
Other: hdl:20.500.11880/33161

Degree: PhD
