A New Database for Italian Parliamentary Speeches: Introducing the 
ItaParlCorpus Dataset

Cova, Joshua

doi:10.1017/ipo.2025.6

Journal Article

MPS-Authors

Cova, Joshua
Politische Ökonomie, MPI for the Study of Societies, Max Planck Society

External Resource

https://doi.org/10.1017/ipo.2025.6
(Publisher version)

https://doi.org/10.7910/DVN/KUARWD
(Research data)

Fulltext (public)

IPSR_2025_Cova.pdf
(Any fulltext), 526KB

Citation

Cova, J. (2025). A New Database for Italian Parliamentary Speeches: Introducing the ItaParlCorpus Dataset. Italian Political Science Review. doi:10.1017/ipo.2025.6.

Cite as: https://hdl.handle.net/21.11116/0000-0010-E43C-A

Abstract

A common challenge in studying Italian parliamentary discourse is the lack of accessible, machine-readable, and systematized parliamentary data. To address this, this article introduces the ItaParlCorpus dataset, a new, annotated, machine-readable collection of plenary speeches from the Camera dei Deputati, the lower house of the Italian Parliament, spanning 1948 to 2022. The dataset comprises 470 million words and 2.4 million speeches delivered by 5830 unique speakers representing 77 different political parties. The files are designed for easy processing and analysis in widely used programming languages, and they include metadata such as speaker identification and party affiliation. The dataset opens up opportunities for in-depth analyses of topics related to parliamentary behavior, elite rhetoric, and the salience of political themes, including how these vary across party families and over time.