  The Sociolinguistic speech corpus of Chilean Spanish (COSCACH): a socially stratified text, audio and video corpus with multiple speech styles

Sadowsky, S. (2022). The Sociolinguistic speech corpus of Chilean Spanish (COSCACH): a socially stratified text, audio and video corpus with multiple speech styles. International journal of corpus linguistics, 27(1): 19103.sad, pp. 93-125. doi:10.1075/ijcl.19103.sad.

Files

shh3141.pdf (Publisher version), 683KB
 
File Permalink: -
Name: shh3141.pdf
Description: -
OA-Status: -
Visibility: Private
MIME-Type / Checksum: application/pdf
Technical Metadata: -
Copyright Date: -
Copyright Info: -
License: -

Creators

Creators: Sadowsky, Scott (1), Author
Affiliations: (1) Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society, ou_2074311

Content

Free keywords: Chilean Spanish ; phonetics ; corpus design and construction ; sociolinguistics ; speech corpora
 Abstract: This paper presents the Sociolinguistic Speech Corpus of Chilean Spanish (COSCACH) v1.0, a 9.3-million-word corpus containing transcribed, lemmatized and morphologically tagged text, audio recordings and videos from 1,237 L1 speakers of Chilean Spanish, as well as a control sample of 21 non-Chilean L1 Spanish speakers. The COSCACH is the first freely available corpus of spoken Chilean Spanish of substantial size, as well as one of the largest speech corpora of any variety of Spanish. Following a review of other Chilean speech corpora, I describe how the COSCACH was constructed, covering corpus design, speaker recruitment and metadata collection, speech elicitation and recording, transcription, lemmatization and morphological tagging, and corpus compilation. I thereby aim to provide a blueprint for creating modern, large-scale speech corpora suitable for phonetic, sociophonetic and sociolinguistic research, in addition to traditional inquiry into semantics, lexis, grammar, pragmatics and discourse.

Details

Language(s): eng - English
 Dates: 2022-01-31; 2022-03
 Publication Status: Issued
 Pages: 33
 Publishing info: -
 Table of Contents: 1. Introduction
2. Other Chilean Spanish speech corpora
2.1 ESECH and PRESEEA-SA
2.2 King-ASR-290
2.3 Additional speech corpora
2.4 Justification for the COSCACH
3. Corpus design and speaker sampling
3.1 Chilean speaker samples
3.1.1 Speaker inclusion variables
A. Locality
B. Ethnicity
C. Lingualism
D. Age/Generation
E. Sex
F. Socioeconomic status
G. Year of recording
3.1.2 Derived variables
A. Region
B. Urbanness
C. Locality size
D. Distance and travel time from Santiago
3.2 Non-Chilean control sample
4. Data collection
4.1 Timeframe
4.2 Fieldworkers
4.3 Speaker recruitment
4.3.1 Recruitment procedures
4.3.2 Informed consent and institutional review board (IRB) approval
4.3.3 Further criteria for exclusion of speakers
4.4 Socio-demographic questionnaire
4.5 Elicitation tasks
4.5.1 Sustained pronunciation of isolated vowels
4.5.2 Reading of minimal pairs or other word lists
4.5.3 Reading of meaningful sentences
4.5.4 Reading of meaningful texts
4.5.5 Conversational interview
4.5.6 Language attitudes interview
4.6 Recording
4.6.1 Audio equipment and configuration
4.6.2 Audio post-processing
4.6.3 Video recording equipment and procedures
5. Transcription, text processing and corpus compilation
5.1 Transcription
5.2 Anonymization and protection of speakers’ privacy
5.3 Text extraction and annotation
5.4 Corpus compilation and use
6. Availability and access
7. Conclusions and future directions
 Rev. Type: Peer
 Identifiers: DOI: 10.1075/ijcl.19103.sad
Other: shh3141
 Degree: -

Source 1

Title: International journal of corpus linguistics
Abbreviation: IJCL
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: Amsterdam [u.a.] : Benjamins
Pages: -
Volume / Issue: 27 (1)
Sequence Number: 19103.sad
Start / End Page: 93 - 125
Identifier: ISSN: 1569-9811
CoNE: https://pure.mpg.de/cone/journals/resource/1569-9811