

Poster

Quantifying cue trading in word decoding tasks

MPS-Authors

Scharenborg, Odette
Adaptive Listening, MPI for Psycholinguistics, Max Planck Society

Citation

Scharenborg, O., & Ten Bosch, L. (2011). Quantifying cue trading in word decoding tasks. Poster presented at The 17th Annual Conference on Architectures and Mechanisms for Language Processing [AMLaP 2011], Paris, France.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0012-1965-B
Abstract
Listeners can make use of multiple acoustic cues for each phonological contrast, and it is well known that the absence of some cues may be compensated for by the presence of others. In this paper, we investigate cue trading in the broader context of speech processing by using a computational model of human word processing (cf. Werker & Curtin, 2005). Cue trading has been considered an explanatory mechanism for phoneme perception, see e.g. the Fuzzy Logical Model of Perception (FLMP; Massaro & Oden, 1980) and normal a posteriori probability (NAPP) models (Nearey, 1997). Both NAPP and FLMP deal with probabilistic phone classification and treat cue weighting as a category-dependent process. This, however, leaves open the question of to what extent cue trading plays a role in word or speech processing, a broader context than the speech sound categorization in which cue trading has conventionally been studied. The approach presented here allows a precise quantification of the amount of cue trading observed during speech decoding on a speech corpus.

Cue trading must be learned. It therefore makes sense to seek mechanisms that explain cue integration and weighting as the result of an acquisition process. Toscano & McMurray (2010) show that cue weighting provides a good fit to perceptual data, but only when the weights emerge through the dynamics of learning. In line with Toscano & McMurray (2010), we address cue trading as a result of learning.

We developed a method to quantify cue trading between articulatory features (AFs; e.g. Browman & Goldstein, 1992) as it operates during a word decoding task. AFs describe the speech signal in terms of estimated values of, e.g., manner and place of articulation (see Table I). This representation allows more freedom in the description of the speech signal than a phonemic description. The model used is HMM-based: the phone models were conventionally defined as Hidden Markov Models, and lexical items were defined as sequences of phones. In contrast with conventional ASR training, however, the phone models were initialized (without training) using the canonical articulatory feature definitions in Table I. The HMM paradigm enables us to adapt these parameters during an actual decoding task, such that the resulting parameters can be interpreted as cue weights (cf. McMurray, Aslin, & Toscano, 2009). The cue weights are directly interpretable as measures of sensitivity to changes in any of the features. This method relates to the way Clayards, Tanenhaus, Aslin, and Jacobs (2008) demonstrated, for a different task, that artificially manipulating the variance of an acoustic cue changes how listeners weight it perceptually.

The model was applied to 2000 Dutch utterances from the CAREGIVER database (Altosaar et al., 2010). To that end, the utterances were represented as sequences of AF vectors. Figure 1 shows the optimal phone-dependent cue weighting found for each of the 33 features in six situations: without any training and after each of five adaptation passes. Of all the AFs considered, manner and place of articulation are the most relevant (as shown by their higher weights in Figure 1) in terms of their contribution to word competition and word decoding. In summary, the model is able to uncover cue trading within the AF representation using actual speech, in a psycholinguistically interpretable way. It will be used in an update of Fine-Tracker (Scharenborg, 2010).
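
To make the cue-weight idea concrete, here is a minimal, hypothetical sketch (not the authors' implementation): phones are represented by canonical AF vectors, an observed AF frame is matched against a phone with a weighted score, and the per-feature weights are set from cue reliability (inverse error variance), echoing Clayards et al.'s (2008) finding that low-variance cues are weighted more heavily. The HMM adaptation described above is replaced by this simple estimate; all names and numbers are illustrative.

```python
import numpy as np

def estimate_cue_weights(frames, canonical):
    """Illustrative stand-in for the adapted cue weights: weights are set
    inversely proportional to each feature's error variance, so reliable
    (low-variance) cues receive higher weight."""
    err_var = np.var(frames - canonical, axis=0) + 1e-6
    w = 1.0 / err_var
    return w / w.sum()                       # normalise so the weights sum to 1

def weighted_match(frame, canonical, weights):
    """Weighted (negative squared-error) match of one AF frame to a phone."""
    return -np.sum(weights * (frame - canonical) ** 2)

# Toy usage with 33 AF dimensions, as in the abstract (values are made up).
rng = np.random.default_rng(0)
canonical = rng.integers(0, 2, 33).astype(float)         # canonical AF values for one phone
noise_sd = np.linspace(0.1, 0.5, 33)                     # some cues noisier than others
frames = canonical + rng.normal(0, noise_sd, (200, 33))  # simulated AF observations
weights = estimate_cue_weights(frames, canonical)
print(weights.round(3))                      # low-noise features end up with higher weight
print(weighted_match(frames[0], canonical, weights))
```

In this sketch a missing or degraded cue simply lowers one term of the weighted score and can be offset by other, more reliable cues, which is the trading behaviour the poster quantifies within the full HMM-based word decoder.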