A two-pass strategy for handling OOVs in a large vocabulary recognition task

Scharenborg, Odette; Seneff, S.

アイテム詳細

登録内容を編集ファイル形式で保存

一時保存へ追加

タグ情報を表示リリース履歴を表示詳細要約

公開

会議論文

A two-pass strategy for handling OOVs in a large vocabulary recognition task

MPS-Authors

There are no MPG-Authors in the publication available

External Resource

There are no locators available

Fulltext (restricted access)

There are currently no full texts shared for your IP range.

フルテキスト (公開)

12116927d01.pdf
(出版社版), 120KB

付随資料 (公開)

There is no public supplementary material available

引用

Scharenborg, O., & Seneff, S. (2005). A two-pass strategy for handling OOVs in a large vocabulary recognition task. In Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, (pp. 1669-1672). ISCA Archive.

引用: https://hdl.handle.net/11858/00-001M-0000-0012-D235-F

要旨

This paper addresses the issue of large-vocabulary recognition in a specific word class. We propose a two-pass strategy in which only major cities are explicitly represented in the first stage lexicon. An unknown word model encoded as a phone loop is used to detect OOV city names (referred to as rare city names). After which SpeM, a tool that can extract words and word-initial cohorts from phone graphs on the basis of a large fallback lexicon, provides an N-best list of promising city names on the basis of the phone sequences generated in the first stage. This N-best list is then inserted into the second stage lexicon for a subsequent recognition pass. Experiments were conducted on a set of spontaneous telephone-quality utterances each containing one rare city name. We tested the size of the N-best list and three types of language models (LMs). The experiments showed that SpeM was able to include nearly 85% of the correct city names into an N-best list of 3000 city names when a unigram LM, which also boosted the unigram scores of a city name in a given state, was used.