  Token merging in language model-based confusible disambiguation

Stehouwer, H., & Van Zaanen, M. (2009). Token merging in language model-based confusible disambiguation. In T. Calders, K. Tuyls, & M. Pechenizkiy (Eds.), Proceedings of the 21st Benelux Conference on Artificial Intelligence (pp. 241-248).

Files

bnaic2009_paper_76.pdf (Any fulltext), 103KB
Name:
bnaic2009_paper_76.pdf
Description:
-
OA-Status:
Visibility:
Public
MIME-Type / Checksum:
application/pdf / [MD5]
Technical Metadata:
Copyright Date:
-
Copyright Info:
-
License:
-

Creators

 Creators:
Stehouwer, Herman (1), Author
Van Zaanen, Menno (1), Author
Affiliations:
(1) Tilburg University, ou_persistent22

Content

Free keywords: -
 Abstract: In the context of confusible disambiguation (spelling correction that requires context), the synchronous back-off strategy combined with traditional n-gram language models performs well. However, when alternatives consist of a different number of tokens, this classification technique cannot be applied directly, because the computation of the probabilities is skewed. Previous work already showed that probabilities based on n-grams of different orders should not be compared directly. In this article, we propose new probability metrics in which the size of n is varied according to the number of tokens of the confusible alternative. This requires access to n-grams of variable length. Results show that the synchronous back-off method is extremely robust. We discuss the use of suffix trees as a technique to store variable-length n-gram information efficiently.

Details

Language(s): eng - English
 Dates: 2009
 Publication Status: Issued
 Pages: 6
 Publishing info: -
 Table of Contents: -
 Rev. Type: Peer
 Identifiers: -
 Degree: -

Event

Title: BNAIC 2009. Benelux Conference on Artificial Intelligence
Place of Event: Eindhoven
Start-/End Date: -

Source 1

Title: Proceedings of the 21st Benelux Conference on Artificial Intelligence
Source Genre: Proceedings
 Creator(s):
Calders, Toon, Editor
Tuyls, Karl, Editor
Pechenizkiy, Mykola, Editor
Affiliations:
-
Publ. Info: -
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 241 - 248
Identifier: -