Using suffix arrays as language models: Scaling the n-gram

Stehouwer, Herman; van Zaanen, Menno

Conference Paper

Fulltext (public)

bnaic10.pdf (Any fulltext), 170KB

Citation

Stehouwer, H., & van Zaanen, M. (2010). Using suffix arrays as language models: Scaling the n-gram. In Proceedings of the 22nd Benelux Conference on Artificial Intelligence (BNAIC 2010), October 25-26, 2010.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0012-3E79-E

Abstract

In this article, we propose the use of suffix arrays to implement n-gram language models with practically unlimited size n. These unbounded n-grams are called ∞-grams. This approach allows us to use large contexts efficiently to distinguish between different alternative sequences while applying synchronous back-off. From a practical point of view, the approach has been applied within the context of spelling confusibles, verb and noun agreement, and prenominal adjective ordering. These initial experiments show promising results, and we relate the performance to the size of the n-grams used for disambiguation.
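
To illustrate the general idea behind the abstract, the following is a minimal sketch of how a suffix array over a tokenized corpus can return the count of an n-gram of any length with two binary searches. It is not the authors' implementation: the function names `build_suffix_array` and `count_ngram` and the toy corpus are hypothetical, the construction is deliberately naive, and the synchronous back-off strategy mentioned in the abstract is not implemented here.

```python
def build_suffix_array(tokens):
    """Return the start positions of all suffixes, sorted by the suffix itself.

    Naive construction for illustration only; real corpora would need a
    dedicated suffix array construction algorithm.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def count_ngram(tokens, suffix_array, ngram):
    """Count occurrences of `ngram` (a token sequence of any length).

    All suffixes that start with `ngram` form one contiguous block in the
    sorted suffix array, so two binary searches locate the block's bounds.
    """
    ngram = list(ngram)
    n = len(ngram)

    def boundary(past_matches):
        # Binary search for the first position whose n-token prefix is
        # >= ngram (past_matches=False) or > ngram (past_matches=True).
        lo, hi = 0, len(suffix_array)
        while lo < hi:
            mid = (lo + hi) // 2
            start = suffix_array[mid]
            prefix = tokens[start:start + n]
            if prefix < ngram or (past_matches and prefix == ngram):
                lo = mid + 1
            else:
                hi = mid
        return lo

    return boundary(True) - boundary(False)


# Toy corpus (hypothetical example data).
corpus = "the cat sat on the mat the cat ran".split()
sa = build_suffix_array(corpus)

print(count_ngram(corpus, sa, ["the"]))                # 3
print(count_ngram(corpus, sa, ["the", "cat"]))         # 2
print(count_ngram(corpus, sa, ["the", "cat", "sat"]))  # 1
print(count_ngram(corpus, sa, ["dog"]))                # 0
```

Because the lookup cost depends on the query length and the logarithm of the corpus size rather than on a fixed n chosen in advance, the same structure can serve queries of any length, which is what makes unbounded n-grams practical in this kind of setup.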