
Item Details


Released

Journal Article

Pitfalls in corpus research

MPS-Authors

Ernestus, Mirjam
Language Comprehension Group, MPI for Psycholinguistics, Max Planck Society;
Center for Language Studies, external;
Decoding Continuous Speech, MPI for Psycholinguistics, Max Planck Society;

Fulltext (public)

Rietveld_2004_pitfalls.pdf
(Publisher version), 166KB

Supplementary Material (public)
There is no public supplementary material available
Citation

Rietveld, T., Van Hout, R., & Ernestus, M. (2004). Pitfalls in corpus research. Computers and the Humanities, 38(4), 343-362. doi:10.1007/s10579-004-1919-1.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-1762-B
Abstract
This paper discusses some pitfalls in corpus research and suggests solutions on the basis of examples and computer simulations. We first address reliability problems in language transcriptions, agreement between transcribers, and how disagreements can be dealt with. We then show that the frequencies of occurrence obtained from a corpus cannot always be analyzed with the traditional χ² test, as corpus data are often not sequentially independent and unit independent. Next, we stress the relevance of the power of statistical tests, and the sizes of statistically significant effects. Finally, we point out that a t-test based on log odds often provides a better alternative to a χ² analysis based on frequency counts.
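
The contrast drawn in the abstract between a χ² analysis of pooled frequency counts and a t-test on log odds can be illustrated with a minimal Python sketch. It is not taken from the paper: the per-speaker counts are made up, the function names are hypothetical, the 0.5 correction in the log odds is one common convention, and Welch's unequal-variance t statistic is assumed as the form of the t-test. The point is only that pooling tokens treats every token as independent, whereas the log-odds approach treats each speaker (unit) as one observation.

```python
import math
from statistics import mean, variance


def log_odds(hits, total):
    """Log odds for one speaker, with a 0.5 correction to avoid log(0)."""
    return math.log((hits + 0.5) / (total - hits + 0.5))


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def pooled_chi2(group_a, group_b):
    """Pearson chi-square (1 df) on the 2x2 table that pools all speakers."""
    rows = []
    for group in (group_a, group_b):
        hits = sum(h for h, _ in group)
        total = sum(n for _, n in group)
        rows.append([hits, total - hits])
    grand = sum(map(sum, rows))
    cols = [rows[0][j] + rows[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = sum(rows[i]) * cols[j] / grand
            chi2 += (rows[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical data: (tokens showing the variant, total tokens) per speaker.
# One prolific speaker in group A contributes most of that group's tokens.
group_a = [(90, 100), (5, 20), (4, 20), (6, 20)]
group_b = [(5, 20), (6, 20), (4, 20), (5, 20)]

chi2 = pooled_chi2(group_a, group_b)
t, df = welch_t(
    [log_odds(h, n) for h, n in group_a],
    [log_odds(h, n) for h, n in group_b],
)
print(f"pooled chi-square = {chi2:.2f} (1 df)")
print(f"t on per-speaker log odds = {t:.2f} (df = {df:.1f})")
```

With these invented counts the pooled χ² statistic lies far above the conventional 3.84 threshold for one degree of freedom, while the t statistic on per-speaker log odds stays small, because the apparent group difference is carried by a single speaker.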