日本語
 
Help Privacy Policy ポリシー/免責事項
  詳細検索ブラウズ

アイテム詳細

  Deriving a Web-scale Common Sense Fact Knowledge Base

Tandon, N. (2011). Deriving a Web-scale Common Sense Fact Knowledge Base. Master Thesis, Universität des Saarlandes, Saarbrücken.

Item is

基本情報

表示: 非表示:
資料種別: 学位論文

ファイル

表示: ファイル
非表示: ファイル
:
2011_Niket_Tandon_Thesis.pdf (全文テキスト(全般)), 809KB
 
ファイルのパーマリンク:
-
ファイル名:
2011_Niket_Tandon_Thesis.pdf
説明:
-
OA-Status:
閲覧制限:
制限付き (Max Planck Institute for Informatics, MSIN; )
MIMEタイプ / チェックサム:
application/pdf
技術的なメタデータ:
著作権日付:
-
著作権情報:
-
CCライセンス:
-

関連URL

表示:

作成者

表示:
非表示:
 作成者:
Tandon, Niket1, 2, 著者           
Weikum, Gerhard1, 学位論文主査           
Theobalt, Christian3, 監修者                 
所属:
1Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018              
2International Max Planck Research School, MPI for Informatics, Max Planck Society, ou_1116551              
3Computer Graphics, MPI for Informatics, Max Planck Society, ou_40047              

内容説明

表示:
非表示:
キーワード: -
 要旨: The fact that birds have feathers and ice is cold seems trivially true. Yet, most machine-readable sources of knowledge either lack such common sense facts entirely or have only limited coverage. Prior work on automated knowledge base construction has largely focused on relations between named entities and on taxonomic knowledge, while disregarding common sense properties. Extracting such structured data from text is challenging, especially due to the scarcity of explicitly expressed knowledge. Even when relying on large document collections, patternbased information extraction approaches typically discover insufficient amounts of information. This thesis investigates harvesting massive amounts of common sense knowledge using the textual knowledge of the entire Web, yet staying away from the massive engineering efforts in procuring such a large corpus as a Web. Despite the advancements in knowledge harvesting, we observed that the state of the art methods were limited in terms of accuracy and discovered insufficient amounts of information under our desired setting. This thesis shows how to gather large amounts of common sense facts from Web N-gram data, using seeds from the existing knowledge bases like ConceptNet. Our novel contributions include scalable methods for tapping onto Web-scale data and a new scoring model to determine which patterns and facts are most reliable. An extensive experimental evaluation is provided for three different binary relations, comparing different sources of n-gram data as well as different algorithms. The experimental results show that this approach extends ConceptNet by many orders of magnitude (more than 200-fold) at comparable levels of precision.

資料詳細

表示:
非表示:
言語: eng - English
 日付: 2011-082011
 出版の状態: 出版
 ページ: X, 81 p.
 出版情報: Saarbrücken : Universität des Saarlandes
 目次: -
 査読: -
 識別子(DOI, ISBNなど): BibTex参照ID: MasterTandon2011
 学位: 修士号 (Master)

関連イベント

表示:

訴訟

表示:

Project information

表示:

出版物

表示: