  Goal-oriented Methods and Meta Methods for Document Classification and their Parameter Tuning

Sizov, S., Siersdorfer, S., & Weikum, G. (2004). Goal-oriented Methods and Meta Methods for Document Classification and their Parameter Tuning. In CIKM 2004: proceedings of the Thirteenth Conference on Information and Knowledge Management (pp. 59-68). New York, USA: ACM.

Files

cikm04-275-sizov.pdf (Any fulltext), 202KB
 
File Permalink: -
Name: cikm04-275-sizov.pdf
Description: -
OA-Status:
Visibility: Private
MIME-Type / Checksum: application/pdf
Technical Metadata:
Copyright Date: -
Copyright Info: -
License: -

Creators

Creators:
Sizov, Sergej [1], Author
Siersdorfer, Stefan [1], Author
Weikum, Gerhard [1], Author
Evans, David A., Editor
Gravano, Luis, Editor
Herzog, Otthein, Editor
Zhai, ChengXiang, Editor
Ronthaler, Marc, Editor
Affiliations:
[1] Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Free keywords: -
 Abstract: Automatic text classification methods come with various calibration parameters such as thresholds for probabilities in Bayesian classifiers or for hyperplane distances in SVM classifiers. In a given application context these parameters should be set so as to meet the relative importance of various result quality metrics such as precision versus recall. In this paper we consider classifiers that can accept a document for a topic, reject it, or abstain. We aim to meet the application's goals in terms of accuracy (i.e., avoid false acceptances or rejections) and loss (i.e., limit the fraction of documents for which no decision is made). To this end we investigate restrictive forms of Support Vector Machine classifiers and we develop meta methods that split the training data into subsets for independently trained classifiers and then combine the results of these classifiers. These techniques tend to improve accuracy at the expense of document loss. We develop estimators that help to predict the accuracy and loss for a given setting of the methods' tuning parameters, and a methodology for efficiently deriving a setting that meets the application's goals. Our experiments confirm the practical viability of the approach.
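The accept/reject/abstain behaviour and the two quality measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the symmetric threshold on a signed score (standing in for an SVM hyperplane distance), and the unanimous-agreement rule for combining classifiers are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Decision = Optional[bool]  # True = accept, False = reject, None = abstain


def restrictive_decision(score: float, threshold: float) -> Decision:
    """Decide only when the signed score lies at least `threshold` from zero.

    Assumption: `score` plays the role of a signed hyperplane distance.
    """
    if score >= threshold:
        return True
    if score <= -threshold:
        return False
    return None


def unanimous_vote(decisions: Sequence[Decision]) -> Decision:
    """Hypothetical meta rule: decide only if every classifier decides the same way."""
    if decisions and all(d is True for d in decisions):
        return True
    if decisions and all(d is False for d in decisions):
        return False
    return None


def accuracy_and_loss(decisions: Sequence[Decision],
                      labels: Sequence[bool]) -> Tuple[float, float]:
    """Accuracy over decided documents, and loss as the fraction left undecided."""
    decided = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    loss = (len(labels) - len(decided)) / len(labels)
    accuracy = sum(d == y for d, y in decided) / len(decided) if decided else 0.0
    return accuracy, loss


# Toy data (invented for illustration): signed scores and true topic membership.
scores: List[float] = [1.5, 0.2, -0.9, -0.1, 0.8]
labels: List[bool] = [True, True, False, False, False]

for t in (0.5, 1.0):
    decisions = [restrictive_decision(s, t) for s in scores]
    acc, loss = accuracy_and_loss(decisions, labels)
    print(f"threshold={t}: accuracy={acc:.2f}, loss={loss:.2f}")
```

On this toy data, raising the threshold from 0.5 to 1.0 removes the one wrong decision but leaves more documents undecided, which mirrors the accuracy-versus-loss trade-off the abstract describes.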

Details

Language(s): eng - English
 Dates: 2005-05-31, 2004
 Publication Status: Issued
 Pages: -
 Publishing info: New York, USA : ACM
 Table of Contents: -
 Rev. Type: -
 Identifiers: eDoc: 231864
Other: Local-ID: C1256DBF005F876D-EE103EB2A21109DCC1256F9500333F20-SizovClass2004
 Degree: -

Event

Title: Untitled Event
Place of Event: Washington D.C., USA
Start-/End Date: 2004-11-08

Source 1

Title: CIKM 2004 : proceedings of the Thirteenth Conference on Information and Knowledge Management
Source Genre: Proceedings
 Creator(s):
Affiliations:
Publ. Info: New York, USA : ACM
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: 59 - 68
Identifier: -