Keywords:
-
Abstract:
Although the field of Information Extraction has been around for almost two
decades, it is still considered to be in its initial stage. Many algorithms
are currently used for Information Extraction tasks, and they achieve good
success rates, but there are no benchmarks or standard datasets on which they
can be compared with one another. Most of the algorithms work well on
semi-structured data but seem to fail when dealing with free text.
Other algorithms fail when different types of data must be extracted from
the same document. One way forward is to compare them all and then try to
improve on them, creating algorithms that are domain independent and work
both efficiently and effectively. To this end, we introduce the idea of
formalizing Information Extraction algorithms in order to find out where they
can be improved and which parts still need work to raise overall performance.
Finally, we describe what we gained by formalizing the algorithms and what
can be achieved by formalizing further algorithms.