日本語
 
Help Privacy Policy ポリシー/免責事項
  詳細検索ブラウズ

アイテム詳細

  Matrix Factorization over Diods and its Applications in Data Mining

Karaev, S. (2019). Matrix Factorization over Diods and its Applications in Data Mining. PhD Thesis, Universität des Saarlandes, Saarbrücken.

Item is

基本情報

表示: 非表示:
アイテムのパーマリンク: https://hdl.handle.net/21.11116/0000-0005-4369-A 版のパーマリンク: https://hdl.handle.net/21.11116/0000-000C-6F59-5
資料種別: 学位論文

ファイル

表示: ファイル

関連URL

表示:
非表示:
説明:
-
OA-Status:
Green

作成者

表示:
非表示:
 作成者:
Karaev, Sanjar1, 2, 著者           
Miettinen, Pauli1, 学位論文主査           
Weikum, Gerhard1, 監修者           
van Leeuwen, Matthijs3, 監修者
所属:
1Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018              
2International Max Planck Research School, MPI for Informatics, Max Planck Society, Campus E1 4, 66123 Saarbrücken, DE, ou_1116551              
3External Organizations, ou_persistent22              

内容説明

表示:
非表示:
キーワード: -
 要旨: Matrix factorizations are an important tool in data mining, and they have been used extensively for finding latent patterns in the data. They often allow to separate structure from noise, as well as to considerably reduce the dimensionality of the input matrix. While classical matrix decomposition methods, such as nonnegative matrix factorization (NMF) and singular value decomposition (SVD), proved to be very useful in data analysis, they are limited by the underlying algebraic structure. NMF, in particular, tends to break patterns into smaller bits, often mixing them with each other. This happens because overlapping patterns interfere with each other, making it harder to tell them apart. In this thesis we study matrix factorization over algebraic structures known as dioids, which are characterized by the lack of additive inverse (“negative numbers”) and the idempotency of addition (a + a = a). Using dioids makes it easier to separate overlapping features, and, in particular, it allows to better deal with the above mentioned pattern breaking problem. We consider different types of dioids, that range from continuous (subtropical and tropical algebras) to discrete (Boolean algebra). Among these, the Boolean algebra is perhaps the most well known, and there exist methods that allow one to obtain high quality Boolean matrix factorizations in terms of the reconstruction error. In this work, however, a different objective function is used – the description length of the data, which enables us to obtain compact and highly interpretable results. The tropical and subtropical algebras, on the other hand, are much less known in the data mining field. While they find applications in areas such as job scheduling and discrete event systems, they are virtually unknown in the context of data analysis. We will use them to obtain idempotent nonnegative factorizations that are similar to NMF, but are better at separating the most prominent features of the data.

資料詳細

表示:
非表示:
言語: eng - English
 日付: 2019-07-1020192019
 出版の状態: 出版
 ページ: 113 p.
 出版情報: Saarbrücken : Universität des Saarlandes
 目次: -
 査読: -
 識別子(DOI, ISBNなど): BibTex参照ID: Karaevphd2019
DOI: 10.22028/D291-28661
URN: urn:nbn:de:bsz:291--ds-286619
その他: hdl:20.500.11880/27903
 学位: 博士号 (PhD)

関連イベント

表示:

訴訟

表示:

Project information

表示:

出版物

表示: