




Hybrid ASP-based Approach to Pattern Mining


Stepanova, Daria
Databases and Information Systems, MPI for Informatics, Max Planck Society;


Miettinen, Pauli
Databases and Information Systems, MPI for Informatics, Max Planck Society;


Paramonov, S., Stepanova, D., & Miettinen, P. (2018). Hybrid ASP-based Approach to Pattern Mining. Retrieved from http://arxiv.org/abs/1808.07302.

Cite as: http://hdl.handle.net/21.11116/0000-0002-5E60-9
Detecting small sets of relevant patterns in a given dataset is a central challenge in data mining. The relevance of a pattern is based on user-provided criteria; typically, all patterns that satisfy certain criteria are considered relevant. Rule-based languages like Answer Set Programming (ASP) seem well suited for specifying such criteria in the form of constraints. Although progress has been made both on solving individual mining problems and on developing generic mining systems, existing methods focus either on scalability or on generality. In this paper we take steps towards combining local constraints (frequency, size, cost) and global constraints (condensed representations such as maximal, closed, and skyline patterns) in a generic and efficient way. We present a hybrid approach for itemset, sequence, and graph mining that exploits dedicated, highly optimized mining systems to detect frequent patterns and then filters the results using declarative ASP. To further demonstrate the generic nature of our hybrid framework, we apply it to the problem of approximately tiling a database. Experiments on real-world datasets show the effectiveness of the proposed method and its computational gains for itemset, sequence, and graph mining, as well as for approximate tiling. Under consideration in Theory and Practice of Logic Programming (TPLP).
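
The two-stage idea in the abstract (first collect frequent patterns, then filter them with global constraints) can be illustrated for itemsets with a minimal Python sketch. This is not the authors' implementation: the brute-force `frequent_itemsets` function merely stands in for a dedicated mining system, and the `closed` and `maximal` filters stand in for the declarative ASP step. All function names and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Stage 1 (stand-in for a dedicated miner): return a dict mapping
    each itemset with support >= min_support to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
                found = True
        if not found:
            # Anti-monotonicity of frequency: if no itemset of this size
            # is frequent, no larger itemset can be frequent either.
            break
    return result


def closed(patterns):
    """Stage 2 filter: keep itemsets that have no proper superset
    with the same support."""
    return {p: s for p, s in patterns.items()
            if not any(p < q and s == patterns[q] for q in patterns)}


def maximal(patterns):
    """Stage 2 filter: keep itemsets that have no frequent proper superset."""
    return {p: s for p, s in patterns.items()
            if not any(p < q for q in patterns)}


# Hypothetical toy database of four transactions.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
frequent = frequent_itemsets(transactions, min_support=2)
closed_patterns = closed(frequent)
maximal_patterns = maximal(frequent)
```

On this toy database, five itemsets are frequent at a minimum support of 2; the closed filter keeps {a}, {a, b}, and {a, c}, and the maximal filter keeps {a, b} and {a, c}. The filters only compare the already-mined patterns with each other, which is the reason the second stage can be expressed separately from the mining step.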