Efficiently Using Prime-Encoding for Mining Frequent Itemsets in Sparse Data

Authors

  • Karam Gouda, Faculty of Computers and Informatics, Benha University
  • Mosab Hassaan, Faculty of Computers and Informatics, Benha University

Keywords:

Mining frequent itemsets, prime-block encoding, sparse data

Abstract

In data mining, data representation is one of the major factors affecting the scalability of mining algorithms. Mining Frequent Itemsets (MFI) is a data mining problem that is heavily affected by this factor. The vertical approach is one of the successful data representations adopted for the MFI problem. Its main advantage is support for fast frequency counting via join operations. Recently, an encoding method called prime-encoding was proposed as an enhancement of the vertical approach [10]. The performance study in [10] confirmed that prime-encoding-based vertical mining of frequent sequences outperforms other vertical and horizontal approaches in both space and time. Although sequence mining is more general than itemset mining, this paper presents a prime-encoding-based vertical method for mining frequent itemsets, with new optimizations and a new re-encoding method that further reduce memory usage and improve speed. The experimental results show that prime-encoding-based vertical itemset mining is suitable for high-dimensional sparse data.
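
The vertical representation and join-based frequency counting mentioned in the abstract can be illustrated with a small sketch. It is not the paper's algorithm. The toy database, the function names, and the one-prime-per-transaction encoding are all assumptions made for illustration; the abstract does not describe the actual prime-encoding scheme or its re-encoding step. The sketch only shows the general idea that a set of transaction ids can be stored as a product of primes, so that a join becomes a greatest-common-divisor computation.

```python
from math import gcd

# Toy horizontal database (hypothetical data): transaction id -> items.
transactions = {
    0: {"a", "b", "c"},
    1: {"a", "c"},
    2: {"b", "c"},
    3: {"a", "b", "c"},
}

def to_vertical(db):
    """Build the vertical layout: item -> set of transaction ids containing it."""
    vertical = {}
    for tid, items in db.items():
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical

def support(itemset, vertical):
    """Support of an itemset = size of the join (intersection) of its tid-sets."""
    tidsets = [vertical[item] for item in itemset]
    return len(set.intersection(*tidsets))

# Assumption for illustration: one distinct prime per transaction id.
PRIMES = [2, 3, 5, 7]

def prime_encode(tids):
    """Encode a tid-set as the product of the primes assigned to its tids."""
    code = 1
    for tid in tids:
        code *= PRIMES[tid]
    return code

def prime_support(code):
    """Count how many assigned primes divide the code (= number of tids)."""
    return sum(1 for p in PRIMES if code % p == 0)

vertical = to_vertical(transactions)
code_a = prime_encode(vertical["a"])   # tids {0, 1, 3}
code_b = prime_encode(vertical["b"])   # tids {0, 2, 3}
code_ab = gcd(code_a, code_b)          # join of the two encoded tid-sets

print(support(["a", "b"], vertical), prime_support(code_ab))
```

In this toy setting the set-based join and the gcd-based join give the same support for {a, b}. A practical encoding would have to keep the products bounded, for example by encoding fixed-size blocks of transactions separately, which this sketch does not attempt.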

Published

2014-01-20

How to Cite

Gouda, K., & Hassaan, M. (2014). Efficiently Using Prime-Encoding for Mining Frequent Itemsets in Sparse Data. Computing and Informatics, 32(5), 1079–1099. Retrieved from http://147.213.75.17/ojs/index.php/cai/article/view/1985