Performance Evaluation of Recurrent Set Mining Algorithms
Author(s):
Ms. R. D. Priyanka, Info Institute of Engineering; Dr. R. Sabitha, Info Institute of Engineering; Ms. T. Mythili, Info Institute of Engineering
Keywords:
Data Mining, Association Rule Mining, Frequent Itemsets, Diffset, Apriori, dEclat
Abstract
Data Mining (DM) is the process of extracting useful, non-trivial information from large amounts of data. One important problem in data mining is discovering association rules in databases of transactions, where each transaction consists of a set of items. Frequent itemsets play an important role in many data mining tasks, and discovering all of them is a fundamental problem in the field. In this paper we examine the problem of finding frequent itemsets using the Apriori and dEclat algorithms on the Mushroom dataset. Both implementations use several optimizations to achieve maximum performance, and we compare them with respect to execution time and frequent pattern generation. The Mushroom dataset contains characteristics of various species of mushrooms and was originally obtained from the UCI Repository of Machine Learning Databases. Apriori and dEclat are among the best-known algorithms for mining frequent itemsets in a set of transactions. Apriori is an influential algorithm for mining frequent itemsets for Boolean association rules. Its name reflects the fact that it uses prior knowledge of frequent itemset properties: the Apriori principle states that for an itemset to be frequent, all of its subsets must be frequent. The Apriori implementation is based on a prefix tree representation of the needed counters and uses a doubly recursive scheme to count the transactions. The basic idea of dEclat is to compute diffsets for all distinct pairs of itemsets and check the support of the resulting itemsets. The dEclat implementation uses bit matrices to represent transaction lists and to filter closed and maximal itemsets. Our results show that dEclat outperforms Apriori on the Mushroom dataset in terms of both frequent itemset generation and execution time.
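To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the implementations evaluated in the paper. It assumes plain Python sets rather than the prefix tree and bit matrices mentioned above, and it uses a small hypothetical transaction list instead of the Mushroom dataset. The function names `apriori` and `declat`, the `TRANSACTIONS` data, and the absolute `min_support` threshold are all illustrative assumptions.

```python
from itertools import combinations

# Hypothetical toy transactions (not the Mushroom dataset).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]


def apriori(transactions, min_support):
    """Level-wise search. A (k+1)-itemset becomes a candidate only if
    every one of its k-subsets is frequent (the Apriori principle)."""
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    level = [frozenset([i]) for i in items]
    while level:
        # Count the support of each candidate by scanning the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        current = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(current)
        # Join frequent k-itemsets, then prune by the Apriori principle.
        candidates = set()
        for x, y in combinations(list(current), 2):
            union = x | y
            if len(union) == len(x) + 1 and all(
                frozenset(s) in current for s in combinations(union, len(x))
            ):
                candidates.add(union)
        level = list(candidates)
    return frequent


def declat(transactions, min_support):
    """Depth-first search over diffsets.
    For siblings PX and PY: d(PXY) = d(PY) - d(PX) and
    support(PXY) = support(PX) - |d(PXY)|."""
    items = sorted({i for t in transactions for i in t})
    frequent = {}

    def extend(members):
        # members: (itemset, diffset, support) triples sharing one prefix.
        for idx, (set_x, diff_x, sup_x) in enumerate(members):
            frequent[set_x] = sup_x
            children = []
            for set_y, diff_y, _ in members[idx + 1:]:
                diff_xy = diff_y - diff_x
                sup_xy = sup_x - len(diff_xy)
                if sup_xy >= min_support:
                    children.append((set_x | set_y, diff_xy, sup_xy))
            extend(children)

    # Level 1: the diffset of an item is the set of transaction ids
    # that do NOT contain it (difference from the full tid set).
    roots = []
    for i in items:
        diff = {tid for tid, t in enumerate(transactions) if i not in t}
        sup = len(transactions) - len(diff)
        if sup >= min_support:
            roots.append((frozenset([i]), diff, sup))
    extend(roots)
    return frequent


if __name__ == "__main__":
    result = declat(TRANSACTIONS, 3)
    for itemset in sorted(result, key=lambda s: (len(s), sorted(s))):
        print(sorted(itemset), result[itemset])
```

Both functions return a dictionary mapping each frequent itemset to its support count, so on the same input and threshold they should agree. The sketch says nothing about relative running time; that comparison depends on the optimized data structures the paper describes.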
Other Details
Paper ID: IJSRDV4I20501
Published in: Volume 4, Issue 2
Publication Date: 01/05/2016
Page(s): 1914-1918
