Overcoming the Defects of K-Means Clustering by using Canopy Clustering Algorithm |
Author(s): |
| Ambika S, Sapthagiri College of Engineering; Kavitha G, Sapthagiri College of Engineering |
Keywords: |
| High Dimensional Dataset, Data Mining, Synthetic Sampling, Parameter Estimator, K-Means Clustering Algorithm, Canopy Clustering Algorithm |
Abstract |
|
High-dimensional data clustering is the study of data that contains hundreds of dimensions. This work aims to improve the processing time of the K-means clustering algorithm on high-dimensional datasets by making use of the canopy clustering algorithm. The canopy clustering algorithm uses a synthetic sampling method as a preprocessing step, and it uses the generated T1 and T2 parameter values to create canopies and to provide initial cluster centers. The existing clustering algorithm normally works with small datasets but does not work well with high-dimensional datasets, because it may yield inaccurate clusters when the cluster centers are selected at random; another problem is that the number of required clusters, or k value, must be predefined by the user. The proposed algorithm works well with high-dimensional datasets, overcomes these limitations of the K-means clustering algorithm, and reduces the execution time of the existing algorithm. |
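The idea in the abstract, canopies built from the T1 and T2 thresholds supplying both the number of clusters and the initial centers for K-means, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `canopy_centers` and `kmeans`, the toy points, and the hand-picked values of T1 and T2 are all assumptions made for illustration, and the synthetic sampling step that the paper uses to estimate T1 and T2 is not reproduced here.

```python
import math


def canopy_centers(points, t1, t2):
    """Group points into canopies using a loose threshold t1 and a tight threshold t2.

    Returns a list of (center, members) pairs. Hypothetical helper for illustration.
    """
    if t1 <= t2:
        raise ValueError("t1 (loose) must be greater than t2 (tight)")
    remaining = list(points)
    canopies = []
    while remaining:
        # Assumption: take the first remaining point as the canopy center
        # (a random pick is also common; this keeps the sketch deterministic).
        center = remaining.pop(0)
        members = [center]
        still_remaining = []
        for p in remaining:
            d = math.dist(center, p)
            if d < t1:
                members.append(p)          # inside the loose threshold: joins this canopy
            if d >= t2:
                still_remaining.append(p)  # outside the tight threshold: may seed a later canopy
        remaining = still_remaining
        canopies.append((center, members))
    return canopies


def kmeans(points, centers, iterations=10):
    """Plain Lloyd-style K-means started from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        new_centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for c, g in zip(centers, groups)
        ]
        if new_centers == centers:  # assignments are stable, so stop early
            break
        centers = new_centers
    return centers


# Toy data (assumed): two well-separated groups in two dimensions.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
canopies = canopy_centers(points, t1=3.0, t2=1.5)  # thresholds chosen by hand
initial_centers = [center for center, _ in canopies]
final_centers = kmeans(points, initial_centers)
print(len(canopies), final_centers)
```

In this sketch the number of canopies plays the role of k, so the user does not supply it, and K-means starts from the canopy centers instead of randomly selected ones. The result still depends on the choice of T1 and T2.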
Other Details |
|
Paper ID: IJSRDV4I50277; Published in: Volume 4, Issue 5; Publication Date: 01/08/2016; Page(s): 613-615
