Real World Document Clustering Using Modified Balanced Iterative Reducing and Clustering using Hierarchies |
Author(s): |
| Sunita N. Chaudhari, Truba College of Engineering & Technology; Prof. Praveen Kumar Gautam, Truba College of Engineering & Technology |
Keywords: |
| Frequent Pattern Mining, High Utility Itemset Mining, Transaction Database |
Abstract |
|
Clustering is “the process of organizing objects into groups whose members are similar in some way”. A cluster is therefore a collection of objects that are internally coherent but clearly dissimilar to the objects belonging to other clusters. Document clustering is used in many fields, such as data mining and information retrieval. This paper compares the clustering results of the K-means approach, the agglomerative approach, and the partitional approach for each of the criterion functions using real-world documents, in order to establish the right clustering algorithm for producing high-quality clustering of real-world documents. The goal of a document clustering method is to reduce intra-cluster distances between documents while maximizing inter-cluster distances (using an appropriate distance measure between documents). A distance measure (or, dually, a similarity measure) thus lies at the heart of document clustering. The large variety of documents makes it almost infeasible to create a general algorithm that works best for all kinds of datasets. |
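As an illustration of the intra-cluster versus inter-cluster distance idea in the abstract, the following is a minimal sketch, not the authors' method. It assumes cosine distance over raw term-frequency vectors as the distance measure; the function names, the toy documents, and the cluster labels are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_distance(a: str, b: str) -> float:
    """Return 1 - cosine similarity of whitespace-tokenized term-frequency vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    # An empty document has no direction; treat it as maximally distant.
    return 1.0 - dot / norm if norm else 1.0


def mean_intra_inter(docs, labels):
    """Average pairwise distance within clusters and across clusters."""
    intra, inter = [], []
    for i, j in combinations(range(len(docs)), 2):
        d = cosine_distance(docs[i], docs[j])
        (intra if labels[i] == labels[j] else inter).append(d)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(intra), mean(inter)


# Hypothetical toy corpus with an assumed cluster assignment.
docs = [
    "stock market prices fall",
    "market prices rise on stock news",
    "team wins football match",
    "football team loses match",
]
labels = [0, 0, 1, 1]

intra, inter = mean_intra_inter(docs, labels)
print(f"mean intra-cluster distance: {intra:.3f}")
print(f"mean inter-cluster distance: {inter:.3f}")
```

Under this assumed measure, a good clustering shows a mean intra-cluster distance well below the mean inter-cluster distance; other measures (for example Euclidean distance on TF-IDF vectors) could be substituted without changing the structure of the comparison.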
Other Details |
|
Paper ID: IJSRDV3I120343 Published in: Volume : 3, Issue : 12 Publication Date: 01/03/2016 Page(s): 380-383 |
