Comparison of Fuzzy C-Means and Hierarchical Agglomerative Clustering Algorithms for Data Mining

Author(s):

Jyoti Patel, Chhatrapati Shivaji Institute of Technology Durg, Chhattisgarh; Om Prakash Yadav, Chhatrapati Shivaji Institute of Technology Durg, Chhattisgarh

Keywords:

Text Mining, Natural Language Processing, Fuzzy C-Means, Term Frequency, Inverse Document Frequency, TF-IDF

Abstract

Most information is now available in electronic form, and web pages hold a vast amount of it in unstructured and semi-structured formats such as newspapers, stories, email messages, books, and blogs. This content can be transformed and mined to extract usable information as required. This paper focuses on extracting and mining useful information from a text corpus, using documents from the Reuters data set. The main problem in text mining is that text is written according to grammatical rules so that humans can read it; before it can be analyzed, it must first be preprocessed. The algorithms used to find the similarity between the Reuters documents and to group them into clusters are Fuzzy C-Means and hierarchical agglomerative clustering.
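The pipeline outlined in the abstract (preprocess the text, weight terms by TF-IDF, then measure similarity and assign documents to clusters) can be illustrated with a minimal sketch. This is not the authors' implementation: the three toy documents are hypothetical stand-ins for Reuters articles, the weighting is one common TF-IDF variant (relative term frequency times log(N/df)), and the function names, the Euclidean distance, and the fuzzifier m = 2 are all assumptions made for illustration. Only the standard Fuzzy C-Means membership update is shown, with fixed centers, rather than the full iterative algorithm.

```python
import math
from collections import Counter


def tokenize(text):
    # Minimal preprocessing: lowercase and keep alphabetic tokens only.
    # A real pipeline would typically also remove stop words and stem.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    # Assumed variant: tf = count / document length, idf = log(N / df).
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    doc_freq = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = len(doc) or 1
        vectors.append(
            [(counts[t] / length) * math.log(n_docs / doc_freq[t]) for t in vocab]
        )
    return vocab, vectors


def cosine(u, v):
    # Cosine similarity; defined as 0.0 when either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def fuzzy_memberships(vectors, centers, m=2.0):
    # Fuzzy C-Means membership update for fixed centers:
    #   u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1))
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in vectors:
        dists = [math.dist(x, c) for c in centers]
        zeros = [i for i, d in enumerate(dists) if d == 0.0]
        if zeros:
            # A point sitting exactly on a center belongs fully to it.
            row = [1.0 / len(zeros) if i in zeros else 0.0
                   for i in range(len(centers))]
        else:
            row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                   for d_i in dists]
        memberships.append(row)
    return memberships


# Hypothetical toy corpus, not taken from the Reuters data set.
docs = [
    "oil prices rise as crude supply falls",
    "crude oil supply cut lifts prices",
    "bank raises interest rates on loans",
]
vocab, vectors = tfidf_vectors(docs)
# Use the first and third documents as the two cluster centers.
memberships = fuzzy_memberships(vectors, [vectors[0], vectors[2]])
```

Each row of `memberships` sums to 1, so a document can belong partly to several clusters. Hierarchical agglomerative clustering instead produces hard assignments by repeatedly merging the two closest clusters, for example using pairwise distances derived from `cosine`.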

Other Details

Paper ID: IJSRDV4I31359
Published in: Volume: 4, Issue: 3
Publication Date: 01/06/2016
Page(s): 1514-1516
