Classification of Text Document based on Headword Extraction Algorithm using Text Mining

Author(s):

Kavya Jain, Shri Sankracharya Technical Campus; Dr. Siddhart Choubey, Shri Sankracharya Technical Campus

Keywords:

Text Mining, Noun Entity Recognizer, 3-Class Classifier, 4-Class Classifier, 7-Class Classifier

Abstract

Text mining is the practice of extracting useful information from large data sets. Data mining relies on frequent-pattern and association-rule techniques, which are essential for finding frequent patterns. Mining semantic information from a free-text document provides the enabling technology for identifying the class to which the document belongs. The NER (Noun Entity Recognizer) is used to identify noun keywords with three classifiers: the 3-class, 4-class, and 7-class classifiers. Using parsing and NER, the text document is assigned to one of the predefined classes: MP-I (the list of nouns) and MP-II (the list of verbs) are merged, and the merged list is matched against the stored headwords to determine which class the document belongs to. Using a parser to convert the text document into a parse tree increased the accuracy of the work.
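
The merge-and-match step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the headword store, the class names, the function names `merge_lists` and `classify`, and the overlap-count scoring rule are all assumptions made for illustration, and the noun and verb lists are taken as already extracted by the parser and NER stage.

```python
from collections import Counter

# Hypothetical headword store (assumption): class name -> stored headwords.
HEADWORDS = {
    "sports": {"match", "team", "score", "play", "win"},
    "finance": {"bank", "loan", "invest", "market", "trade"},
}


def merge_lists(nouns, verbs):
    """Merge MP-I (nouns) and MP-II (verbs) into one lowercase list.

    Order is preserved and duplicates are dropped.
    """
    merged = []
    seen = set()
    for word in list(nouns) + list(verbs):
        key = word.lower()
        if key not in seen:
            seen.add(key)
            merged.append(key)
    return merged


def classify(nouns, verbs, headwords=HEADWORDS):
    """Return the class whose headwords overlap most with the merged list.

    Returns None when no headword matches. Ties go to the class listed
    first in the headword store (an arbitrary choice for this sketch).
    """
    candidates = merge_lists(nouns, verbs)
    scores = Counter()
    for label, words in headwords.items():
        scores[label] = sum(1 for w in candidates if w in words)
    best_label, best_score = max(scores.items(), key=lambda kv: kv[1])
    return best_label if best_score > 0 else None


if __name__ == "__main__":
    # Assumed output of the noun and verb extraction stage.
    mp1 = ["Team", "score", "bank"]   # MP-I: list of nouns
    mp2 = ["play", "win"]             # MP-II: list of verbs
    print(classify(mp1, mp2))
```

Counting overlaps is only one plausible matching rule; the abstract does not say how matches are weighted, so a weighted or thresholded score would be an equally reasonable reading.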

Other Details

Paper ID: IJSRDV4I40835
Published in: Volume : 4, Issue : 4
Publication Date: 01/07/2016
Page(s): 1013-1017
