Classification of Text Document based on Headword Extraction Algorithm using Text Mining
Author(s):
Kavya Jain, Shri Sankracharya Technical Campus; Dr. Siddhart Choubey, Shri Sankracharya Technical Campus
Keywords:
Text Mining, Noun Entity Recognizer, 3-Class Classifier, 4-Class Classifier, 7-Class Classifier
Abstract
Text mining is the practice of extracting useful information from large data sets. Data mining relies on frequent-pattern and association-rule techniques to find patterns that recur in the data. Mining the semantic information from a free-text document provides the enabling technology for identifying the class to which that document belongs. The NER (Noun Entity Recognizer) is used to identify noun keywords with each of three classifiers: a 3-class classifier, a 4-class classifier, and a 7-class classifier. Using parsing together with NER, the text document is assigned to one of a set of predefined classes: MP-I (the list of nouns) and MP-II (the list of verbs) are merged, and the merged list is matched against the stored headwords to conclude which class the document belongs to. Using a parser to convert the text document into a parse tree increased the accuracy of the work.
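The merge-and-match step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the document has already been tokenized and tagged with Penn Treebank part-of-speech tags, and the headword sets, the class labels, the function names `extract_lists` and `classify`, and the overlap-count scoring rule are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical stored headwords per class (illustrative only, not from the paper).
HEADWORDS = {
    "sports": {"match", "team", "score", "play", "win"},
    "finance": {"bank", "loan", "market", "invest", "pay"},
}


def extract_lists(tagged_tokens):
    """Split (word, tag) pairs into MP-I (nouns) and MP-II (verbs).

    Assumes Penn Treebank tags: NN* marks nouns, VB* marks verbs.
    """
    mp1 = [word.lower() for word, tag in tagged_tokens if tag.startswith("NN")]
    mp2 = [word.lower() for word, tag in tagged_tokens if tag.startswith("VB")]
    return mp1, mp2


def classify(tagged_tokens, headwords=HEADWORDS):
    """Merge MP-I and MP-II, then pick the class whose headwords match most often.

    The scoring rule (count of matching tokens) is an assumption of this sketch.
    Returns (label, scores); label is None when no headword matches at all.
    """
    mp1, mp2 = extract_lists(tagged_tokens)
    merged = Counter(mp1 + mp2)
    scores = {
        label: sum(merged[word] for word in words)
        for label, words in headwords.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores


# Example: a pre-tagged sentence (the tags are supplied by hand here).
tagged = [("The", "DT"), ("team", "NN"), ("won", "VBD"),
          ("the", "DT"), ("match", "NN")]
label, scores = classify(tagged)
print(label, scores)  # sports {'sports': 2, 'finance': 0}
```

In this sketch "won" does not match the headword "win" because no lemmatization is applied; a fuller pipeline would likely normalize word forms before matching.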
Other Details
Paper ID: IJSRDV4I40835
Published in: Volume 4, Issue 4
Publication Date: 01/07/2016
Page(s): 1013-1017