A Survey on Document Categorization Based on Keyword Extraction using Various Algorithms |
Author(s): |
| Prerna Madaan, Kurukshetra Institute of Technology and Management; Gurmeet Singh, Kurukshetra Institute of Technology and Management |
Keywords: |
| Text Mining, Naïve Bayes, Support Vector Machines (SVM), TF-IDF, k-Nearest Neighbour (k-NN), Vector Space Model, Weight Adjusted k-Nearest Neighbour (WAKNN) |
Abstract |
|
Text classification is the process of assigning text documents to a set of predefined categories based on their words, phrases, and word combinations. It has many applications, such as mail routing, email filtering, and news classification, and various institutions and industries are converting their documents into electronic text files. Keyword nets are built using TF-IDF weights: the words with the highest similarity or frequency are taken as keywords. This survey paper describes various algorithms for document categorization. |
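The keyword-selection idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `tfidf_keywords`, the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the sample documents are all assumptions chosen for illustration. It scores each word by term frequency times inverse document frequency and keeps the highest-scoring words as keywords.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document.

    Hypothetical helper for illustration; uses tf = count / doc length
    and idf = log(N / df), with naive lowercase whitespace tokenization.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


# Made-up sample documents, not data from the paper.
docs = [
    "cheap pills offer cheap pills now",
    "meeting agenda for project meeting",
    "project offer for new client",
]
keywords = tfidf_keywords(docs, top_k=2)
print(keywords)
```

Words that occur often in one document but rarely in the others receive the highest scores, which is why they are natural keyword candidates. The extracted keywords could then serve as features for a classifier such as Naïve Bayes, k-NN, or an SVM.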
Other Details |
|
Paper ID: IJSRDV3I30979; Published in: Volume 3, Issue 3; Publication Date: 01/06/2015; Page(s): 2569-2573 |
