A Survey on Document Categorization Based on Keyword Extraction using Various Algorithms

Author(s):

Prerna Madaan, Kurukshetra Institute of Technology and Management; Gurmeet Singh, Kurukshetra Institute of Technology and Management

Keywords:

Text Mining, Naïve Bayes, Support Vector Machines (SVM), TF-IDF, k-Nearest Neighbour (k-NN), Vector Space Model, Weight Adjusted k-Nearest Neighbour (WAKNN)

Abstract

Text classification is the process of assigning text documents to a set of predefined categories based on the words, phrases and word combinations they contain. It has many applications, such as mail routing, email filtering and news classification, and various institutions and industries are converting their documents into electronic text files. Keyword nets are built using TF-IDF weights; the words with the highest similarity or frequency are taken as keywords. In this survey paper we describe various algorithms for document categorization.
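
As a rough illustration of the TF-IDF keyword selection mentioned in the abstract, the sketch below scores each word by its term frequency multiplied by log(N / document frequency) and keeps the highest-scoring words per document. This is one common TF-IDF variant, not necessarily the weighting used in the surveyed work; the function names `tokenize` and `tfidf_keywords`, the `top_k` parameter and the three sample documents are assumptions made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(documents, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    keywords = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending score, then alphabetically so ties are deterministic.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        keywords.append([term for term, _ in ranked[:top_k]])
    return keywords


# Hypothetical sample documents, invented for illustration only.
docs = [
    "the match ended with a late goal and the goal decided the league",
    "the court ruled on the appeal and the court upheld the verdict",
    "the bank raised rates and the bank warned on inflation",
]
print(tfidf_keywords(docs, top_k=1))
```

Words that occur in every document, such as "the", receive a weight of zero under this variant, so they never outrank words that are concentrated in a single document. The extracted keywords could then serve as features for any of the classifiers listed in the keywords above.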

Other Details

Paper ID: IJSRDV3I30979
Published in: Volume: 3, Issue: 3
Publication Date: 01/06/2015
Page(s): 2569-2573