Text Based Fuzzy Clustering Algorithm to Filter Spam E-mail

Author(s):

Nakul Dave, VGEC, Chandkheda, Gujarat Technological University, Gujarat, India; Uttam Chauhan, VGEC, Chandkheda, Gujarat Technological University, Gujarat, India; Avani Dave, Kalol Institute of Technology & Research Center, Gujarat Technological University, Gujarat, India

Keywords:

Spam, Vector Space Model, Fuzzy C-Means.

Abstract

Most of the many available classification approaches are designed for structured data, such as relational databases, online transactional data processing, or online analytical data processing. In the age of the internet and its applications, however, huge amounts of data are transmitted from one geographical location to another. Much of this data is unstructured, which can create serious problems for knowledge derivation. Classification can achieve better accuracy by using the Vector Space Model in an adaptive manner. Because the internet is now used widely around the world, spam in e-mail or on the machine has become a major problem for people whose daily lives depend on it, and it causes hardware as well as financial damage to companies and to individual users. Among the various approaches developed to stop spam, filtering is an important and popular one. The aim of this paper is to apply a fuzzy clustering approach to spam identification using the Fuzzy C-Means algorithm. Fuzzy clustering allows each feature vector to belong to more than one cluster with different membership degrees (between 0 and 1), with vague or fuzzy boundaries between clusters. It is applicable to small as well as large datasets.
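
The combination described in the abstract, a Vector Space Model feeding Fuzzy C-Means, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify tokenisation, term weighting, distance measure, or fuzzifier, so the sketch assumes whitespace tokenisation, length-normalised term frequencies, Euclidean distance, and a fuzzifier m = 2. The function names `tf_vectors` and `fuzzy_c_means` and the toy messages are hypothetical.

```python
import math
import random
from collections import Counter


def tf_vectors(docs):
    """Vector Space Model: one length-normalised term-frequency vector per document."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def fuzzy_c_means(points, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-Means (assumed settings: Euclidean distance, fuzzifier m > 1).

    Returns (centers, u), where u[i][j] is the membership of point i in cluster j.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([x / total for x in row])
    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by u[i][j] ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][k] for i in range(n)) / total for k in range(dim)]
            )
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        new_u = []
        for p in points:
            # Clamp distances so a point sitting exactly on a centre does not divide by zero.
            dist = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            new_u.append(
                [
                    1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                    for j in range(c)
                ]
            )
        shift = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return centers, u


# Toy messages, invented for illustration only.
docs = [
    "win free prize now",
    "free prize win money",
    "meeting agenda for monday",
    "monday meeting agenda notes",
]
vocab, vectors = tf_vectors(docs)
centers, memberships = fuzzy_c_means(vectors, c=2)
for text, row in zip(docs, memberships):
    print(text, [round(x, 3) for x in row])
```

Each row of `memberships` sums to 1, so a message can belong partly to both clusters rather than being forced into one. Deciding which cluster corresponds to spam, and at what membership degree a message is filtered, is left open here because the abstract does not state it.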

Other Details

Paper ID: IJSRDV1I3059
Published in: Volume : 1, Issue : 3
Publication Date: 01/06/2013
Page(s): 635-637
