
Handling Outliers Efficiently Using Partition Based Clustering Techniques

Author(s):

Tarnnum Khan, Jawaharlal Institute of Technology Borawan Khargone (M.P.) India 451228; Mr. Ranjan Thakur, Jawaharlal Institute of Technology Borawan Khargone (M.P.) India 451228

Keywords:

K-Mean, PAM, Clustering Techniques

Abstract

Outliers are data points that can be considered anomalous for several reasons (e.g. erroneous measurements or anomalous process conditions). Outlier detection techniques are used, for instance, to minimize the influence of outliers on the final model, or as a pre-processing stage before the information conveyed by a signal is processed. Traditional outlier detection methods can be classified into four main approaches: distance-based, density-based, clustering-based and distribution-based. The proposed approach is based on two partition-based clustering methods, K-Mean and PAM. The K-Mean method is based on the mean value of the objects belonging to a cluster. K-Mean computes the distance of each object from the cluster means; an object stays in its cluster if that cluster's mean is the nearest, and is otherwise moved to the cluster whose mean is nearest. When outliers are present in the data set, K-Mean still assigns each of them to one of the clusters, and they distort the mean of that cluster, even though they do not really belong to any cluster. In PAM, by contrast, an actual object (the medoid) is chosen as the cluster center, so when the distances of the other objects from this center object are calculated, the outliers are easily identified.
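
The contrast between a mean center and a medoid center described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation and not a full K-Mean or PAM run: it only computes the two kinds of center for a single cluster. The sample points, the function names and the distance threshold of 5.0 are all assumptions chosen for illustration.

```python
import math

def mean_center(points):
    """Coordinate-wise mean of a cluster (the kind of center K-Mean uses)."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))

def medoid(points):
    """Cluster member with the smallest total distance to all members
    (the kind of center PAM uses)."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

def flag_outliers(points, center, threshold):
    """Hypothetical rule: flag points farther than `threshold` from the center."""
    return [p for p in points if math.dist(p, center) > threshold]

# Assumed toy data: four nearby points plus one distant point.
cluster = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (20.0, 20.0)]

mean_c = mean_center(cluster)    # pulled toward the distant point
medoid_c = medoid(cluster)       # remains one of the nearby points
outliers = flag_outliers(cluster, medoid_c, threshold=5.0)

print("mean center:  ", mean_c)
print("medoid center:", medoid_c)
print("flagged:      ", outliers)
```

On this toy data the mean is dragged away from the four nearby points, while the medoid stays among them, so measuring distances from the medoid isolates only the distant point. The threshold is arbitrary; a real study would need a principled way to choose it.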

Other Details

Paper ID: IJSRDV8I100196
Published in: Volume : 8, Issue : 10
Publication Date: 01/01/2021
Page(s): 484-489
