Handling Outliers Efficiently Using Partition Based Clustering Techniques
Author(s):
Tarnnum Khan, Jawaharlal Institute of Technology Borawan Khargone (M.P.) India 451228; Mr. Ranjan Thakur, Jawaharlal Institute of Technology Borawan Khargone (M.P.) India 451228
Keywords:
K-Mean, PAM, Clustering Techniques
Abstract
Outliers are data points that can be considered anomalous for several reasons (e.g. erroneous measurements or anomalous process conditions). Outlier detection techniques are used, for instance, to minimize the influence of outliers on the final model, or as a pre-processing stage before the information conveyed by a signal is analyzed. Traditional outlier detection methods can be classified into four main approaches: distance-based, density-based, clustering-based and distribution-based. The proposed approach is based on two partition-based clustering methods, K-Means and PAM. K-Means represents each cluster by the mean of the objects that belong to it. It computes the distance from each object to every cluster mean; an object stays in its cluster if that cluster's mean is the nearest one, and otherwise it is moved to the cluster whose mean is nearest. When outliers are present in the data set, K-Means still assigns each of them to one of the clusters, even though they do not really belong to any cluster, and they distort the mean of the cluster that receives them. PAM instead chooses an actual object, the medoid, as the center of each cluster, so when the distances from the other objects to this center object are calculated, the outliers are easily identified.
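The contrast between a mean center and a medoid center can be illustrated with a minimal sketch. This is not the paper's implementation: the sample points, the helper names (`mean_center`, `medoid_center`, `flag_outliers`) and the threshold rule (three times the median distance to the center) are all assumptions made for illustration, and only a single cluster is considered. It requires Python 3.8 or later for `math.dist`.

```python
import math

def mean_center(points):
    """Center in the K-Means sense: the coordinate-wise mean of the points."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))

def medoid_center(points):
    """Center in the PAM sense: the member point with the smallest
    total distance to all other points."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

def flag_outliers(points, center, factor=3.0):
    """Hypothetical rule (not from the paper): flag points whose distance
    to the center exceeds `factor` times the median distance."""
    dists = sorted(math.dist(center, p) for p in points)
    median = dists[len(dists) // 2]
    return [p for p in points if math.dist(center, p) > factor * median]

# Illustrative data: one tight cluster plus a single far-away point.
cluster = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (1.0, 1.2)]
outlier = (10.0, 10.0)
data = cluster + [outlier]

mean_c = mean_center(data)        # pulled toward the outlier
medoid_c = medoid_center(data)    # remains an actual cluster member
flagged = flag_outliers(data, medoid_c)

print("mean center:  ", mean_c)
print("medoid center:", medoid_c)
print("flagged:      ", flagged)
```

In this toy data the mean lands between the cluster and the far point, while the medoid is one of the cluster's own members, so distances measured from the medoid separate the far point clearly.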
Other Details
Paper ID: IJSRDV8I100196
Published in: Volume 8, Issue 10
Publication Date: 01/01/2021
Page(s): 484-489
