
Predicting Breast Cancer using Apache Spark Machine Learning Logistic Regression

Author(s):

Sujithra. S, SNS College Of Technology

Keywords:

Big Data, Hadoop Framework, Cancer Prediction, Map Reduce

Abstract

Breast cancer diagnosis and prognosis are two medical applications that pose a great challenge to researchers. Many scientific technologies offer rich information for medical decision-making, but that information may not be accurate or used to its full potential. The use of machine learning and data mining techniques has revolutionized the whole process of breast cancer diagnosis and prognosis. The objective of these predictions is to assign each patient to either a “benign” group, which is noncancerous, or a “malignant” group, which is cancerous. Existing work has applied different types of data mining algorithms in order to predict mortality well. Strong and sophisticated algorithms such as Bagging Logistic Regression, Support Vector Machine, k-Nearest Neighbors, Decision Tree, and Artificial Neural Networks have been used, with the conclusion that no single algorithm was best, since performance depended on the features of the large dataset. The large dataset was also a challenge for single-node tools with limited memory and computing power. In the proposed system, the main goal is to identify whether a sample observation is malignant or not. The system visualizes the data by their exact attributes through logistic regression analysis, which improves performance through intelligent optimizations, and it aims to achieve single-node analysis through the Spark framework. Spark not only raises processing speed and real-time performance but also achieves high fault tolerance and high scalability based on in-memory computing.
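To illustrate the benign/malignant decision rule that logistic regression applies, here is a minimal single-machine sketch in plain Python. It is not the paper's Spark pipeline: the two-feature synthetic data, the function names (`sigmoid`, `train_logistic`, `predict`), the learning rate, and the 0.5 threshold are all assumptions made for illustration. In Spark itself, the MLlib logistic regression classifier would take the place of the hand-written training loop.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function mapping a score to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(X)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x, threshold=0.5):
    """Label an observation by thresholding the predicted probability."""
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return "malignant" if p >= threshold else "benign"


# Synthetic, hypothetical data: two standardized features per sample.
# Label 0 = benign, label 1 = malignant. This is NOT real patient data.
random.seed(0)
benign = [[random.gauss(-1.0, 0.5), random.gauss(-1.0, 0.5)] for _ in range(50)]
malignant = [[random.gauss(1.0, 0.5), random.gauss(1.0, 0.5)] for _ in range(50)]
X = benign + malignant
y = [0] * 50 + [1] * 50

w, b = train_logistic(X, y)
print(predict(w, b, [2.0, 2.0]))
print(predict(w, b, [-2.0, -2.0]))
```

The model scores each observation with a weighted sum of its attributes, converts the score to a probability with the logistic function, and assigns the "malignant" label when that probability reaches the threshold.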

Other Details

Paper ID: IJSRDV4I110032
Published in: Volume : 4, Issue : 11
Publication Date: 01/02/2017
Page(s): 475-478
