
A Survey on Hadoop as a Solution of Big Data Processing on Cloud

Author(s):

Janvi Pankajbhai Patel, Noble Group of Institutions; Nirali Mankad, Noble Group of Institutions

Keywords:

Big Data, Hadoop, HDFS, Map Reduce, Cloud Computing

Abstract

Big data is an emerging paradigm applied to datasets whose size or complexity is beyond the ability of commonly used computer software and hardware tools. One of the key drivers of Big Data is Hadoop, a combination of HDFS (a distributed file system), MapReduce (a data processing model) and the YARN resource manager, which allocates cluster resources for job execution. Datasets often come from various sources (Variety) and are unstructured, such as social media, sensors, scientific applications, surveillance, video and image archives, Internet texts and documents, Internet search indexing, medical records, business transactions and web logs. They are large in size (Volume) and have fast data in/out (Velocity). More importantly, big data helps in business decision making (Veracity) and in gaining insight in real time, which is hard to achieve using traditional systems. It is estimated that about 40% of data globally will be touched by Cloud Computing. Cloud Computing provides strong storage, computation, network and distributed capability in support of Big Data processing. Every entity on the cloud is a virtual entity created by an underlying virtualization technique. The on-demand, easy-to-use, elastic, automated and pay-as-you-go (PAYG) properties of cloud services prove beneficial for auto scaling a Hadoop cluster. This paper presents a survey of big data, the issues with big data, and how Hadoop works.
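
To make the MapReduce processing model mentioned in the abstract concrete, the following is a minimal single-machine sketch of its three stages (map, shuffle, reduce) applied to word counting. It is only an illustration under stated assumptions, not the paper's method and not the Hadoop API: the function names map_phase, shuffle, reduce_phase and word_count are hypothetical, and a real Hadoop job would read its input from HDFS and run the map and reduce tasks in parallel across cluster nodes scheduled by YARN.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit one (word, 1) pair per word in an input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an in-memory list of lines.

    Hypothetical driver for illustration only; in Hadoop these stages are
    distributed across many machines rather than run in one process.
    """
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data on cloud", "Big Data with Hadoop"]))
```

Because each line is mapped independently and each key is reduced independently, both stages can be spread over many nodes, which is the property that lets this model scale with cluster size.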

Other Details

Paper ID: IJSRDV3I60389
Published in: Volume : 3, Issue : 6
Publication Date: 01/09/2015
Page(s): 684-687
