
Scalability Analysis and Improvement of Hadoop over H2Hadoop for BigData Analysis

Author(s):

Kalyani Patil, SKNCOE, Pune; Virendra Dakhode, SKNCOE, Pune

Keywords:

BigData, MapReduce, Hadoop, CJB Table

Abstract

Cloud computing offers users a range of services for processing BigData, such as S3 (Simple Storage Service) for storing data, EC2 (Elastic Compute Cloud) for building a private cloud computing environment, and EMR (Elastic MapReduce) for processing BigData. EMR generally uses the original Hadoop, a framework that allows distributed processing of large datasets across clusters of computers using simple programming models. Cloud computing therefore leverages the Hadoop framework to process BigData in parallel. Hadoop has certain limitations that reduce job efficiency. These limitations stem mostly from data locality in the cluster, job and task scheduling, and resource allocation. Efficient resource allocation remains a challenge in cloud computing MapReduce platforms. H2Hadoop is an improved system that addresses these limitations. The central idea of the H2Hadoop architecture is to store the metadata of executed jobs: a Common Job Blocks (CJB) table, which holds the metadata of similar jobs, is kept in the name node. The name node uses this table to direct future jobs to the specific data nodes that carry the required datasets. The CJB table is updated with new common features each time a new job reaches it. The size of this table should be controlled and limited to keep the architecture reliable and efficient. We propose an improved system, an enhanced Hadoop architecture that reduces the computation cost associated with BigData analysis. Compared with H2Hadoop, this system reduces CPU time, the number of read operations, and other Hadoop factors for sequences that have common features.
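
The CJB table idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use any Hadoop API. The names CJBTable, nodes_to_read, find_nodes and max_entries are hypothetical, and the least-recently-used eviction policy is an assumption: the abstract only says the table size should be limited, not how.

```python
from collections import OrderedDict


class CJBTable:
    """Toy Common Job Blocks table: common feature -> data nodes holding it.

    Hypothetical structure for illustration only. Eviction of the least
    recently used entry is an assumed policy, not taken from the paper.
    """

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def lookup(self, feature):
        """Return the cached data nodes for a feature, or None if unknown."""
        nodes = self._entries.get(feature)
        if nodes is not None:
            # Mark the entry as recently used.
            self._entries.move_to_end(feature)
        return nodes

    def record(self, feature, nodes):
        """Store the metadata of an executed job and enforce the size limit."""
        self._entries[feature] = frozenset(nodes)
        self._entries.move_to_end(feature)
        while len(self._entries) > self.max_entries:
            # Drop the least recently used entry.
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)


def nodes_to_read(table, feature, all_nodes, find_nodes):
    """Choose which data nodes a job should read for the given feature.

    find_nodes is a caller-supplied (hypothetical) function that scans
    all_nodes and returns those containing the feature. It stands in for
    the full read that a first-time job would perform.
    """
    cached = table.lookup(feature)
    if cached is not None:
        # A similar job ran before: go straight to the known data nodes.
        return cached
    found = frozenset(find_nodes(feature, all_nodes))
    table.record(feature, found)
    return found
```

In this sketch, the first job that searches for a given sequence scans every data node and records where the sequence was found. A later job with the same feature is directed only to those nodes, which is the source of the reduced read operations described above.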

Other Details

Paper ID: IJSRDV5I41323
Published in: Volume : 5, Issue : 4
Publication Date: 01/07/2017
Page(s): 1815-1819
