A Review Paper on Hadoop
Author(s):
Payal Gothi, Atharva College of Engineering, Mumbai; Riddhi Kamat, Atharva College of Engineering, Mumbai
Keywords:
Big Data, Hadoop, HDFS, MapReduce
Abstract |
The term Big Data describes collections of data sets so large and complex that they are difficult to store, process, and analyze using conventional database management tools and traditional relational database management systems. Earlier RDBMS systems attempted to handle such large volumes of unstructured and semi-structured data but could not manage them effectively; Hadoop was developed to address this gap. In this paper we present Hadoop and its two core components, HDFS and MapReduce. Hadoop is one of the few frameworks that supports storing unstructured data such as video, audio, and image files alongside conventional structured data. The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. MapReduce in Hadoop is a framework for writing applications that process large amounts of structured and unstructured data in parallel across clusters of thousands of machines, in a reliable and fault-tolerant manner.
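To illustrate the MapReduce processing model the abstract describes, the following is a minimal sketch of its three phases (map, shuffle, reduce) for a word count, the canonical MapReduce example. This is a plain-Python simulation under our own assumed function names, not actual Hadoop API code, which would be written in Java against the `org.apache.hadoop.mapreduce` classes.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit an intermediate (word, 1) pair for every word
    # in every input record, analogous to a Hadoop Mapper.
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all intermediate values by key, as the
    # framework does between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: aggregate the list of values for each key,
    # analogous to a Hadoop Reducer summing the counts.
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["Hadoop stores big data", "Hadoop processes big data"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts)  # {'hadoop': 2, 'stores': 1, 'big': 2, 'data': 2, 'processes': 1}
```

In a real Hadoop job these three stages run in parallel across many machines, with HDFS supplying the input splits and storing the final output; the fault tolerance the abstract mentions comes from re-executing failed map or reduce tasks on other nodes.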
Other Details |
Paper ID: NCTAAP011 | Published in: Conference 4: NCTAA 2016 | Publication Date: 29/01/2016 | Page(s): 44-46