
Efficient Big Data Modeling for Dataset Mapping, Cleaning and Association Problems

Author(s):

Ambreena Muneer, Global Institute of Technology and Management, Gurgaon; Yasmeen Baqal, Global Institute of Technology and Management, Gurgaon

Keywords:

Hadoop, Big Data, SQOOP, HQL, HDFS

Abstract

Efficient Big Data Modeling for Dataset Mapping, Cleaning and Association Problems is a Hadoop-based application that analyzes unstructured data from six web browser logs from different countries. This paper aims to provide a more user-friendly environment for activities such as collecting, storing, managing, analyzing and visualizing data. The analytics platform provides the means to convert unstructured data into structured data. It also offers a platform for linking data or tables in different formats, so that they can be analyzed and the desired operations performed to obtain appropriate results. The application uses Sqoop to transfer data between different media and HQL to process the data. The final result is stored as a file in HDFS (on a DataNode), from where the data can be sent to the client.
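
The conversion of unstructured log lines into structured records, followed by cleaning and a simple association by country and browser, can be illustrated with a minimal sketch. This is not the paper's implementation, which relies on Sqoop and HQL on Hadoop; it is plain Python for illustration only. The log layout, the field names, and the helpers `parse_line` and `clean_and_count` are all assumptions introduced here.

```python
import re
from collections import Counter

# Hypothetical log layout (an assumption, not taken from the paper):
#   <ip> <country> [<timestamp>] "<browser>" <url>
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) (?P<country>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<browser>[^"]+)" (?P<url>\S+)$'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def clean_and_count(lines):
    """Drop malformed lines and count hits per (country, browser) pair."""
    records = [r for r in map(parse_line, lines) if r is not None]
    counts = Counter((r["country"], r["browser"]) for r in records)
    return records, counts


# Made-up sample data, including one malformed line to be cleaned out.
sample = [
    '10.0.0.1 IN [2018-01-01T10:00:00] "Firefox" /index.html',
    '10.0.0.2 US [2018-01-01T10:00:05] "Chrome" /about.html',
    'corrupted line without the expected fields',
    '10.0.0.3 IN [2018-01-01T10:01:00] "Firefox" /contact.html',
]
records, counts = clean_and_count(sample)
```

In a Hadoop setting, the grouping step would roughly correspond to a `GROUP BY` query in HQL over a table holding the parsed fields.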

Other Details

Paper ID: IJSRDV6I60297
Published in: Volume 6, Issue 6
Publication Date: 01/09/2018
Page(s): 616-624
