
Cost Minimization for Big Data Processing in Geo-Distributed Data Centers


Soniya Saiffulali Khan, MGMCET; Tanmay Ramesh Naik, MGMCET; Ajay Dhananjay Temkar, MGMCET; Sameer Harishchandra Mejari, MGMCET; Prof. Madhuri Patil, MGMCET


Keywords: Big Data, Data Flow, Data Placement, Distributed Data Centers, Cost Minimization, Task Assignment


The explosive growth in demand for big data processing places a heavy burden on computation, storage, and communication in data centers, and therefore incurs considerable operational expenditure for data center providers. Cost minimization has thus become an emerging issue for the upcoming big data era. Unlike conventional cloud services, big data services are characterized by a tight coupling between data and computation: a computation task can be carried out only when the corresponding data is available. As a result, three factors, namely task assignment, data placement, and data movement, strongly influence the operational expenditure of data centers. In this paper, we study the cost minimization problem through a joint optimization of these three factors for big data services in geo-distributed data centers. To describe the task completion time while accounting for both data transmission and computation, we propose a two-dimensional Markov chain and derive the average task completion time in closed form. Furthermore, we model the problem as a mixed-integer non-linear program (MINLP) and propose an efficient solution that linearizes it. The high efficiency of our proposal is validated by extensive simulation-based studies.
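The abstract does not say how the MINLP is linearized. As a hedged illustration only, the sketch below shows one standard technique for removing a product of a binary variable and a bounded continuous variable, which is a common source of non-linearity in placement and assignment models. The names `product_bounds`, `x`, `y`, `z`, and `upper` are assumptions made for this example and do not come from the paper. The product z = x * y, with x in {0, 1} and 0 <= y <= upper, is replaced by four linear constraints: z <= upper * x, z <= y, z >= y - upper * (1 - x), and z >= 0.

```python
def product_bounds(x, y, upper):
    """Return the interval (lo, hi) of z values allowed by the four
    linear constraints that stand in for z = x * y.

    x     -- binary decision (0 or 1), e.g. a hypothetical placement flag
    y     -- continuous quantity with 0 <= y <= upper
    upper -- known upper bound on y (the "big-M" constant)
    """
    lo = max(0.0, y - upper * (1 - x))  # z >= 0 and z >= y - upper*(1 - x)
    hi = min(upper * x, y)              # z <= upper*x and z <= y
    return lo, hi


# For every binary x and feasible y, the interval collapses to the
# single point x * y, so the linear constraints are an exact substitute.
for x in (0, 1):
    for y in (0.0, 2.5, 10.0):
        lo, hi = product_bounds(x, y, upper=10.0)
        assert lo == hi == x * y
```

Because the allowed interval always collapses to a single value, a solver that sees only the linear constraints is forced to the same z as the original product, which is what lets a non-linear model of this shape be handed to a mixed-integer linear solver.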

Other Details

Paper ID: IJSRDV7I20149
Published in: Volume 7, Issue 2
Publication Date: 01/05/2019
Page(s): 1034-1043
