Hadoop Distributed File System-Experiment and Analysis for Optimum Performance

Rajeev Kumar Gupta, R. K. Pateriya

Abstract


The volume of data used in today’s enterprises has grown at exponential rates over the last few years. Simultaneously, the need to process and analyze large volumes of data has also increased. Hadoop is a popular open-source implementation of MapReduce for the analysis of large datasets. To manage storage resources across the cluster, Hadoop uses a distributed user-level filesystem. This filesystem, HDFS, is written in Java and designed to store very large datasets reliably and to stream those datasets at high bandwidth to user applications. This paper first reviews HDFS in detail. It then reports experimental work with Hadoop on big data and identifies the factors on which a Hadoop cluster's optimal performance depends. The paper concludes by outlining the practical challenges that Hadoop currently faces in the field.



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.