A Novel Approach to Secure and Encrypt Data Deduplication in Big Data
Storing the growing volume of digital data has become a challenging task. Big data stores typically contain a large amount of duplicate data, and existing deduplication techniques do not adequately improve the performance and efficiency of the system. This paper introduces a bucket-based deduplication technique in which the big data stream is divided into fixed-size chunks using a chunking algorithm. The generated chunks are then passed to an enhanced MD5 algorithm to compute a hash value for each chunk. MapReduce is used to detect duplicates by comparing these hash values with the hash values already stored in the bucket. Experimental results on a real dataset using Hadoop indicate that the proposed technique outperforms existing techniques and improves the efficiency of the system.
Authors: Dr. C. Murugamani and Dr. C. Berin Jones
Volume: Volume 14
Issues: Issue 5
Keywords: Big data, Deduplication, Chunking, Hadoop, Map Reduce.
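
The pipeline described in the abstract (fixed-size chunking, per-chunk hashing, and a lookup against hashes already stored in a bucket) can be illustrated with a minimal single-process sketch. It rests on several assumptions: the paper's "enhanced MD5" is not specified here, so standard MD5 from Python's hashlib is used as a stand-in; the map and reduce steps are plain functions that only mimic the MapReduce pattern rather than running on Hadoop; and the names CHUNK_SIZE, fixed_size_chunks, chunk_hash, map_phase, and reduce_phase are hypothetical, chosen for illustration only.

```python
import hashlib

# Assumed chunk size; tiny so the example is easy to follow.
CHUNK_SIZE = 4  # bytes


def fixed_size_chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split a byte stream into fixed-size chunks (the last one may be shorter)."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def chunk_hash(chunk: bytes) -> str:
    """Hash one chunk. Standard MD5 is a stand-in for the paper's enhanced MD5."""
    return hashlib.md5(chunk).hexdigest()


def map_phase(data: bytes):
    """Map step (simulated): emit a (hash, chunk) pair for every chunk."""
    return [(chunk_hash(chunk), chunk) for chunk in fixed_size_chunks(data)]


def reduce_phase(pairs, bucket: dict) -> int:
    """Reduce step (simulated): store unseen hashes in the bucket, count duplicates."""
    duplicates = 0
    for digest, chunk in pairs:
        if digest in bucket:
            duplicates += 1          # hash already stored: chunk is a duplicate
        else:
            bucket[digest] = chunk   # first occurrence: keep the chunk
    return duplicates


bucket = {}
stream = b"ABCDABCDEFGHABCD"  # chunks: ABCD, ABCD, EFGH, ABCD
duplicate_count = reduce_phase(map_phase(stream), bucket)
unique_count = len(bucket)
```

In this sketch the bucket is an in-memory dictionary keyed by hash value, so each duplicate check is a single lookup. Only the first occurrence of each chunk is stored, which is where the storage saving in a deduplication scheme of this kind comes from.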