Blockchain Enabled Hadoop Distributed File System Framework for Secure and Reliable Traceability

  • Manish Kumar Gupta
    Department of Information Technology & Computer Application, Madan Mohan Malaviya University of Technology, Gorakhpur, U.P., India, 273010 manish.testing09[at]gmail.com
  • Rajendra Kumar Dwivedi
    Department of Information Technology & Computer Application, Madan Mohan Malaviya University of Technology, Gorakhpur, U.P., India, 273010

Abstract

Hadoop Distributed File System (HDFS) is a distributed file system that allows large amounts of data to be stored and processed across multiple servers in a Hadoop cluster. It provides high-throughput data access and makes it possible to manage vast amounts of data on commodity hardware. However, security vulnerabilities in HDFS can be exploited for malicious purposes. This underlines the need for strong security measures for file sharing within Hadoop and for a reliable mechanism to verify the legitimacy of shared files. The objective of this paper is to enhance the security of HDFS using a blockchain-based technique. The proposed model uses the enterprise-grade Hyperledger Fabric platform and leverages the metadata of files to establish dependable security and traceability of data within HDFS. The results indicate that the proposed model incurs slightly higher overhead than plain HDFS and requires more storage space; this is considered an acceptable trade-off for the improved security.
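The idea of recording file metadata on a ledger so that shared files can later be verified can be illustrated with a minimal sketch. This is not the paper's implementation and does not use the Hyperledger Fabric API: `MetadataLedger`, its methods, and the metadata fields (`path`, `owner`, `size`, `content_hash`) are hypothetical names chosen for illustration, and a simple in-memory hash chain stands in for the blockchain.

```python
import hashlib
import json
from dataclasses import dataclass, field


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class MetadataLedger:
    """Hypothetical in-memory stand-in for a blockchain ledger (not the Fabric API)."""
    entries: list = field(default_factory=list)

    def record(self, path: str, content: bytes, owner: str) -> dict:
        """Append one file's metadata, linked to the previous entry by its hash."""
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        body = {
            "path": path,
            "owner": owner,
            "size": len(content),
            "content_hash": sha256_hex(content),
            "prev_hash": prev_hash,
        }
        # The entry hash covers every metadata field plus the link to the previous entry.
        entry_hash = sha256_hex(json.dumps(body, sort_keys=True).encode())
        entry = {**body, "entry_hash": entry_hash}
        self.entries.append(entry)
        return entry

    def chain_is_intact(self) -> bool:
        """Recompute every hash and link; any edited entry breaks the chain."""
        prev = "0" * 64
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "entry_hash"}
            if e["prev_hash"] != prev:
                return False
            if sha256_hex(json.dumps(body, sort_keys=True).encode()) != e["entry_hash"]:
                return False
            prev = e["entry_hash"]
        return True

    def file_is_legitimate(self, path: str, content: bytes) -> bool:
        """Compare a file's current content with its most recent recorded hash."""
        for e in reversed(self.entries):
            if e["path"] == path:
                return e["content_hash"] == sha256_hex(content)
        return False  # unknown files are treated as unverified
```

Under these assumptions, each stored file costs one extra ledger entry and one extra hash computation, which is in line with the storage and overhead trade-off the abstract describes.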

How to cite

Gupta, M. K., & Dwivedi, R. K. (2023). Blockchain Enabled Hadoop Distributed File System Framework for Secure and Reliable Traceability. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 12(1), e31478. https://doi.org/10.14201/adcaij.31478
