Blockchain Enabled Hadoop Distributed File System Framework for Secure and Reliable Traceability
Abstract Hadoop Distributed File System (HDFS) is a distributed file system that allows large amounts of data to be stored and processed across multiple servers in a Hadoop cluster. HDFS also provides high throughput for data access. HDFS enables the management of vast amounts of data using commodity hardware. However, security vulnerabilities in HDFS can be manipulated for malicious purposes. This emphasizes the significance of establishing strong security measures to facilitate file sharing within Hadoop and implementing a reliable mechanism for verifying the legitimacy of shared files. The objective of this paper is to enhance the security of HDFS by utilizing a blockchain-based technique. The proposed model uses the Hyperledger Fabric platform at the enterprise level to leverage metadata of files, thereby establishing dependable security and traceability of data within HDFS. The analysis of results indicates that the proposed model incurs a slightly higher overhead compared to HDFS and requires more storage space. However, this is considered an acceptable trade-off for the improved security.
- Referencias
- Cómo citar
- Del mismo autor
- Métricas
Androulaki, E., Barger, A., Bortnikov, V., Cachin, C., Christidis, K., De Caro, A., Enyeart, D., Ferris, C., Laventman, G., Manevich, Y., et al., 2018. Hyperledger Fabric: A Distributed Operating System for Permissioned Blockchains. Proceedings of the Thirteenth EuroSys Conference, Porto, April 2018, 1–15. 10.1145/3190508.319053
Apache hive, 2013. URL https://hive.apache.org/.
Aujla, G. S.; Chaudhary, R.; Kumar, N.; Das, A. K.; Rodrigues, J. J. P. C., 2018. SecSVA: Secure Storage, Verification, and Auditing of BD in the Cloud Environment. Imminent Communication Technologies for Smart Communities, pp. 78–85. 10.1109/MCOM.2018.1700379
Azzaoui, A. E. L.; Sharma, P. K.; Park, J. H., 2022. Blockchain-based delegated Quantum Cloud architecture for medical big data security. Journal of Network and Computer Applications, 198, 103304. 10.1016/j.jnca.2021.103304
BD Working Group; Cloud Security Alliance (CSA). Expanded Top Ten BD Security and Privacy, 2013, April. Available online:https://downloads.cloudsecurityalliance.org/initiatives/bdwg/Expanded_Top_Ten_Big_Data_Security_and_Privacy_Challenges.pdf (accessed on 9 December 2015).
Cachin, C., 2016. Architecture of the Hyperledger Blockchain Fabric. Workshop on Distributed Cryptocurrencies and Consensus Ledgers. github.com/hyperledger/fabric
Cato, P.; Gölzer, P.; Demmelhuber, W., 2015. An investigation into the implementation factors affecting the success of BD systems. In 2015 11th International Conference on Innovations in Information Technology (IIT), pp. 134–139. 10.1109/INNOVATIONS.2015.7381528
Dong, X.; Li, R.; He, H.; Zhou, W., Xue, Z.; Wu, H., 2015. Secure sensitive data sharing on a big data platform. Tsinghua Science and Technology, 20(1), 72–80. 10.1109/TST.2015.7040516
Gantz, J.; Reinsel, D., 2011. Extracting value from chaos- IDC view, 1142, 1–12.
Guan, S.; Zhang, C.; Wang, Y.; Liu, W., 2023. Hadoop-based secure storage solution for big data in cloud computing environment. Digital Communications and Networks. 10.1016/j.dcan.2023.01.014
Gupta, M. K.; Pandey, S. K.; Gupta, A, 2022. HADOOP- An Open Source Framework for BD. In 2022, 3rd International Conference on Intelligent Engineering and Management (ICIEM). 10.1109/ICIEM54221.2022.9853179
Jindal, A.; Kumar, N.; Singh, M., 2020. A unified framework for BD acquisition, storage, and analytics for demand response management in smart cities. Future Generation Computer Systems, 108, pp. 921–934. 10.1016/j.future.2018.02.039
Khalid Yousif, M.; Dallalbashi, Z. E.; Kareem, S. W., 2023. Information security for big data using the NTRUEncrypt method. Measurement: Sensors, 27, 100738. 10.1016/j.measen.2023.100738
Lai, W., et al., 2014. Towards a framework for large-scale multimedia data storage and processing on Hadoop platform. The Journal of Supercomputing, 68(1), 1–20. 10.1007/s11227-013-1050-4
Liu, C. H.; Lin, Q.; Wen, S. 2019. Blockchain-enabled data collection and sharing for industrial IoT with deep reinforcement learning. IEEE Transactions on Industrial Informatics, 15(6), 3516–3526. 10.1109/TII.2018.2890203
Liu, G.; Dong, H.; Yan, Z.; Zhou, X.; Shimizu, S., 2022. B4SDC: A Blockchain System for Security Data Collection in MANETs. IEEE Transactions on Big Data, 8(3), pp. 739–752. 10.1109/TBDATA.2020.2981438
Mohanraj, T.; Santosh, R. 2022. Hybrid Encryption Algorithm for Big Data Security in the Hadoop Distributed File System. Computer Assisted Methods in Engineering and Science, 29(1-2), 33–48. 10.24423/cames.375
Mothukuri, V.; Cheerla, S. S.; Parizi, R. M.; Zhang, Q. “BlockHDFS: Blockchain-integrated Hadoop distributed file system for secure provenance traceability. Blockchain: Research and Applications, 2(1). 10.1016/j.bcra.2021.100032
Peters, G. W.; Panayi, E. 2016. Understanding Modern Banking Ledgers through Blockchain Technologies: Future of Transaction Processing and Smart Contracts on the Internet of Money. In Banking Beyond Banks and Money (pp. 239–278). Springer International Publishing. 10.1007/978-3-319-42448-4_13
Sarosh, P.; Parah, S. A.; Bhat, G. M.; Muhammad, K., 2021. A Security Management Framework for Big Data in Smart Healthcare. Big Data Research, 25, 100225. 10.1016/j.bdr.2021.100225
Sharma, P.; Borah, M. D.; Namasudra, S., 2021. Improving the security of medical big data by using Blockchain technology. Computers & Electrical Engineering, 96, 107529. 10.1016/j.compeleceng.2021.107529
Shvachko, K.; Kuang, H.; Radia, S.; Chansler, R., 2010. The hadoop distributed file system. IEEE 26th Sym- Posium on Mass Storage Systems and Technologies (MSST); 3–7 May 2010; Incline Village, NV, USA, IEEE, Piscataway, NJ, USA, 2010, pp. 1–10. 10.1109/MSST.2010.5496972
Su, Z.; Xu, Q., 2021. Security-aware resource allocation for mobile social BD: A matching-coalitional game solution. IEEE Transactions on BD, 7, 632–642. 10.1109/TBDATA.2017.2700318
Uchibeke, U. U.; Kassani, S. H.; Schneider, K. A.; Deters, R., 2018. Blockchain Access Control Ecosystem for Big Data Security. 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), 1373–1378.
Viriyasitavat, W.; Hoonsopon, D., 2019. Blockchain characteristics and consensus in modern business processes. Journal of Industrial Information Integration, 13, 32–39. 10.1016/j.jii.2018.07.004
Wu, J.; Ota, K.; Dong, M.; Li, J.; Wang, H., 2018. BD analysis-based security situational awareness for smart grid. IEEE Transactions on BD, 4(3), 408–417. 10.1109/TBDATA.2016.2616146
Xu, L. D.; Viriyasitavat, W., 2019. Application of blockchain in collaborative internet-of-things services. IEEE Transactions on Computational Social Systems, 6(6), 1295–1305. 10.1109/TCSS.2019.2913165
Xu, X.; Zhang, X.; Gao, H.; Xue, Y.; Qi, L.; Dou, W., 2020. BeCome: Blockchain-enabled computation offloading for IoT in mobile edge computing. IEEE Transactions on Industrial Informatics, 16(6), 4187–4195. 10.1109/TII.2019.2936869
Zaharia, M.; Xin, R. S.; Wendell, P.; Das, T.; Armbrust, M.; Dave, A.; Meng, X.; Rosen, J.; Venkataraman, S.; Franklin, M. J. et al., 2016. Apache spark: a unified engine for BD processing, Commun. ACM, 59(11), 56–65. 10.1145/2934664
Zhou, Z.; Wang, M.; Huang, J.; Lin, S.; Lv, Z., 2022. Blockchain in Big Data Security for Intelligent Transportation With 6G. IEEE Transactions on Intelligent Transportation Systems, 23(7), 9736–9746. 10.1109/TITS.2021.3107011
Apache hive, 2013. URL https://hive.apache.org/.
Aujla, G. S.; Chaudhary, R.; Kumar, N.; Das, A. K.; Rodrigues, J. J. P. C., 2018. SecSVA: Secure Storage, Verification, and Auditing of BD in the Cloud Environment. Imminent Communication Technologies for Smart Communities, pp. 78–85. 10.1109/MCOM.2018.1700379
Azzaoui, A. E. L.; Sharma, P. K.; Park, J. H., 2022. Blockchain-based delegated Quantum Cloud architecture for medical big data security. Journal of Network and Computer Applications, 198, 103304. 10.1016/j.jnca.2021.103304
BD Working Group; Cloud Security Alliance (CSA). Expanded Top Ten BD Security and Privacy, 2013, April. Available online:https://downloads.cloudsecurityalliance.org/initiatives/bdwg/Expanded_Top_Ten_Big_Data_Security_and_Privacy_Challenges.pdf (accessed on 9 December 2015).
Cachin, C., 2016. Architecture of the Hyperledger Blockchain Fabric. Workshop on Distributed Cryptocurrencies and Consensus Ledgers. github.com/hyperledger/fabric
Cato, P.; Gölzer, P.; Demmelhuber, W., 2015. An investigation into the implementation factors affecting the success of BD systems. In 2015 11th International Conference on Innovations in Information Technology (IIT), pp. 134–139. 10.1109/INNOVATIONS.2015.7381528
Dong, X.; Li, R.; He, H.; Zhou, W., Xue, Z.; Wu, H., 2015. Secure sensitive data sharing on a big data platform. Tsinghua Science and Technology, 20(1), 72–80. 10.1109/TST.2015.7040516
Gantz, J.; Reinsel, D., 2011. Extracting value from chaos- IDC view, 1142, 1–12.
Guan, S.; Zhang, C.; Wang, Y.; Liu, W., 2023. Hadoop-based secure storage solution for big data in cloud computing environment. Digital Communications and Networks. 10.1016/j.dcan.2023.01.014
Gupta, M. K.; Pandey, S. K.; Gupta, A, 2022. HADOOP- An Open Source Framework for BD. In 2022, 3rd International Conference on Intelligent Engineering and Management (ICIEM). 10.1109/ICIEM54221.2022.9853179
Jindal, A.; Kumar, N.; Singh, M., 2020. A unified framework for BD acquisition, storage, and analytics for demand response management in smart cities. Future Generation Computer Systems, 108, pp. 921–934. 10.1016/j.future.2018.02.039
Khalid Yousif, M.; Dallalbashi, Z. E.; Kareem, S. W., 2023. Information security for big data using the NTRUEncrypt method. Measurement: Sensors, 27, 100738. 10.1016/j.measen.2023.100738
Lai, W., et al., 2014. Towards a framework for large-scale multimedia data storage and processing on Hadoop platform. The Journal of Supercomputing, 68(1), 1–20. 10.1007/s11227-013-1050-4
Liu, C. H.; Lin, Q.; Wen, S. 2019. Blockchain-enabled data collection and sharing for industrial IoT with deep reinforcement learning. IEEE Transactions on Industrial Informatics, 15(6), 3516–3526. 10.1109/TII.2018.2890203
Liu, G.; Dong, H.; Yan, Z.; Zhou, X.; Shimizu, S., 2022. B4SDC: A Blockchain System for Security Data Collection in MANETs. IEEE Transactions on Big Data, 8(3), pp. 739–752. 10.1109/TBDATA.2020.2981438
Mohanraj, T.; Santosh, R. 2022. Hybrid Encryption Algorithm for Big Data Security in the Hadoop Distributed File System. Computer Assisted Methods in Engineering and Science, 29(1-2), 33–48. 10.24423/cames.375
Mothukuri, V.; Cheerla, S. S.; Parizi, R. M.; Zhang, Q. “BlockHDFS: Blockchain-integrated Hadoop distributed file system for secure provenance traceability. Blockchain: Research and Applications, 2(1). 10.1016/j.bcra.2021.100032
Peters, G. W.; Panayi, E. 2016. Understanding Modern Banking Ledgers through Blockchain Technologies: Future of Transaction Processing and Smart Contracts on the Internet of Money. In Banking Beyond Banks and Money (pp. 239–278). Springer International Publishing. 10.1007/978-3-319-42448-4_13
Sarosh, P.; Parah, S. A.; Bhat, G. M.; Muhammad, K., 2021. A Security Management Framework for Big Data in Smart Healthcare. Big Data Research, 25, 100225. 10.1016/j.bdr.2021.100225
Sharma, P.; Borah, M. D.; Namasudra, S., 2021. Improving the security of medical big data by using Blockchain technology. Computers & Electrical Engineering, 96, 107529. 10.1016/j.compeleceng.2021.107529
Shvachko, K.; Kuang, H.; Radia, S.; Chansler, R., 2010. The hadoop distributed file system. IEEE 26th Sym- Posium on Mass Storage Systems and Technologies (MSST); 3–7 May 2010; Incline Village, NV, USA, IEEE, Piscataway, NJ, USA, 2010, pp. 1–10. 10.1109/MSST.2010.5496972
Su, Z.; Xu, Q., 2021. Security-aware resource allocation for mobile social BD: A matching-coalitional game solution. IEEE Transactions on BD, 7, 632–642. 10.1109/TBDATA.2017.2700318
Uchibeke, U. U.; Kassani, S. H.; Schneider, K. A.; Deters, R., 2018. Blockchain Access Control Ecosystem for Big Data Security. 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), 1373–1378.
Viriyasitavat, W.; Hoonsopon, D., 2019. Blockchain characteristics and consensus in modern business processes. Journal of Industrial Information Integration, 13, 32–39. 10.1016/j.jii.2018.07.004
Wu, J.; Ota, K.; Dong, M.; Li, J.; Wang, H., 2018. BD analysis-based security situational awareness for smart grid. IEEE Transactions on BD, 4(3), 408–417. 10.1109/TBDATA.2016.2616146
Xu, L. D.; Viriyasitavat, W., 2019. Application of blockchain in collaborative internet-of-things services. IEEE Transactions on Computational Social Systems, 6(6), 1295–1305. 10.1109/TCSS.2019.2913165
Xu, X.; Zhang, X.; Gao, H.; Xue, Y.; Qi, L.; Dou, W., 2020. BeCome: Blockchain-enabled computation offloading for IoT in mobile edge computing. IEEE Transactions on Industrial Informatics, 16(6), 4187–4195. 10.1109/TII.2019.2936869
Zaharia, M.; Xin, R. S.; Wendell, P.; Das, T.; Armbrust, M.; Dave, A.; Meng, X.; Rosen, J.; Venkataraman, S.; Franklin, M. J. et al., 2016. Apache spark: a unified engine for BD processing, Commun. ACM, 59(11), 56–65. 10.1145/2934664
Zhou, Z.; Wang, M.; Huang, J.; Lin, S.; Lv, Z., 2022. Blockchain in Big Data Security for Intelligent Transportation With 6G. IEEE Transactions on Intelligent Transportation Systems, 23(7), 9736–9746. 10.1109/TITS.2021.3107011
Gupta, M. K., & Dwivedi, R. K. (2023). Blockchain Enabled Hadoop Distributed File System Framework for Secure and Reliable Traceability. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 12(1), e31478. https://doi.org/10.14201/adcaij.31478
Most read articles by the same author(s)
- Manish Kumar Gupta, Rajendra Kumar Dwivedi, Beaf:BD – A Blockchain Enabled Authentication Framework for Big Data , ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal: Vol. 12 (2023)
Downloads
Download data is not yet available.
+
−