ADDP: The Data Prefetching Protocol for Monitoring Capacity Misses

Swapnita Srivastava; P. K. Singh

doi:10.14201/adcaij.31782

ADDP: The Data Prefetching Protocol for Monitoring Capacity Misses

Swapnita Srivastava

Department of Computer Science and Engineering, Madan Mohan Malaviya University. Deoria Road, Singhariya, Kunraghat, Gorakhpur, Uttar Pradesh 273016, India swapnitasrivastava[at]gmail.com
P. K. Singh

Department of Computer Science and Engineering, Madan Mohan Malaviya University. Deoria Road, Singhariya, Kunraghat, Gorakhpur, Uttar Pradesh 273016, India

Resumen

Prefetching is essential to minimizing the number of misses in cache and improving processor performance. Many prefetchers have been proposed, including simple but highly effective stream-based prefetchers and prefetchers that predict complex access patterns based on structures such as history buffers and bit vectors. However, many cache misses still occur in many applications. After analyzing the various techniques in Instruction and Data Prefetcher, several key features were extracted which impact system performance. Data prefetching is an essential technique used in all commercial processors. Data prefetchers aim at hiding the long data access latency. In this paper, we present the design of an Adaptive Delta-based Data Prefetching (ADDP) that employs four different tables organized in a hierarchical manner to address the diversity of access patterns. Firstly, the Entry Table is queue, which tracks recent cache fill. Secondly, the Predict Table which has trigger (Program Counter) PCs as tags. Thirdly, the (Address Difference Table) ADT which has target PCs as tags. Lastly, the Prefetch Table is divided into two parts, i.e., Prefetch Filter and the actual Prefetch Table. The Prefetch Filter table filters unnecessary prefetch accesses and the Prefetch Table is used to track other additional information for each prefetch. The ADDP has been implemented in a multicache-level prefetching system under the 3rd Data Prefetching Championship (DPC-3) framework. ADDP is an effective solution for data-intensive applications since it shows notable gains in cache hit rates and latency reduction. The simulation results show that ADDP outperforms the top three data prefetchers MLOP, SPP and BINGO by 5.312 %, 13.213 % and 10.549 %, respectively.

Referencias
Cómo citar
Del mismo autor
Métricas

Abella, J., González, A., Vera, X., & O'Boyle, M. F. (2005). IATAC: A smart predictor to turn off L2 cache lines. ACM Transactions on Architecture and Code Optimization (TACO), 2(1), 55-77. https://doi.org/10.1145/1061277.1061282

Akram, A., & Sawalha, L. (2019). A survey of computer architecture simulation techniques and tools. IEEE Access, 7, 78120-78145. https://doi.org/10.1109/ACCESS.2019.2921799

Aleem, M., Islam, M. A., & Iqbal, M. A. (2016). A comparative study of heterogeneous processor simulators. International Journal of Computer Applications, 148(12). https://doi.org/10.5120/ijca2016911332

Bakhshalipour, M., Shakerinava, M., Lotfi-Kamran, P., & Sarbazi-Azad, H. (2019). Accurately and maximally prefetching spatial data access patterns with bingo. The Third Data Prefetching Championship. https://doi.org/10.1145/3309200.3309204

Bao, W., Krishnamoorthy, S., Pouchet, L. N., & Sadayappan, P. (2017). Analytical modeling of cache behavior for affine programs. Proceedings of the ACM on Programming Languages, 2(POPL), 1-26. https://doi.org/10.1145/3158103

Bera, R., Kanellopoulos, K., Nori, A., Shahroodi, T., Subramoney, S., & Mutlu, O. (2021, October). Pythia: A customizable hardware prefetching framework using online reinforcement learning. En MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (pp. 1121-1137). https://doi.org/10.1145/3466752.3480071

Borgström, G., Sembrant, A., & Black-Schaffer, D. (2017, January). Adaptive cache warming for faster simulations. En Proceedings of the 9th Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools (pp. 1-7). https://doi.org/10.1145/3156403.3156404

Butt, A. R., Gniady, C., & Hu, Y. C. (2007). The performance impact of kernel prefetching on buffer cache replacement algorithms. IEEE Transactions on Computers, 56(7), 889-908. https://doi.org/10.1109/TC.2007.1048

Byna, S., Chen, Y., & Sun, X. H. (2008, May). A taxonomy of data prefetching mechanisms. En 2008 International Symposium on Parallel Architectures, Algorithms, and Networks (I-SPAN 2008) (pp. 19-24). IEEE. https://doi.org/10.1109/I-SPAN.2008.4549991

Duong, N., Zhao, D., Kim, T., Cammarota, R., Valero, M., & Veidenbaum, A. V. (2012, December). Improving cache management policies using dynamic reuse distances. En 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture (pp. 389-400). IEEE. https://doi.org/10.1109/MICRO.2012.42

GitHub - ChampSim/ChampSim: ChampSim repository. (s.f.). Recuperado el 14 de abril de 2022, de https://github.com/ChampSim/ChampSim

Hosseinzadeh, M., Moghim, N., Taheri, S., & Gholami, N. (2024). A new cache replacement policy in named data network based on FIB table information. Telecommunication Systems, 1-12. https://doi.org/10.1007/s11235-023-01012-5

Johnson, T., & Shasha, D. (1994, September). 2Q: A low overhead high-performance buffer management replacement algorithm. En Proceedings of the 20th International Conference on Very Large Data Bases (pp. 439-450). https://doi.org/10.5555/645920.672996

Kim, J., Pugsley, S. H., Gratz, P. V., Reddy, A. N., Wilkerson, C., & Chishti, Z. (2016, October). Path confidence-based lookahead prefetching. En 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (pp. 1-12). IEEE. https://doi.org/10.1109/MICRO.2016.7783725

Li, X., Sun, H., & Huang, Y. (2024). Efficient flow table caching architecture and replacement policy for SDN switches. Journal of Network and Systems Management, 32(3), 60. https://doi.org/10.1007/s10922-024-09685-6

Liu, E., Hashemi, M., Swersky, K., Ranganathan, P., & Ahn, J. (2020, November). An imitation learning approach for cache replacement. En International Conference on Machine Learning (pp. 6237-6247). PMLR. https://proceedings.mlr.press/v119/liu20g.html

Long, X., Gong, X., Zhang, B., & Zhou, H. (2023). Deep learning-based data prefetching in CPU-GPU unified virtual memory. Journal of Parallel and Distributed Computing, 174, 19-31. https://doi.org/10.1016/j.jpdc.2023.01.005

Luo, H., Li, P., & Ding, C. (2017). Thread data sharing in cache: Theory and measurement. ACM SIGPLAN Notices, 52(8), 103-115. https://doi.org/10.1145/3155284.2790007

Nakamura, T., Koizumi, T., Degawa, Y., Irie, H., Sakai, S., & Shioya, R. (2020). D-jolt: Distant jolt prefetcher. The 1st Instruction Prefetching Championship (IPC1). https://doi.org/10.1145/3400302.3415760

Pakalapati, S., & Panda, B. (2019). Bouquet of instruction pointers: Instruction pointer classifier-based hardware prefetching. The 3rd Data Prefetching Championship. https://doi.org/10.1145/3309200.3309205

Patterson, D. A. (2006). Future of computer architecture. Berkeley EECS Annual Research Symposium (BEARS2006). University of California at Berkeley, CA. https://doi.org/10.1109/BEARS.2006.1704817

Pugsley, S. H., Chishti, Z., Wilkerson, C., Chuang, P. F., Scott, R. L., Jaleel, A., & Balasubramonian, R. (2014, February). Sandbox prefetching: Safe run-time evaluation of aggressive prefetchers. En 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA) (pp. 626-637). IEEE. https://doi.org/10.1109/HPCA.2014.6835962

Qureshi, M. K., & Patt, Y. N. (2006, December). Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches. En 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’06) (pp. 423-432). IEEE. https://doi.org/10.1109/MICRO.2006.49

Seznec, A. (2020, May). The FNL+ MMA instruction cache prefetcher. En IPC-1-First Instruction Prefetching Championship (pp. 1-5). https://doi.org/10.1145/3400302.3415761

Shakerinava, M., Bakhshalipour, M., Lotfi-Kamran, P., & Sarbazi-Azad, H. (2019). Multi-lookahead offset prefetching. The Third Data Prefetching Championship. https://doi.org/10.1145/3309200.3309206

Souza, M. A., & Freitas, H. C. (2024). Reinforcement learning-based cache replacement policies for multicore processors. IEEE Access. https://doi.org/10.1109/ACCESS.2024.3314805

SPEC CPU® 2017. (s.f.). Recuperado el 14 de abril de 2022, de https://www.spec.org/cpu2017/

Srinath, S., Mutlu, O., Kim, H., & Patt, Y. N. (2007, February). Feedback directed prefetching: Improving the performance and bandwidth-efficiency of hardware prefetchers. En 2007 IEEE 13th International Symposium on High Performance Computer Architecture (pp. 63-74). IEEE. https://doi.org/10.1109/HPCA.2007.346203

Srivastava, S., & Sharma, S. (2019, January). Analysis of cyber-related issues by implementing data mining algorithm. En 2019 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence) (pp. 606-610). IEEE. https://doi.org/10.1109/CONFLUENCE.2019.8776900

Third Data Prefetching Championship | SIGARCH. (s.f.). Recuperado el 14 de abril de 2022, de https://www.sigarch.org/call-contributions/third-data-prefetching-championship/

Wang, Z., Hu, J., Min, G., Zhao, Z., & Wang, Z. (2024). Agile cache replacement in edge computing via offline-online deep reinforcement learning. IEEE Transactions on Parallel and Distributed Systems, 35(4), 663-674. https://doi.org/10.1109/TPDS.2024.3320449

Wu, C. J., & Martonosi, M. (2011, April). Characterization and dynamic mitigation of intra-application cache interference. En IEEE ISPASS: IEEE International Symposium on Performance Analysis of Systems and Software (pp. 2-11). IEEE. https://doi.org/10.1109/ISPASS.2011.5762690

Young, V., Chou, C. C., Jaleel, A., & Qureshi, M. (2017, June). Ship++: Enhancing signature-based hit predictor for improved cache performance. En Proceedings of the Cache Replacement Championship (CRC’17) held in conjunction with the International Symposium on Computer Architecture (ISCA’17). https://doi.org/10.1145/3109904.3109913

Yovita, L. V., Wibowo, T. A., Ramadha, A. A., Satriawan, G. P., & Raniprima, S. (2022). Performance analysis of cache replacement algorithm using virtual named data network nodes. Jurnal Online Informatika, 7(2), 203-210. https://doi.org/10.15575/join.v7i2.1689

Srivastava, S., & Singh, P. K. (2025). ADDP: The Data Prefetching Protocol for Monitoring Capacity Misses. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 14, e31782. https://doi.org/10.14201/adcaij.31782

Descargar cita

Artículos más leídos del mismo autor/a

Mohammad Faiz, Bakkanarappa Gari Mounika, Mohd Akbar, Swapnita Srivastava, Deep and Machine Learning for Acute Lymphoblastic Leukemia Diagnosis: A Comprehensive Review , ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal: Vol. 13 (2024)
Ashok Kumar Rai, Lalit Kumar Tyagi, Anoop Kumar, Swapnita Srivastava, Naushen Fatima, Enhancing Energy Efficiency in Cluster Based WSN using Grey Wolf Optimization , ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal: Vol. 12 (2023)

Descargas

Los datos de descargas todavía no están disponibles.

+ −

Fechas editoriales

Enviado:

2023-11-08

Aceptado:

2025-02-18

Publicado:

2025-04-11

Número

Vol. 14 (2025)

Sección

Artículos

Palabras clave

data prefetcher
instruction prefetcher
multi-core
instruction per cycle
coverage

Agencias de apoyo

Esta investigación no contó con financiación

Licencia

Derechos de autor 2025 Swapnita Srivastava , P K Singh

Esta obra está bajo una licencia internacional Creative Commons Atribución-NoComercial-SinDerivadas 4.0.