Enhancing Performance of a Deep Neural Network: A Comparative Analysis of Optimization Algorithms

Abstract

Choosing the most suitable optimization algorithm (optimizer) for a neural network model is among the most important decisions in deep learning, across all classes of neural networks, and in practice it comes down to trial-and-error experimentation. In this paper, we experiment with seven of the most popular optimization algorithms, namely SGD, RMSprop, AdaGrad, AdaDelta, Adam, Adamax and Nadam, on four unrelated datasets, each evaluated separately, to determine which one gives our deep neural network the best accuracy, efficiency and overall performance. This work provides an insightful analysis to help data scientists choose the best optimizer when modelling their deep neural networks.
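
To make the experimental setup concrete, the following sketch shows how such an optimizer comparison can be run. It is a minimal illustration, not the paper's actual code: the Keras framework is inferred from the lowercase optimizer identifiers in the abstract, and the network architecture, the MNIST dataset and all hyperparameters are assumptions made purely for demonstration.

    # Minimal sketch of a seven-optimizer comparison (assumed Keras setup;
    # architecture, dataset and hyperparameters are illustrative only).
    from tensorflow import keras

    OPTIMIZERS = ["sgd", "rmsprop", "adagrad", "adadelta", "adam", "adamax", "nadam"]

    # Example dataset: MNIST, chosen here purely for illustration.
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0

    def build_model():
        # A small fully connected network, rebuilt from scratch for each
        # optimizer so that every run starts from fresh weights.
        return keras.Sequential([
            keras.layers.Flatten(input_shape=(28, 28)),
            keras.layers.Dense(128, activation="relu"),
            keras.layers.Dense(10, activation="softmax"),
        ])

    results = {}
    for name in OPTIMIZERS:
        model = build_model()
        # Keras resolves each string identifier to the corresponding optimizer.
        model.compile(optimizer=name,
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        model.fit(x_train, y_train, epochs=5, verbose=0)
        _, acc = model.evaluate(x_test, y_test, verbose=0)
        results[name] = acc

    # Report test accuracy per optimizer, best first.
    for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
        print(f"{name:10s} test accuracy: {acc:.4f}")
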
How to Cite

Fatima, N. (2020). Enhancing Performance of a Deep Neural Network: A Comparative Analysis of Optimization Algorithms. ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 9(2), 79–90. https://doi.org/10.14201/ADCAIJ2020927990


Author Biography

Noor Fatima
Aligarh Muslim University
Undergraduate student, Department of Computer Science, Aligarh Muslim University, Aligarh