An Adaptive Variance Reduction Zeroth-Order Algorithm for Finite-Sum Optimization
DOI: https://doi.org/10.54097/fcis.v3i3.8568

Keywords: Zeroth-order Algorithm, Adaptive Algorithm, Variance Reduction, Finite-sum Optimization

Abstract
Unconstrained finite-sum optimization is a common problem class in the field of optimization, yet zeroth-order algorithms for it remain relatively underexplored. To solve unconstrained finite-sum optimization problems with non-convex objectives, we propose a zeroth-order optimization algorithm with adaptive variance reduction, called ZO-AdaSPIDER for short, and analyze its convergence behavior. The theoretical results show that the ZO-AdaSPIDER algorithm converges to an ε-stationary point of non-convex functions and admits a provable convergence rate.
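To make the ingredients named in the abstract concrete, the following is a minimal sketch of a zeroth-order SPIDER-style loop with an adaptive step size. It assumes a two-point Gaussian-smoothing finite-difference gradient estimator, a SPIDER recursion with periodic full-batch checkpoints, and an AdaGrad-style step-size rule; the function names (zo_grad, zo_adaspider_sketch) and the parameters q, batch, eta0, and mu are illustrative placeholders, and the paper's actual estimator, batch schedule, and adaptive rule may differ.

```python
import numpy as np

def zo_grad(f_i, x, idx, mu=1e-4, n_dirs=10, rng=None):
    """Two-point Gaussian-smoothing zeroth-order estimate of the
    gradient of the mini-batch average of f_i over indices idx."""
    rng = rng or np.random.default_rng()
    d = x.size
    g = np.zeros(d)
    for _ in range(n_dirs):
        u = rng.standard_normal(d)  # random search direction
        for i in idx:
            # finite-difference directional derivative along u
            g += (f_i(x + mu * u, i) - f_i(x - mu * u, i)) / (2 * mu) * u
    return g / (n_dirs * len(idx))

def zo_adaspider_sketch(f_i, x0, n, T=100, q=10, batch=10,
                        eta0=0.1, eps=1e-8):
    """Hypothetical ZO-AdaSPIDER-style loop: SPIDER variance reduction
    on zeroth-order estimates plus an AdaGrad-style adaptive step."""
    rng = np.random.default_rng(0)
    x_prev = x = x0.copy()
    v = np.zeros_like(x0)
    acc = 0.0  # accumulated squared estimator norms for the step size
    for t in range(T):
        if t % q == 0:
            # checkpoint: full-batch zeroth-order gradient estimate
            v = zo_grad(f_i, x, np.arange(n), rng=rng)
        else:
            idx = rng.choice(n, size=batch, replace=False)
            # SPIDER recursion: correct the running estimate with the
            # difference of estimates at the current and previous iterates
            v = zo_grad(f_i, x, idx, rng=rng) \
                - zo_grad(f_i, x_prev, idx, rng=rng) + v
        acc += np.dot(v, v)
        eta = eta0 / np.sqrt(acc + eps)  # AdaGrad-style adaptive step
        x_prev, x = x, x - eta * v
    return x
```

On a toy finite sum such as f_i(x, i) = 0.5 * np.sum((x - a[i])**2), the checkpoint period q and batch size trade off function-query cost against estimator variance; the values above are placeholders rather than the paper's tuned choices.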