Improving the Performance and Efficiency of Convolutional Neural Networks (CNNs) on Image Classification Tasks via Fixed-Point Quantization and Structured Pruning
DOI:
https://doi.org/10.54097/cmc21d96

Keywords:
Convolutional Neural Networks, Fixed-point Quantization, Structured Pruning, Model Compression, Image Classification

Abstract
This study explores the combined use of fixed-point quantization and structured pruning to improve the performance and efficiency of convolutional neural networks (CNNs) in image classification tasks. Both techniques reduce model size and computational complexity, making CNNs more suitable for deployment in resource-constrained environments such as mobile devices and embedded systems. Fixed-point quantization reduces the bit-width of weights and activations, thereby lowering the computational load and memory footprint, while structured pruning systematically removes unimportant convolutional filters or channels, further shrinking the model and speeding up inference. An experimental evaluation was performed on the ImageNet dataset using the ResNet-50 architecture. The results show that the combined quantization-and-pruning strategy reduces the model size by up to 75% and increases the inference speed by 50%, while maintaining a classification accuracy of 74.5%, compared to 76.4% for the baseline model. Given the substantial efficiency gains, this slight decrease in accuracy is acceptable. The integrated approach therefore compresses and accelerates CNN models without a significant drop in accuracy, making it well suited to real-time applications.
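The abstract describes the two techniques only at a conceptual level. The sketch below illustrates, under stated assumptions, what each one does in practice: per-tensor symmetric fixed-point ("fake") quantization of weights, and structured pruning that keeps the convolutional filters with the largest L1 norms. It uses PyTorch for illustration; the function names, the symmetric quantization scheme, the fixed keep ratio, and the L1-norm ranking criterion are assumptions of this sketch, not the method or code reported in the paper.

```python
import torch
import torch.nn as nn


def quantize_fixed_point(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Simulate symmetric fixed-point quantization of a tensor.

    Values are scaled onto a signed num_bits-wide integer grid, rounded,
    clipped, and de-quantized, mimicking reduced-precision storage/compute.
    (Illustrative assumption: per-tensor scale, symmetric range.)
    """
    qmax = 2 ** (num_bits - 1) - 1                  # e.g. 127 for 8-bit
    scale = x.abs().max() / qmax                    # per-tensor scale factor
    q = torch.clamp(torch.round(x / scale), -qmax, qmax)
    return q * scale                                # "fake-quantized" values


def prune_filters_by_l1(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Structured pruning of a Conv2d layer by filter L1 norm.

    Returns a new, physically smaller layer containing only the surviving
    output channels, so both parameters and inference cost shrink.
    (Illustrative assumption: importance = L1 norm of each filter.)
    """
    weight = conv.weight.data                       # shape: [out_ch, in_ch, kH, kW]
    l1 = weight.abs().sum(dim=(1, 2, 3))            # one importance score per filter
    n_keep = max(1, int(keep_ratio * weight.size(0)))
    keep_idx = torch.argsort(l1, descending=True)[:n_keep]

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = weight[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    return pruned


# Example usage on a single convolutional layer:
conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
conv.weight.data = quantize_fixed_point(conv.weight.data, num_bits=8)
smaller_conv = prune_filters_by_l1(conv, keep_ratio=0.5)
```

In a full pipeline such as the one evaluated in the paper, pruning a layer's output channels also requires adjusting the input channels of the following layer and fine-tuning the network to recover accuracy; the sketch omits those steps for brevity.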
License
Copyright (c) 2024 Frontiers in Computing and Intelligent Systems
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.