Performance Analysis of Parallel Computing in Image Classification
DOI: https://doi.org/10.54097/hset.v41i.6739
Keywords: parallel computing; data parallel; deep learning; distributed data parallel; image classification
Abstract
Parallel computing can speed up the training of deep learning models. This paper tests the effectiveness of data parallelism in image classification using the classic neural network model ResNet, and reports test data from a 6-GPU environment. It suggests that several factors should be considered when building affordable hardware configurations to speed up model training in practical application scenarios. First, communication costs are not negligible, because today's large computing clusters are primarily offered via cloud computing. Second, the number of CUDA cores is the primary hardware foundation of GPU acceleration, so for some graphics cards, larger video memory paired with fewer CUDA cores may not improve acceleration. Third, model performance under parallel computing is an issue that must not be disregarded. Because of the limitations of the parallel strategy, the hyperparameters must be tuned while model training is being accelerated. A lower learning rate and a smaller batch size are more likely to preserve the model's performance. The experimental conclusions of this paper can support the design of an appropriate hardware configuration scheme.
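The data-parallel idea the abstract refers to can be sketched in a few lines. This is a minimal illustration in plain Python, not the PyTorch setup tested in the paper: the toy 1-D linear model, the function names, and the sample data are all assumptions made for the example, and the choice of 6 workers only echoes the 6-GPU environment by analogy. Each worker computes a gradient on its own equal-sized shard of the batch, and the gradients are then averaged, which is the step that incurs communication cost in a real cluster.

```python
def mse_gradient(w, batch):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_gradient(w, batch, num_workers):
    """Split the batch into equal shards, compute one gradient per worker,
    then average the results (the role an all-reduce plays in practice)."""
    if len(batch) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    shard_size = len(batch) // num_workers
    shards = [
        batch[i * shard_size:(i + 1) * shard_size] for i in range(num_workers)
    ]
    local_grads = [mse_gradient(w, shard) for shard in shards]
    return sum(local_grads) / num_workers


# Hypothetical toy data: 12 samples generated with a true weight of 3.
batch = [(float(x), 3.0 * x) for x in range(1, 13)]
w = 0.5

full_grad = mse_gradient(w, batch)
parallel_grad = data_parallel_gradient(w, batch, num_workers=6)
```

With equal shard sizes the averaged gradient matches the full-batch gradient, so adding workers while keeping the per-worker batch fixed enlarges the effective batch per update. That is one reason the learning rate and batch size generally need to be revisited when training is parallelized.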

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







