A Review of Key Technologies for Deep Learning-Based Autonomous Driving

Mengfei Li

doi:10.54097/c1fad034

Keywords:

Deep Learning, Autonomous Driving, Environmental Perception, Trajectory Prediction, End-to-end Learning

Abstract

With the rapid development of artificial intelligence, computer vision, and intelligent transportation technologies, autonomous driving has become an important research direction in both academia and industry. Deep learning, with its strong feature extraction, pattern recognition, and temporal modeling capabilities, has shown significant advantages in environmental perception, trajectory prediction, decision planning, and vehicle control for autonomous driving. Convolutional neural networks play a crucial role in object detection and semantic segmentation, improving how well autonomous driving systems understand complex road scenes. Recurrent architectures such as Long Short-Term Memory (LSTM) networks are well suited to temporal modeling and trajectory prediction. Transformer models, through their self-attention mechanism, excel at long-range dependency modeling, multimodal fusion, and high-level decision planning. End-to-end methods integrate perception, decision-making, and control into a unified learning framework, offering a new way to optimize overall system performance. This paper reviews the key applications of deep learning in autonomous driving, systematically analyzes research progress in environmental perception, path decision-making, vehicle control, and end-to-end driving, and summarizes the main challenges and future development trends, with the aim of providing a reference for related research.
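As an illustration of the self-attention mechanism mentioned in the abstract, the following is a minimal sketch of scaled dot-product self-attention in plain Python. It is not taken from the paper: the function names, the toy `track` data, and the choice of identity projections (queries, keys, and values all equal to the input) are assumptions made to keep the example short. Practical Transformer models learn separate projection matrices and use multiple attention heads.

```python
import math


def softmax(row):
    # Subtract the row maximum before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    x is a list of T feature vectors of equal length d, for example
    per-timestep embeddings of an observed trajectory. Using Q = K = V = x
    is a simplifying assumption for illustration only.
    Returns (outputs, weights), where weights[i][t] is how strongly
    position i attends to position t.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in x:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w[t] * x[t][j] for t in range(len(x))) for j in range(d)]
        )
    return outputs, weights


# Hypothetical 2-D embeddings for four timesteps (illustrative values only).
track = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
attended, attn = self_attention(track)
```

Each row of `attn` is a probability distribution over all positions in the sequence, so any timestep can draw directly on any other regardless of how far apart they are. This is the property behind the long-range dependency modeling described above.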
