The Study on Recognition and Detection of Express Package Grabbing Based on Machine Vision


  • Weixiang Qi
  • Xiangguo Sun



ResNet, DenseNet, Deep learning, Convolutional neural networks, Courier parcels, Visual recognition of gripping surfaces.


This research addresses automated parcel supply in logistics by using a robotic arm in place of manual handling. The arm picks up disorderly stacked parcels by suction and places them on a conveyor belt, an engineering problem of considerable economic significance. This paper designs a visual recognition system that helps the robotic arm identify the grasping surface of disorderly stacked parcels. Using transfer learning, a multimodal network framework is built from pre-trained models. Pre-trained DenseNet169, DenseNet121, ResNet101, and ResNet50 models serve as backbone networks and are trained and tested on a custom parcel dataset. After comparative testing of model performance, DenseNet169 is chosen to build the visual recognition network. Specifically, the RGB and depth data of the parcels are fed separately into DenseNet169 for feature extraction. The extracted features are then combined by multimodal fusion, with the lightweight attention mechanism CBAM integrated to improve semantic segmentation accuracy. Post-processing of the image features then filters out the background, giving precise identification of the parcel grasping region. The resulting network achieves an average top-1 true-class accuracy of 95.86% on the test set. The parcel visual recognition system designed with this method is therefore robust and meets the demand for autonomous parcel retrieval by the robotic arm, offering significant support for automated parcel supply in logistics.
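The accuracy figure reported above can be made concrete with a minimal sketch of how a top-1 metric is typically computed. The abstract does not define "average TOP1 true class accuracy", so two plausible readings are shown: plain top-1 accuracy over all samples, and top-1 accuracy computed per class and then averaged. The function names, the two-class setup, and the toy scores are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def argmax(row):
    """Index of the highest score in one sample's list of class scores."""
    return max(range(len(row)), key=row.__getitem__)


def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    correct = sum(int(argmax(row) == label) for row, label in zip(scores, labels))
    return correct / len(labels)


def mean_per_class_top1(scores, labels):
    """Top-1 accuracy computed separately for each true class, then averaged.

    This is one assumed reading of 'average true class accuracy'.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row, label in zip(scores, labels):
        totals[label] += 1
        hits[label] += int(argmax(row) == label)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy data (hypothetical): class 0 = graspable surface, class 1 = background.
toy_scores = [
    [0.9, 0.1],  # predicted 0, true 0
    [0.2, 0.8],  # predicted 1, true 1
    [0.6, 0.4],  # predicted 0, true 1 (error)
    [0.7, 0.3],  # predicted 0, true 0
    [0.8, 0.2],  # predicted 0, true 0
]
toy_labels = [0, 1, 1, 0, 0]

overall = top1_accuracy(toy_scores, toy_labels)
per_class = mean_per_class_top1(toy_scores, toy_labels)
print(f"top-1 accuracy: {overall:.2f}")
print(f"mean per-class top-1 accuracy: {per_class:.2f}")
```

The two readings differ whenever classes are imbalanced, as in the toy data, so which one the 95.86% figure refers to would need to be confirmed against the full paper.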











How to Cite

The Study on Recognition and Detection of Express Package Grabbing Based on Machine Vision. (2024). Academic Journal of Science and Technology, 9(1), 198-203.
