DETR 3D Object Detection Method Based on Fusion of Depth and Salient Information
DOI: 10.23977/jeis.2023.080102 | Downloads: 21 | Views: 606
Author(s)
Yonggui Wang 1, Jian Li 1, Zaicheng Zhang 1, Bin He 2
Affiliation(s)
1 School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, Shaanxi, 710016, China
2 School of Electronics and Information Engineering, Tongji University, Shanghai, 200000, China
Corresponding Author
Yonggui Wang
ABSTRACT
Most existing monocular 3D object detection algorithms combine geometric relationships with convolutional neural networks to predict the 3D attributes of an object, but they lack depth feature information and the global relationships among features. To address these problems, a DETR-based monocular 3D object detection algorithm that fuses depth and saliency information is proposed. A lightweight unsupervised depth module is constructed to extract object depth features, and a Transformer model is introduced to capture the global relationships among features. In addition, because the Transformer model is computationally expensive, a saliency network is designed to reduce the computational load of the Transformer encoder. Experimental results on the official KITTI dataset show that the proposed algorithm achieves the best detection accuracy on multiple metrics compared with other current advanced detection algorithms, and ablation experiments demonstrate the effectiveness of each module.
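The abstract does not say how the saliency network reduces the encoder's workload. One common approach, assumed here purely for illustration and not confirmed as the authors' design, is to score each feature token and pass only the highest-scoring ones to the encoder. The sketch below shows that idea in plain Python; the function names, the `keep_ratio` parameter, and the scores are all hypothetical.

```python
def select_salient_tokens(saliency, keep_ratio):
    """Return indices of the highest-scoring tokens, in original order.

    Illustrative assumption: the saliency network outputs one score per
    flattened feature-map token, and only the top fraction is kept.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    k = max(1, int(len(saliency) * keep_ratio))
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    return sorted(ranked[:k])


def self_attention_pairs(num_tokens):
    """Number of query-key pairs scored by dense self-attention."""
    return num_tokens * num_tokens


# Hypothetical saliency scores for a flattened 2x4 feature map.
scores = [0.05, 0.90, 0.10, 0.70, 0.20, 0.85, 0.02, 0.60]
kept = select_salient_tokens(scores, keep_ratio=0.5)
print(kept)                                    # [1, 3, 5, 7]
print(self_attention_pairs(len(scores)),       # 64
      self_attention_pairs(len(kept)))         # 16
```

Because dense self-attention scales quadratically with the number of tokens, keeping half of the tokens in this toy case cuts the number of attention pairs to a quarter.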
KEYWORDS
Monocular 3D Object Detection, Depth Module, Global Relationship of Features, Transformer, Saliency Network
CITE THIS PAPER
Yonggui Wang, Jian Li, Zaicheng Zhang, Bin He, DETR 3D Object Detection Method Based on Fusion of Depth and Salient Information. Journal of Electronics and Information Science (2023) Vol. 8: 9-19. DOI: http://dx.doi.org/10.23977/jeis.2023.080102.