Open Access

YOLOv8 with Multi Strategy Integrated Optimization and Application in Object Detection

DOI: 10.23977/autml.2024.050104

Author(s)

Jiafeng Li 1, Chenxi Yan 2

Affiliation(s)

1 College of Software Engineering, Sichuan University, Chengdu, 610207, China
2 School of Computer Science, Northeast Electric Power University, Jilin, Jilin, 132011, China

Corresponding Author

Jiafeng Li

ABSTRACT

The YOLOv8s model is widely used for object detection across many fields, but it does not aggregate feature information well for specific tasks, has a large number of parameters, and offers limited accuracy. To address these problems, a new lightweight, performance-balanced YOLOv8s network structure is proposed. The C2f modules in the backbone network are replaced with ShuffleNet-v2. To offset the accuracy degradation caused by the reduced parameter count, a global multi-head self-attention (MHSA) mechanism is added to capture global information, learn the correlations between features at different scales, and fuse them. SGD is used as the optimizer to further improve accuracy. Experimental results show that introducing ShuffleNet-v2 and MHSA effectively reduces the number of model parameters and significantly shortens training time while retaining considerable accuracy. Compared with other optimizers, SGD yields the largest performance improvement, and the resulting model strikes an excellent balance between lightweight design and detection performance.
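
For readers unfamiliar with ShuffleNet-v2, the operation that lets it stay lightweight is the channel shuffle, which interleaves channels from different groups so that information can mix across branches at no parameter cost. The following is a minimal sketch of that operation only, written with NumPy rather than a deep learning framework; the function name `channel_shuffle` and the toy tensor are illustrative assumptions and are not taken from the paper's implementation.

```python
import numpy as np


def channel_shuffle(x, groups):
    """Interleave the channels of an (N, C, H, W) array across `groups` groups.

    Illustrative helper (hypothetical name), not the authors' code.
    """
    n, c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    # Split the channel axis into (groups, channels_per_group).
    x = x.reshape(n, groups, c // groups, h, w)
    # Swap the two channel sub-axes so channels from different groups alternate.
    x = x.transpose(0, 2, 1, 3, 4)
    # Flatten back to a single channel axis.
    return x.reshape(n, c, h, w)


# Toy input: 6 channels, each filled with its own index, spatial size 1x1.
x = np.arange(6, dtype=np.float32).reshape(1, 6, 1, 1)
shuffled = channel_shuffle(x, groups=2)
channel_order = shuffled[0, :, 0, 0].astype(int).tolist()
print(channel_order)  # [0, 3, 1, 4, 2, 5]
```

With two groups, channels 0-2 and 3-5 end up alternating, so a following grouped or split operation sees features from both halves. The operation has no learnable weights, which is why it does not add to the parameter count.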

KEYWORDS

YOLOv8s; ShuffleNet-v2; MHSA; lightweighting

CITE THIS PAPER

Jiafeng Li, Chenxi Yan, YOLOv8 with Multi Strategy Integrated Optimization and Application in Object Detection. Automation and Machine Learning (2024) Vol. 5: 23-31. DOI: http://dx.doi.org/10.23977/autml.2024.050104.

All published work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2016 - 2031 Clausius Scientific Press Inc. All Rights Reserved.