Improved Parts Drawing Segmentation Method Based on U-net

DOI: 10.23977/jipta.2022.050109

Author(s)

Dan Tian 1, Hui Yao 1, Zhiwei Song 1, Gaohui Zhan 1, Zhijie Wang 1

Affiliation(s)

1 School of Mechatronic Engineering, Xi'an Technological University, Xi'an, Shaanxi, 710021, China

Corresponding Author

Dan Tian

ABSTRACT

To address the low accuracy of closed-contour shape recognition in part drawings, this paper proposes a U-net segmentation network with an attention mechanism and a pyramid structure. A spatial pyramid structure is added before the UpSampling operation in the decoder module of the classical U-net to enlarge the receptive field and reduce the loss of feature detail. In addition, spatial and channel attention mechanisms are inserted between the UpSampling and convolution operations of the decoder to extract more salient semantic information. Compared with the classical U-net, the proposed method improves the mean intersection over union (MIoU) by 7.05%, the category mean pixel accuracy (mPA) by 14.63%, and the accuracy (Accu) by 16.49%. These experimental results indicate that the method effectively improves the segmentation accuracy of the model.
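The three metrics reported in the abstract can all be derived from a per-class confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' evaluation code: the function names, the tiny two-class label maps, and the reading of "Accu" as overall pixel accuracy are assumptions made for illustration.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels per (true class, predicted class) pair."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (true, pred) pair as a single index, then histogram.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_metrics(cm):
    """Return MIoU, mPA and overall accuracy from a confusion matrix."""
    cm = cm.astype(float)
    tp = np.diag(cm)             # correctly labelled pixels per class
    gt = cm.sum(axis=1)          # ground-truth pixels per class
    pred = cm.sum(axis=0)        # predicted pixels per class
    union = gt + pred - tp
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = tp / union         # per-class intersection over union
        pa = tp / gt             # per-class pixel accuracy
    return {
        "MIoU": float(np.nanmean(iou)),    # classes absent from both maps are skipped
        "mPA": float(np.nanmean(pa)),
        "Accu": float(tp.sum() / cm.sum()),  # assumed: overall pixel accuracy
    }


# Hypothetical 2x4 label maps: 0 = background, 1 = closed contour.
y_true = [[0, 0, 1, 1],
          [0, 1, 1, 1]]
y_pred = [[0, 1, 1, 1],
          [0, 0, 1, 1]]

metrics = segmentation_metrics(confusion_matrix(y_true, y_pred, num_classes=2))
print(metrics)
```

Because MIoU and mPA average over classes while overall accuracy averages over pixels, the three numbers can move by different amounts when a model improves mainly on a minority class such as thin contour lines.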

KEYWORDS

Part graph closed contour, Attention mechanism, U-net, Pyramid structure

CITE THIS PAPER

Dan Tian, Hui Yao, Zhiwei Song, Gaohui Zhan, Zhijie Wang, Improved Parts Drawing Segmentation Method Based on U-net. Journal of Image Processing Theory and Applications (2022) Vol. 5: 52-58. DOI: http://dx.doi.org/10.23977/jipta.2022.050109.


All published work is licensed under a Creative Commons Attribution 4.0 International License.