Image Semantic Segmentation Model based on AsppUNet
DOI: 10.23977/jipta.2025.080114
Author(s)
Qian Guo 1, Yanlong Xu 1, Limin Sun 1
Affiliation(s)
1 School of Information and Intelligent Engineering, University of Sanya, Jiyang, Sanya, China
Corresponding Author
Qian Guo
ABSTRACT
In this paper, we propose AsppUNet, an image semantic segmentation model based on the Atrous Spatial Pyramid Pooling (ASPP) module, to address the tendency of smaller objects to be overlooked during segmentation. Instead of the standard pooling layers in the UNet encoder, the model uses a series of atrous convolution layers with progressively increasing dilation rates, which reduces the feature loss caused by traditional pooling operations. The ASPP module is constructed by cascading atrous convolution layers with different dilation rates and is integrated into the UNet decoder to aggregate multi-scale feature maps and capture multi-level contextual information. Experimental results show that AsppUNet improves the mIoU for objects at different scales on the CamVid dataset and raises overall segmentation accuracy.
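To make the role of the dilation rate concrete, the following is a minimal pure-Python sketch of a 1-D atrous convolution. It is an illustration only, not the authors' implementation: the function names, the 3-tap kernel, and the dilation rates (1, 2, 4) are assumptions chosen for the example, and the paper's actual layers are 2-D with their own rates. The sketch shows that a dilated kernel covers a wider span without shrinking the output, and that stacking layers with increasing rates enlarges the receptive field.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def atrous_conv1d(signal, kernel, dilation=1):
    """1-D atrous (dilated) convolution with zero padding and stride 1.

    Computed as cross-correlation, as is usual in deep learning.
    The output has the same length as the input, so no resolution is
    lost, unlike a stride-2 pooling layer. Assumes an odd-length kernel.
    """
    pad = (effective_kernel_size(len(kernel), dilation) - 1) // 2
    padded = [0.0] * pad + [float(x) for x in signal] + [0.0] * pad
    return [
        sum(w * padded[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def stacked_receptive_field(kernel_size, dilations):
    """Receptive field of stride-1 dilated layers applied one after another."""
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Hypothetical example values, not taken from the paper.
impulse = [0, 0, 0, 1, 0, 0, 0]
response = atrous_conv1d(impulse, [1, 1, 1], dilation=2)
field = stacked_receptive_field(3, (1, 2, 4))
```

Here `response` has the same length as `impulse`, and the three kernel taps land two samples apart, which is how a dilated layer widens its context without pooling.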
KEYWORDS
ASPP, UNet, Atrous Convolution, Pyramid Module, CamVid Dataset
CITE THIS PAPER
Qian Guo, Yanlong Xu, Limin Sun, Image Semantic Segmentation Model based on AsppUNet. Journal of Image Processing Theory and Applications (2025) Vol. 8: 112-121. DOI: http://dx.doi.org/10.23977/jipta.2025.080114.