Research on Graph-based Text Summarization Extraction Algorithm
DOI: 10.23977/acss.2024.080603
Author(s)
Junhong Chen 1,2, Kaihui Peng 3
Affiliation(s)
1 School of Software Engineering, South China University of Technology, Guangzhou, China
2 LeiHuo Studio, NetEase, Hangzhou, China
3 Faculty of Business and Economics, University of Malaya, Kuala Lumpur, Malaysia
Corresponding Author
Junhong Chen
ABSTRACT
This paper proposes a graph-based algorithm for extractive text summarization. The algorithm builds a directed graph over the sentences of a document, which allows sentence position information to enter the computation. Edge weights in the directed graph are computed with a pre-trained model refined with negative sampling, which both extracts deeper semantic features and strengthens the relevance between contextually related sentences in the article. The algorithm also introduces a weighting mechanism that adjusts each sentence's extraction priority according to the article's theme, so that the extracted summary sentences represent the key information of the text as fully as possible. In this way the algorithm captures the key information in the text, reduces the influence of irrelevant information on the semantics, and compresses the text.
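The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so everything below is an assumption. A bag-of-words vector stands in for the pre-trained sentence encoder, direction-dependent scaling of edges (the hypothetical `forward` and `backward` parameters) stands in for the use of position information, and a multiplicative boost by similarity to a theme string (the hypothetical `theme_weight`) stands in for the theme weighting mechanism. Scores are a simple weighted degree centrality rather than an iterative ranking.

```python
import math
from collections import Counter


def embed(sentence):
    """Assumed stand-in for a pre-trained sentence encoder: word counts."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, theme, forward=1.0, backward=0.5, theme_weight=0.5):
    """Score each sentence on a directed similarity graph.

    Edges to later sentences are scaled by `forward`, edges to earlier
    sentences by `backward`, so position influences the score. The sum is
    then boosted by similarity to the theme. All three parameters are
    hypothetical and chosen only for illustration.
    """
    vecs = [embed(s) for s in sentences]
    theme_vec = embed(theme)
    scores = []
    for i, vi in enumerate(vecs):
        centrality = 0.0
        for j, vj in enumerate(vecs):
            if i == j:
                continue
            direction = forward if j > i else backward
            centrality += direction * cosine(vi, vj)
        scores.append(centrality * (1.0 + theme_weight * cosine(vi, theme_vec)))
    return scores


def summarize(sentences, theme, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = rank_sentences(sentences, theme)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = [
        "graph ranking selects summary sentences",
        "the graph uses sentence position",
        "lunch was pleasant today",
    ]
    print(summarize(doc, theme="graph summary", k=2))
```

In a closer approximation of the described method, `embed` would be replaced by sentence vectors from a pre-trained language model, and the theme string would come from the article's title or extracted keywords; both substitutions are left open here.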
KEYWORDS
Text Summarization; Keyword Extraction; Pre-trained Model
CITE THIS PAPER
Junhong Chen, Kaihui Peng, Research on Graph-based Text Summarization Extraction Algorithm. Advances in Computer, Signals and Systems (2024) Vol. 8: 13-22. DOI: http://dx.doi.org/10.23977/acss.2024.080603.