Automatic Mining Method for Heterogeneity Features of Prose-Chinese Translation Corpus Based on Artificial Intelligence

DOI: 10.23977/infkm.2023.040105


Yanhua Ma 1


1 Zhejiang Yuexiu University, Shaoxing, Zhejiang, 312000, China



As cultures diversify and languages spread, prose, an important form of literary material, has attracted growing scholarly attention. At the same time, with the ongoing integration of science, technology, and culture, the corpus, as a large-scale electronic text library, has become highly significant for the study of language theory. However, a study of the heterogeneity features of the prose Chinese translation corpus found that current methods for automatically mining these features still have some problems. To address them, this paper proposes a new method based on artificial intelligence (AI) for automatically mining the heterogeneity features of the prose Chinese translation corpus, and verifies its effectiveness through an empirical study. The results show that the method increases the weight coefficients of the heterogeneity features of dataset 1 through dataset 6 in the corpus by 57, 34, 28, 36, 16, and 13, respectively. It also effectively reduces the offset of dataset nodes and increases the amount of data mined through data node access, thereby improving the effectiveness and practicability of automatic mining. In addition, research on the prose translation corpus can enrich the content and broaden the scope of corpus research, promoting its further development.


Chinese Prose Translation Corpus, Artificial Intelligence, Heterogeneity Features, Automatic Mining


Yanhua Ma, Automatic Mining Method for Heterogeneity Features of Prose-Chinese Translation Corpus Based on Artificial Intelligence. Information and Knowledge Management (2023) Vol. 4: 32-44. DOI:


