Entity Disambiguation Algorithm for Literature in Biomedical Field
DOI: 10.23977/icamcs.2017.1033
Author(s)
Wang Jing, Yan Jianzhuo, Lv Ruying
Corresponding Author
Wang Jing
ABSTRACT
To support knowledge learning and application in the biomedical domain, this paper proposes an entity disambiguation algorithm that resolves entity ambiguity. Entity disambiguation is usually divided into two stages: candidate generation and entity disambiguation. In the candidate generation stage, candidates for each name mention are generated from a knowledge base and then filtered by rules, which preserves the recall of the candidate entity set while effectively reducing the computational complexity and noise of the disambiguation stage. In the entity disambiguation stage, we propose a method based on a probability model: a language model estimates the probability that each candidate entity is the target entity, and the candidate with the highest probability is selected as the target. Experimental results show that the proposed method achieves an accuracy of 83%, higher than that of the other algorithms compared, making it the best-performing of those methods in the biomedical field.
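The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the filtering rules or the form of the language model, so everything concrete here is an assumption: the toy `KNOWLEDGE_BASE`, the rule that drops entities without a description, and the unigram language model with additive smoothing are hypothetical stand-ins, not the authors' implementation.

```python
import math
from collections import Counter

# Hypothetical toy knowledge base: mention -> {entity id: description text}.
# A real system would query a biomedical knowledge base instead.
KNOWLEDGE_BASE = {
    "cold": {
        "common_cold": "viral infection of the nose and throat causing cough and sore throat",
        "cold_temperature": "low temperature exposure causing hypothermia and frostbite",
    },
}


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def generate_candidates(mention):
    """Stage 1: knowledge-base lookup followed by a rule-based filter.

    The rule used here (drop entities with an empty description) is an
    assumption; the abstract does not state the actual rules.
    """
    entries = KNOWLEDGE_BASE.get(mention.lower(), {})
    return {eid: tokenize(desc) for eid, desc in entries.items() if desc.strip()}


def log_probability(context_tokens, entity_tokens, vocab_size, alpha=1.0):
    """Log-probability of the context under a smoothed unigram model
    estimated from the entity description (assumed model form)."""
    counts = Counter(entity_tokens)
    denom = len(entity_tokens) + alpha * vocab_size
    return sum(math.log((counts[t] + alpha) / denom) for t in context_tokens)


def disambiguate(mention, context):
    """Stage 2: score every candidate and return the most probable one."""
    context_tokens = tokenize(context)
    candidates = generate_candidates(mention)
    if not candidates:
        return None  # mention not covered by the knowledge base
    # Shared vocabulary so that smoothing is comparable across candidates.
    vocab = set(context_tokens)
    for tokens in candidates.values():
        vocab.update(tokens)
    scores = {
        eid: log_probability(context_tokens, tokens, len(vocab))
        for eid, tokens in candidates.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(disambiguate("cold", "patient has a cough and sore throat"))
    print(disambiguate("cold", "exposure to low temperature"))
```

In this sketch the mention "cold" resolves to `common_cold` when the context mentions a cough and sore throat, and to `cold_temperature` when the context mentions exposure to low temperature. Filtering before scoring keeps the candidate set small, which is the mechanism the abstract credits with reducing computation and noise in the disambiguation stage.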
KEYWORDS
Domain literature, entity disambiguation, contextual characteristics, probability model, language model