Open Access
Detecting Malicious Domain Name Based on the Web Page Structure Similarity

DOI: 10.23977/CNCI2020066

Author(s)

Xiaoyan Liu, Yue Shi, Yanan Cheng, Haiyan Xu, Zhaoxin Zhang

Corresponding Author

Xiaoyan Liu

ABSTRACT

To detect malicious domain names accurately, this paper proposes a detection method based on the structural similarity of web pages, as represented by the Document Object Model (DOM). The key problem is how to compute the hierarchical similarity between web page structures quickly and effectively. The method first fetches the source code of the page served at a domain name and parses it into a DOM tree. It then builds, for each level of the tree, a sequence of tag and attribute names that characterizes the structure of the domain's web page. Finally, it defines a DOM tree distance, based on the idea of the Simhash algorithm, to measure the similarity between web page structures. Experiments show that the method detects similar domain page structures effectively, with high accuracy and recall.
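
The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact token format, hash function, or distance formula, so everything below is an assumption: the names `LevelCollector`, `simhash`, `page_fingerprint`, and `dom_distance` are hypothetical, the token layout `depth:tag[attribute names]` is one possible encoding of the level-wise tag and attribute name sequence, and the distance is taken to be the normalized Hamming distance between two 64-bit Simhash fingerprints. Depth tracking is simplified and may drift on pages that omit optional closing tags.

```python
import hashlib
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not deepen the tree.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class LevelCollector(HTMLParser):
    """Collects, per DOM depth, tokens of a tag name plus its attribute names."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.levels = {}  # depth -> list of tokens in document order

    def handle_starttag(self, tag, attrs):
        # Only attribute names are kept; attribute values and text are ignored.
        names = ",".join(sorted(name for name, _ in attrs))
        self.levels.setdefault(self.depth, []).append(f"{tag}[{names}]")
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1


def simhash(tokens, bits=64):
    """Standard Simhash: each token votes +1 or -1 on every fingerprint bit."""
    weights = [0] * bits
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)


def page_fingerprint(html, bits=64):
    """Fingerprint of a page's level-wise tag/attribute-name sequence."""
    parser = LevelCollector()
    parser.feed(html)
    parser.close()
    # Assumed encoding: prefix each token with its depth to keep the hierarchy.
    tokens = [f"{d}:{t}" for d in sorted(parser.levels) for t in parser.levels[d]]
    return simhash(tokens, bits)


def dom_distance(html_a, html_b, bits=64):
    """Assumed distance: normalized Hamming distance between fingerprints."""
    diff = page_fingerprint(html_a, bits) ^ page_fingerprint(html_b, bits)
    return bin(diff).count("1") / bits
```

Under these assumptions, two pages that share a template but differ only in text and attribute values have distance 0, because only tag and attribute names enter the fingerprint. A real detector would compare the distance against a threshold chosen on labeled data, which the abstract does not specify.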

KEYWORDS

Document object model; web page structure; Simhash algorithm; hierarchical similarity

All published work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2016 - 2031 Clausius Scientific Press Inc. All Rights Reserved.