Research and Implementation of Web Crawler Technology on Cloud Platform
DOI: 10.23977/icmmct.2019.62080
Corresponding Author
Zijing Shi
ABSTRACT
As the Internet has developed, the network has gradually replaced traditional channels as our main means of accessing information, while the volume of information online continues to grow rapidly. Given data at this scale, how to collect the required data quickly and accurately is a current research focus. Many companies have adopted distributed web crawling, in which multiple machines crawl data from the Internet in parallel, to improve crawling efficiency. This paper designs and implements a distributed web crawler system built on a cloud platform, using the platform's features to improve the crawler's performance and scalability.
KEYWORDS
Web Crawler Technology, Cloud Platform, Implementation Way