Open Access
Research and Implementation of Web Crawler Technology on Cloud Platform

DOI: 10.23977/icmmct.2019.62080

Author(s)

Zijing Shi

Corresponding Author

Zijing Shi

ABSTRACT

With the continuous development of the Internet, the network has gradually become our main channel for accessing information, while the amount of information online keeps growing at an alarming rate. Faced with such massive data, how to collect the required data quickly and accurately has become a research hotspot. Many companies have adopted distributed web crawling, in which multiple machines crawl data from the Internet in parallel, to improve crawling efficiency. This paper designs and implements a distributed web crawler system built on a cloud platform, which uses features of the cloud platform to improve the crawler's performance and scalability.
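
As an illustration of the parallel crawling idea mentioned in the abstract, the following is a minimal sketch of one common way to split a URL list across several crawler machines: hash each URL's host name and take the result modulo the number of workers. The abstract does not describe the paper's actual partitioning scheme, so the function names `assign_worker` and `partition`, the use of MD5, and the host-based split are all assumptions made for this example.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def assign_worker(url: str, num_workers: int) -> int:
    """Map a URL to a worker index by hashing its host name.

    Hypothetical helper: hashing the host (not the full URL) keeps all
    pages of one site on the same worker, which makes per-site rate
    limiting easier. MD5 is used only as a stable, non-secret hash.
    """
    host = urlsplit(url).hostname or ""
    digest = hashlib.md5(host.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_workers


def partition(urls, num_workers):
    """Group de-duplicated URLs into one work queue per worker index."""
    queues = defaultdict(list)
    seen = set()
    for url in urls:
        if url in seen:  # skip URLs that were already scheduled
            continue
        seen.add(url)
        queues[assign_worker(url, num_workers)].append(url)
    return dict(queues)


if __name__ == "__main__":
    seeds = [
        "http://example.com/a",
        "http://example.com/b",
        "http://example.org/",
        "http://example.com/a",  # duplicate, dropped by partition()
    ]
    for worker, queue in sorted(partition(seeds, 4).items()):
        print(worker, queue)
```

In this sketch each worker would then fetch only the URLs in its own queue. A real deployment would also need a shared record of visited URLs and a way to rebalance queues when machines are added or removed, which is where the elasticity of a cloud platform would come in.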

KEYWORDS

Web Crawler Technology, Cloud Platform, Implementation Way

All published work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2016 - 2031 Clausius Scientific Press Inc. All Rights Reserved.