Education, Science, Technology, Innovation and Life
Open Access

Research on Optimization of Association Rules Algorithm Based on Spark

DOI: 10.23977/meet.2019.93771

Author(s)

Chengang Li, Yu Liu, Zeng Li

Corresponding Author

Yu Liu

ABSTRACT

To address the bottlenecks of the traditional association rule algorithm (Apriori), namely processing speed and computing resources, as well as the disk-access overhead of the MapReduce computing framework on the Hadoop platform, the traditional association rule algorithm is ported to the memory-based Spark computing framework, and an optimized algorithm under Spark is presented. Compared with the Apriori algorithm under MapReduce, the proposed algorithm greatly improves the efficiency of mining association rules from large data sets and reduces I/O overhead when processing large volumes of data. On a cluster, both its scalability and its speedup ratio are better than those of the traditional Apriori algorithm.
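
For readers unfamiliar with Apriori, the level-wise search and the pruning step named in the keywords can be sketched in plain Python. This is a minimal single-machine illustration, not the authors' optimized Spark algorithm, which the abstract does not specify; the function name `apriori`, the parameter `min_support` (an absolute count), and the sample transactions are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset seen in >= min_support transactions.

    Hypothetical helper for illustration; min_support is an absolute count.
    """
    # Transactions stay in memory across passes (the role a cached dataset
    # would play in an in-memory framework).
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items and keep the frequent ones.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join: union pairs of frequent (k-1)-itemsets that form a k-itemset.
        candidates = {
            a | b for a, b in combinations(list(frequent), 2) if len(a | b) == k
        }
        # Prune: a candidate survives only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        # Count: emit each candidate contained in a transaction, then sum.
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    baskets = [
        ["bread", "milk"],
        ["bread", "diapers", "beer", "eggs"],
        ["milk", "diapers", "beer", "cola"],
        ["bread", "milk", "diapers", "beer"],
        ["bread", "milk", "diapers", "cola"],
    ]
    for itemset, count in sorted(apriori(baskets, 3).items(), key=lambda kv: len(kv[0])):
        print(sorted(itemset), count)
```

In a Spark setting, the count step would plausibly become a `flatMap` over a cached RDD of transactions followed by `reduceByKey`, so the data is read from disk once rather than once per pass as under MapReduce; that mapping is an assumption here, not a description of the paper's implementation.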

KEYWORDS

Association Rules, Apriori, Spark, Pruning

All published work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2016 - 2031 Clausius Scientific Press Inc. All Rights Reserved.