Research on Optimization of Association Rules Algorithm Based on Spark
Chengang Li, Yu Liu, Zeng Li
The traditional association rule algorithm (Apriori) is bottlenecked by processing speed and computing resources, and its implementation in the MapReduce computing framework on the Hadoop platform suffers from disk-access overhead. To address these problems, the traditional association rule algorithm is migrated to the memory-based Spark computing framework, and an optimized algorithm under the Spark framework is presented. Compared with the Apriori algorithm under MapReduce, the proposed algorithm greatly improves the efficiency of mining association rules from large data sets and reduces I/O overhead when handling large volumes of data. On a cluster, both its scalability and its speedup ratio are better than those of the traditional Apriori algorithm.
Association Rules, Apriori, Spark, Pruning
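
For orientation, the classical Apriori loop that the abstract refers to (candidate generation, pruning, support counting) can be sketched in a few lines of plain Python. This is only an illustrative sketch of the textbook algorithm, not the optimized Spark algorithm proposed in the paper, whose details are not given here. The function name `frequent_itemsets`, the parameter `min_support` (an absolute count), and the sample data are assumptions made for this example. The comments note where the Spark RDD operations `flatMap` and `reduceByKey` would plausibly take over the counting step on a cluster.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {frozenset: count} for every itemset seen in >= min_support baskets.

    Hypothetical helper for illustration; min_support is an absolute count.
    """
    baskets = [frozenset(t) for t in transactions]

    # Pass 1: count single items (map each basket to its items, then sum).
    counts = Counter(frozenset([item]) for b in baskets for item in b)
    current = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Pruning: a candidate survives only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Support counting: on Spark this step would roughly correspond to a
        # flatMap over the baskets followed by reduceByKey.
        counts = Counter(c for b in baskets for c in candidates if c <= b)
        current = {s: n for s, n in counts.items() if n >= min_support}
        result.update(current)
        k += 1
    return result


if __name__ == "__main__":
    # Made-up sample baskets, for demonstration only.
    sample = [
        ["bread", "milk"],
        ["bread", "butter", "milk"],
        ["butter", "milk"],
        ["bread", "butter"],
    ]
    for itemset, count in frequent_itemsets(sample, min_support=2).items():
        print(sorted(itemset), count)
```

Each pass over `baskets` in this sketch corresponds to one full scan of the data set. Under MapReduce such a scan typically reads from disk on every iteration, whereas an in-memory framework can keep the baskets cached between passes, which is the I/O saving the abstract points to.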