Weighted Slope One Algorithm with Integrated User Trust Factor

To address the low accuracy of the Slope One personalized recommendation algorithm, which ignores user trust and item similarity, this work proposes a weighted Slope One algorithm that integrates a user trust factor. The study considers the proportion of users' common-score items to the number of items scored by the target user, develops a user trust factor model and the corresponding algorithms, uses the Pearson correlation coefficient to calculate user similarity, introduces the trust factor to modify user similarity and obtain the target user's top-K nearest neighbor set, and uses a modified weighted Slope One algorithm to predict ratings. Experiments are conducted on the MovieLens data set. The results show that the proposed method improves prediction accuracy and, in turn, recommendation accuracy.


Introduction
With the rapid development of e-commerce, the numbers of product categories and users in large-scale e-commerce systems have increased dramatically. The number of users is often much higher than that of products, but the products rated by a user generally do not exceed 1% of the total number of products [2], so the user-item rating matrix is extremely sparse. Data sparsity results in inefficient recommendation algorithms and inaccurate recommendations, which are typical problems currently faced by collaborative filtering (CF) algorithms [3]. In real life, two people who have similar interests may have varying trust in an item; consequently, their acceptance of a recommendation will differ. In other words, besides similarity, trust is an important factor influencing a person's decision-making process. The effective integration of user trust relationships into personalized recommendations is therefore essential to improving the quality of recommendations. It has become a hot topic in the field of recommendation systems [4], and many existing methods can be used for reference.
In 2016, Lu et al. proposed an implied trust-aware CF algorithm that can realize accurate personalized recommendation by mining potential trust relationships, such as user preferences and activities [5]. Li et al. proposed a CF recommendation algorithm combined with user trust; the algorithm can effectively improve the accuracy of recommendation by combining the rating trust and preference trust between users [6]. A CF algorithm based on a trust factor was proposed by Guo et al., who established a trust model on the basis of the number of evaluations made by a user and the number of times the user recommended items to others [7]. Although these methods improve recommendation accuracy to a certain extent, they do not cope well with real-time systems that handle massive data, and their time complexity is high. To meet real-time requirements, Lemire and Maclachlan proposed the Slope One algorithm in 2005 [8]. This algorithm is thus far the most concise form of CF algorithm that is based on item evaluation. It has the advantages of easy implementation, high efficiency, good scalability, and low algorithmic complexity. However, it considers neither the similarity and mutual trust relationship between users nor the possible internal relationship between items. It considers only the average deviation of items, which results in low recommendation accuracy.

To solve these problems, a weighted Slope One algorithm with integrated user trust factor is proposed in this study. The proposed algorithm considers the proportion of the number of users' common-score items to the number of items scored by the target user, designs the user trust factor model, calculates the user similarity by using the Pearson correlation coefficient, and introduces the trust factor to modify the user similarity. The top-K nearest neighbor set of the target user is then obtained, and the improved weighted Slope One algorithm is used to predict and analyze the sample to improve the accuracy of prediction.

Introduction to the Slope One Algorithm
The Slope One algorithm assumes that a linear relationship exists between a user's ratings of different items and uses linear regression to predict the score. The prediction formula is expressed as $f(v) = v + b$, where the parameter $v$ is a historical score given by the target user, and the parameter $b$ is the average difference between the ratings of two items. For the user-item rating matrix, the mean deviation between items $i$ and $j$ is defined as follows:

$$dev_{ij} = \frac{1}{|U_{ij}|} \sum_{u \in U_{ij}} (r_{uj} - r_{ui}) \qquad (1)$$

where $dev_{ij}$ represents the average deviation of items $i$ and $j$ in the rating matrix; $U_{ij}$ represents the set of users who have scored both items $i$ and $j$; $r_{ui}$ represents the rating of item $i$ by user $u$; $r_{uj}$ represents the rating of item $j$ by user $u$; and $|U_{ij}|$ represents the number of users in the set $U_{ij}$.

The Slope One algorithm uses $r_{vi} + dev_{ij}$ to predict the score of item $j$ by user $v$. In general, a user has rated more than one item; thus, all the forecasts are averaged. The final prediction value is obtained as follows:

$$p_{vj} = \frac{1}{|R_j|} \sum_{i \in R_j} (r_{vi} + dev_{ij}) \qquad (2)$$

where $p_{vj}$ denotes the prediction score of user $v$ for unrated item $j$; $r_{vi}$ denotes the rating of item $i$ by user $v$; $dev_{ij}$ denotes the average score deviation of items $i$ and $j$; $R_j$ denotes the set of items rated by user $v$; and $|R_j|$ denotes the number of items in the set $R_j$.
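The averaging described above can be illustrated with a minimal Python sketch. It assumes that the deviation is the mean of $r_{uj} - r_{ui}$ over co-rating users, so that the prediction is $r_{vi} + dev_{ij}$; the function names and the toy ratings are hypothetical and are not taken from the original experiments.

```python
from collections import defaultdict


def item_deviations(ratings):
    """Return dev[i][j], the mean of (r_uj - r_ui) over users who rated both items."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_ui in user_ratings.items():
            for j, r_uj in user_ratings.items():
                if i != j:
                    sums[i][j] += r_uj - r_ui
                    counts[i][j] += 1
    return {i: {j: sums[i][j] / counts[i][j] for j in sums[i]} for i in sums}


def slope_one_predict(ratings, v, j):
    """Average r_vi + dev_ij over the items i that user v has rated."""
    dev = item_deviations(ratings)
    estimates = [
        r_vi + dev[i][j]
        for i, r_vi in ratings[v].items()
        if i != j and j in dev.get(i, {})
    ]
    # Return None when no rated item shares a co-rating user with item j.
    return sum(estimates) / len(estimates) if estimates else None


# Hypothetical toy data: user -> {item: rating}
toy_ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
print(slope_one_predict(toy_ratings, "lucy", "A"))  # prints 5.25
```

In this toy case, the estimate from item B is 2 + 0.5 and the estimate from item C is 5 + 3, and their unweighted mean is 5.25, even though the second estimate rests on a single co-rating user.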

Weighted Slope One Algorithm
The Slope One algorithm does not consider the number of users who have rated each pair of items, although the more ratings are available, the more reliable the deviation is. For example, suppose that 1000 users rate items $i$ and $j$, and only 10 users rate items $i$ and $k$. The average score deviation $dev_{ij}$ is then more convincing than $dev_{ik}$. To address this issue, the weighted Slope One algorithm was proposed in Reference [8]. This algorithm uses the following prediction formula:

$$p_{vj} = \frac{\sum_{i \in R_j} (r_{vi} + dev_{ij})\, c_{ij}}{\sum_{i \in R_j} c_{ij}} \qquad (3)$$

where $c_{ij} = |U_{ij}|$ is the weight, that is, the number of users who have jointly evaluated items $i$ and $j$ ($i \neq j$).
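The effect of the weight can be seen in a short Python sketch. It assumes the same sign convention as formula (3) as reconstructed here (each estimate is $r_{vi} + dev_{ij}$, weighted by the number of co-rating users); the function name and the toy data are hypothetical.

```python
def weighted_slope_one_predict(ratings, v, j):
    """Weighted Slope One: weight each estimate by the number of co-rating users."""
    numerator = 0.0
    denominator = 0
    for i, r_vi in ratings[v].items():
        if i == j:
            continue
        # Rating differences from every user who rated both item i and item j.
        diffs = [r[j] - r[i] for r in ratings.values() if i in r and j in r]
        c_ij = len(diffs)
        if c_ij == 0:
            continue  # no co-rating user, so dev_ij is undefined
        dev_ij = sum(diffs) / c_ij
        numerator += (r_vi + dev_ij) * c_ij
        denominator += c_ij
    return numerator / denominator if denominator else None


# Hypothetical toy data: user -> {item: rating}
toy_ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
print(round(weighted_slope_one_predict(toy_ratings, "lucy", "A"), 2))
```

Here the estimate based on item B is supported by two users and the one based on item C by a single user, so the prediction is (2.5 × 2 + 8 × 1) / 3, which is pulled toward the better-supported estimate.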

Similarity Measure
Many measures have been used to determine similarity, including Pearson's correlation coefficient, Euclidean distance, and cosine similarity. Considering the complexity of the algorithm and the size and characteristics of the data, the present study uses the Pearson correlation coefficient and cosine similarity to design a similarity measure model.

Pearson Correlation Coefficient
The Pearson correlation coefficient is a basic measure for calculating the similarity of vectors; the linear correlation between the two vectors is computed to measure the degree of similarity between them [9]. The Pearson correlation coefficient between two users is defined as follows:

$$sim_{PCC}(u, v) = \frac{\sum_{i \in T_{uv}} (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_{i \in T_{uv}} (r_{ui} - \bar{r}_u)^2}\, \sqrt{\sum_{i \in T_{uv}} (r_{vi} - \bar{r}_v)^2}} \qquad (4)$$

where $sim_{PCC}(u, v)$ represents the Pearson correlation coefficient between users $u$ and $v$, and its range of values is [−1, 1]. The closer the value is to 1, the higher the similarity; conversely, the closer the value is to −1, the lower the similarity. In formula (4), $T_{uv}$ represents the set of items that users $u$ and $v$ have both scored; $\bar{r}_u$ represents the average of the item scores of user $u$; and $\bar{r}_v$ represents the average of the item scores of user $v$.
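As a minimal sketch of this measure, the following Python function computes the coefficient over the co-rated items, using each user's mean over all of his or her rated items as described above. The function name, the handling of degenerate cases (returning 0.0), and the toy data are assumptions made for illustration.

```python
from math import sqrt


def pearson_similarity(ratings, u, v):
    """Pearson correlation between users u and v over their co-rated items."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0  # assumption: no co-rated items means no measurable similarity
    mean_u = sum(ratings[u].values()) / len(ratings[u])
    mean_v = sum(ratings[v].values()) / len(ratings[v])
    numerator = sum(
        (ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common
    )
    spread_u = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common))
    spread_v = sqrt(sum((ratings[v][i] - mean_v) ** 2 for i in common))
    if spread_u == 0 or spread_v == 0:
        return 0.0  # assumption: a flat rating profile carries no correlation
    return numerator / (spread_u * spread_v)


# Hypothetical toy data: user -> {item: rating}
toy_ratings = {
    "u": {"I1": 1, "I2": 2, "I3": 3},
    "v": {"I1": 2, "I2": 4, "I3": 5},
    "w": {"I1": 5, "I2": 3, "I3": 1},
}
print(pearson_similarity(toy_ratings, "u", "v"))
print(pearson_similarity(toy_ratings, "u", "w"))
```

Users whose ratings rise and fall together obtain a value near 1, whereas users with opposite preferences obtain a value near −1.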

Cosine Similarity
The cosine similarity is based on the users' score vectors. The cosine of the included angle between the two vectors is calculated to measure the degree of similarity between them. The cosine similarity between two users is defined as follows:

$$sim_{COS}(u, v) = \cos(\vec{r}_u, \vec{r}_v) = \frac{\vec{r}_u \cdot \vec{r}_v}{\|\vec{r}_u\|\, \|\vec{r}_v\|} \qquad (5)$$

where $\vec{r}_u$ and $\vec{r}_v$ are the score vectors of users $u$ and $v$, respectively.

However, both similarity measures have several drawbacks.
(1) Misjudgment may easily occur when the items that two users have jointly scored are few and their scores are close;
(2) These traditional metrics treat the similarity between two users as symmetric. This treatment is inadequate, and the two directions should be differentiated;
(3) Trust is also an important factor influencing a person's decision-making process and should thus be reflected in the measurement method.
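The first drawback can be made concrete with a small Python sketch of the cosine similarity. It assumes that the score vectors are restricted to the co-rated items, which is one common convention and may differ from the one used in the original experiments; the function name and data are hypothetical.

```python
from math import sqrt


def cosine_similarity(ratings, u, v):
    """Cosine of the angle between the score vectors of u and v on co-rated items."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0  # assumption: no co-rated items means zero similarity
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = sqrt(sum(ratings[v][i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy data: the two users share only one item, I1.
toy_ratings = {
    "u": {"I1": 5, "I2": 4, "I3": 1},
    "v": {"I1": 1, "I4": 5},
}
print(cosine_similarity(toy_ratings, "u", "v"))
```

With a single co-rated item the two one-dimensional vectors are parallel, so the measure reports maximal similarity although the users disagree strongly on that item. This is the kind of misjudgment that the trust factor is intended to dampen.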

Weighted Slope One Algorithm with Integrated User Trust Factor
To solve the three problems mentioned in the previous section, this study improves the weighted Slope One algorithm by using the Pearson correlation coefficient, designs an improved model of the user trust factor, and proposes a weighted Slope One algorithm with integrated user trust factor.

Trust Factor
The traditional user trust relationship is symmetric; that is, user $u$ trusts user $v$ to the same degree that user $v$ trusts user $u$. However, in real life, when user $u$ has a high degree of trust in user $v$, user $v$ often also trusts user $u$, but user $v$ does not necessarily have an equally high degree of trust in user $u$. As shown in Table 1, for user $v$, user $u$ has the same rating information for items 2 and 5 ($I_2$ and $I_5$ in the table, respectively), so user $v$ can be considered to have a high degree of trust in user $u$; however, concluding that user $u$ has a high degree of trust in user $v$ is not necessarily reasonable. Because users $u$ and $w$ rate similarly, they can be regarded as having the same level of trust.
In this study, for two rating users $u$ and $v$, the degree of trust of user $u$ in user $v$ is measured on the basis of the proportion of their common-score items to the number of items rated by the target user. This degree of trust is called the trust factor and has the range [0, 1]. The trust factor is defined as follows:

$$w(u, v) = \frac{|T_u \cap T_v|}{|T_u|} \qquad (6)$$

where $w(u, v)$ indicates the degree of trust that user $u$ has in user $v$; $T_u$ and $T_v$ represent the sets of items evaluated by users $u$ and $v$, respectively.
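The asymmetry of the trust factor can be illustrated with a few lines of Python. The sketch assumes that the trust of user u in user v is the share of the items rated by u that v has also rated; the function name and the item sets are hypothetical and do not reproduce Table 1.

```python
def trust_factor(items_u, items_v):
    """Trust of user u in user v: share of u's rated items that v also rated."""
    items_u, items_v = set(items_u), set(items_v)
    if not items_u:
        return 0.0  # assumption: a user with no ratings trusts no one
    return len(items_u & items_v) / len(items_u)


# Hypothetical item sets: u rated five items, v rated only two of them.
rated_by_u = {"I1", "I2", "I3", "I4", "I5"}
rated_by_v = {"I2", "I5"}

print(trust_factor(rated_by_u, rated_by_v))  # prints 0.4
print(trust_factor(rated_by_v, rated_by_u))  # prints 1.0
```

Every item rated by v is also rated by u, so v has full trust in u, whereas only two of the five items rated by u are covered by v, so the trust of u in v is lower.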
The number of users' common-score items is one of the important variables affecting the trust between users: the larger the number of common-score items, the higher the degree of trust between the users. Similarly, user participation is an important factor that affects the degree of trust between users: the higher the participation of a user, the more easily he or she can obtain the trust of other users. Therefore, in this study, the number of users' common-score items is taken into account in the proportion of the two users' scored items to improve the model for the user trust factor, which is defined as follows:

The improved user trust factor is introduced to modify the traditional user similarity measures.
1) Pearson correlation coefficient after incorporating the user trust factor
2) Cosine similarity after incorporating the user trust factor

Description of Weighted Slope One Algorithm with Integrated User Trust Factor
In this study, methods based on the similarity threshold and the K value are used to validate the similarity measures with the user trust factor and the prediction performance of the proposed algorithm. The traditional threshold-based method is inflexible because the similarity threshold must be adjusted repeatedly when the nearest neighbors are selected. A dynamic threshold optimization scheme comprising two algorithms is proposed to solve this problem. The scheme dynamically calculates the average similarity of all users whose similarity is greater than 0 in the nearest neighbor set of the target user (K-value algorithm) and selects this mean as the threshold (dynamic threshold algorithm). The algorithms are as follows.
Input: User-item rating matrix $R_{m \times n}$, target user $u$, target item $i$
Output: Predicted score $p_{ui}$ of target user $u$ for item $i$
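Before the step-by-step listings, the neighbor selection on which both algorithms rely can be sketched in Python. This is only an illustration under stated assumptions: the similarities to the target user are taken as already computed, neighbors are kept when their similarity is at least the dynamic threshold, and all names and values are hypothetical.

```python
def top_k_neighbors(similarities, k):
    """Sort users by similarity in descending order and keep the first k."""
    ranked = sorted(similarities.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def dynamic_threshold_neighbors(similarities, k):
    """Keep the top-k neighbors whose similarity reaches the mean positive similarity."""
    neighbors = top_k_neighbors(similarities, k)
    positive = [sim for _, sim in neighbors if sim > 0]
    if not positive:
        return []
    threshold = sum(positive) / len(positive)  # dynamic threshold
    return [(user, sim) for user, sim in neighbors if sim >= threshold]


# Hypothetical similarities between the target user and four other users.
similarities = {"a": 0.8, "b": 0.6, "c": 0.1, "d": -0.3}

print(top_k_neighbors(similarities, 3))
print(dynamic_threshold_neighbors(similarities, 3))
```

With K = 3 the nearest neighbor set is a, b, and c; the mean of their positive similarities is about 0.5, so the dynamic threshold keeps only a and b.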

K-value Algorithm
Step 1. Initialize the user-item rating matrix, target user, and target item.
Step 2. If target user $u$ has evaluated at least one item $j$ that other users have also evaluated, then set the number of nearest neighbor users K, calculate the similarity $sim_{AWPCC}(u, v)$ between the target user $u$ and each other user $v$ using formula (8), sort the results in descending order, and take the first K users as the nearest neighbor user set; otherwise, output the average of the item scores of user $u$ and end.
Step 3. Use formula (3) to calculate the predicted score $p_{ui}$ of target user $u$ for item $i$.
Step 4. Repeat Step 2, with K taking the values 5, 10, 20, 40, 80, and 160 in turn.
Step 5. End.

... user trust factor (AWCOS Slope One). By contrast, the dynamic threshold algorithm is used in the weighted Slope One algorithm that is based on the Pearson correlation coefficient with the user trust factor (DTPCC Slope One). The numbers of neighbors (K) are 5, 10, 20, 40, 80, and 160. After the predicted values are obtained, the MAE (Fig. 1) and RMSE (Fig. 2) are calculated and compared.

Fig. 1 MAE-K
Fig. 2 RMSE-K

As shown in Figs. 1 and 2, as the number of nearest neighbors (K) increases, the prediction accuracy initially improves and then stabilizes after K=40. Because of the sparse data and the distribution of the users' neighborhoods, once K increases to a certain extent, the accuracy tends to stabilize or decrease. The AWPCC Slope One and DTPCC Slope One algorithms clearly perform better, indicating that the introduction of the user trust factor has a significant effect on the improvement of prediction accuracy. The DTPCC Slope One algorithm has the best performance, and the AWPCC Slope One algorithm is better than the PCC Slope One and AWCOS Slope One algorithms. On the basis of the results of Experiment 1, the DTPCC Slope One algorithm and the AWPCC Slope One algorithm are selected for Experiment 2.

Experiment 2. The two selected versions of the weighted Slope One algorithm with integrated user trust factor are compared with the following existing Slope One algorithms to evaluate the prediction performance of the proposed algorithm:
(1) Integrating User Similarity and Item Similarity into Weighted Slope One Algorithm [11] (Algorithm 1);
(2) Integrating Item Relevance into Weighted Slope One Algorithm [12] (Algorithm 2).
The comparison of the prediction accuracy levels of the four algorithms is shown in Figs. 3 and 4.

Fig. 3 MAE-K
Fig. 4 RMSE-K

As shown in Figs. 3 and 4, the two versions of the weighted Slope One algorithm with integrated user trust factor are superior to Algorithm 1 and Algorithm 2. In particular, the DTPCC Slope One algorithm proposed in this work is clearly superior to the three other algorithms. The AWPCC Slope One algorithm also performs better than the two existing algorithms, obtaining an MAE of only 0.734 when K=40. The MAE of the DTPCC Slope One algorithm is 0.724, which is the lowest among all compared algorithms. For a 5-point evaluation system, MAE=0.73 is generally a remarkable score that cannot be easily surpassed [13].

Conclusion
To address the problems of the Slope One algorithm, this work proposes a weighted Slope One algorithm integrated with a trust factor model. In this study, a user trust factor model is designed, the Pearson correlation coefficient is used to calculate user similarity, the trust factor is introduced to modify the user similarity, and the resulting weighted Slope One algorithm is used to predict and analyze a sample through experiments. The experimental results show that the improved trust factor algorithm is feasible and that the recommendation quality of this algorithm is remarkably improved in the case of sparse data. In the future, we will integrate the proposed algorithm with machine learning algorithms, incorporate the concept of artificial intelligence, and further improve the algorithm to obtain more accurate and efficient recommendation performance.