Research on English Text Analysis Based on TF-IDF
DOI: 10.23977/ESAC2020018
Author(s)
Zhijie Qu, Yu Shu, Fangyi Yang and Yuhang Li
Corresponding Author
Zhijie Qu
ABSTRACT
The emergence and rapid growth of online reviews are both an opportunity and a challenge for merchants. In this paper, we described the relationship between star ratings, reviews, and helpfulness ratings, and explored trends in product reputation over time. First, we explored text processing methods for English-language text. Following the TF-IDF approach, we filtered the stop words out of the text, calculated the term frequency, inverse document frequency, and TF-IDF value of each word, and extracted the 22 highest-ranked words as keywords. Then, we generated the set of opinion words according to four opinion word mining rules. Finally, based on the correlation between star ratings, reviews, and helpfulness ratings, we identified the factors that affect the usefulness of online reviews as Review Len, Total Votes, Different Rating, and Emotion Analysis, and verified the validity of the polynomial regression model through a correlation analysis of these influencing factors.
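The keyword extraction step summarized above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation. The stop-word list, the tokenizer, the unsmoothed IDF variant log(N / df), the choice to rank words by their TF-IDF summed over all reviews, and the names `tokenize`, `tf_idf`, and `top_keywords` are all assumptions made here for the example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "this", "to", "of", "i", "was", "very"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def tf_idf(documents):
    """Return one {word: tf-idf score} dict per document."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each word.
    df = Counter(w for tokens in tokenized for w in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        # TF = count / document length; IDF = log(N / df), unsmoothed (assumption).
        scores.append({w: (c / total) * math.log(n_docs / df[w]) for w, c in counts.items()})
    return scores


def top_keywords(documents, k=22):
    """Rank words by TF-IDF summed over the corpus (assumed aggregation) and keep the top k."""
    totals = Counter()
    for doc_scores in tf_idf(documents):
        totals.update(doc_scores)  # adds the float scores per word
    return [w for w, _ in totals.most_common(k)]


reviews = [
    "The battery is great and the battery lasts",
    "The screen is great",
    "Shipping was slow",
]
keywords = top_keywords(reviews, k=22)
```

With the unsmoothed IDF used here, a word that occurs in every review scores zero, so it never ranks as a keyword; smoothed variants behave differently.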
KEYWORDS
Online Reviews; Usefulness; TF-IDF