Education, Science, Technology, Innovation and Life
Open Access

Research on English Text Analysis Based on TF-IDF


DOI: 10.23977/ESAC2020018

Author(s)

Zhijie Qu, Yu Shu, Fangyi Yang and Yuhang Li

Corresponding Author

Zhijie Qu

ABSTRACT

The rapid growth of online reviews presents both an opportunity and a challenge for merchants. In this paper, we describe the relationship between star ratings, reviews, and helpfulness ratings, and explore trends in product reputation over time. First, we examine text processing methods for English text. Following the TF-IDF approach, we filter out the stop words in the text, calculate the term frequency, inverse document frequency, and TF-IDF value of each word, and extract the 22 highest-ranked words as keywords. Then, we generate the set of opinion words according to four opinion word mining rules. Finally, based on the correlation between star ratings, reviews, and helpfulness ratings, we identify the factors that affect the usefulness of online reviews as Review Len, Total Votes, Different Rating, and Emotion Analysis, and verify the validity of the polynomial regression model through correlation analysis of the influencing factors.
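
The keyword-extraction step described in the abstract (stop-word filtering, term frequency, inverse document frequency, top-ranked words) can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the function names `tokenize` and `tfidf_keywords`, the unsmoothed IDF formula log(N / df), and the choice to rank words by their TF-IDF summed over all reviews are assumptions made for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption);
# a real analysis would use a much fuller list.
STOP_WORDS = {"the", "a", "an", "and", "is", "it", "this", "of",
              "to", "in", "for", "was", "i", "but"}


def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOP_WORDS]


def tfidf_keywords(reviews, top_k=22):
    """Return up to top_k (word, score) pairs ranked by summed TF-IDF.

    Assumed definitions (not taken from the paper):
      tf(w, d) = count of w in d / number of tokens in d
      idf(w)   = log(N / number of documents containing w)
    """
    docs = [tokenize(r) for r in reviews]
    n_docs = len(docs)

    # Document frequency: in how many reviews each word appears.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    # Sum each word's TF-IDF over all reviews to get one score per word.
    scores = Counter()
    for doc in docs:
        if not doc:
            continue  # a review made up only of stop words contributes nothing
        for word, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[word])
            scores[word] += tf * idf

    return scores.most_common(top_k)
```

Calling `tfidf_keywords(reviews)` on a list of review strings returns at most 22 `(word, score)` pairs. With this unsmoothed IDF, a word that occurs in every review scores zero, which is the intended effect of down-weighting terms that do not discriminate between reviews.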

KEYWORDS

Online Reviews; Usefulness; TF-IDF

All published work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2016 - 2031 Clausius Scientific Press Inc. All Rights Reserved.