Research on the Production Strategy of Mobile Short Videos Based on User Comments: Taking Metaverse Blogger Liu Yexi as an Example

Abstract: Based on the first-level user comments on the first short video of "Liu Yexi", we use text analysis, combining text sentiment analysis, word frequency statistics and topic clustering extraction, to explore the focus of user attention, summarize the reasons for the popularity of this video, and provide ideas for future content production in the short video industry. The study found that users' emotional attitude toward this video was positive, and that attention focused on six main aspects: technical threshold, original plot, multi-dimensional sensory experience, personalized features of the main character, copywriting and label positioning, and cultural aesthetics and value orientation. Based on the word frequency statistics and topic clustering results, the study concludes that new ideas for short video content production include: exploring the deep integration of "technology+", shaping a personalized virtual IP image, deepening original script creation, polishing adaptable copywriting content, and contributing to the return of traditional aesthetic elements.


Introduction
Recently, the concept of the "metaverse" has attracted a lot of attention, with Zuckerberg renaming Facebook "Meta", followed by Microsoft, Tencent, ByteDance and other technology giants entering the field. The first metaverse blogger, Liu Yexi, has also appeared in the short video industry. As a product of the rise of the "metaverse", her content design and presentation are largely influenced by this concept.
As early as 1992, Neal Stephenson's science fiction novel Snow Crash introduced the concept of the Metaverse, suggesting that humans could live in a virtual three-dimensional space through their digital avatars; this space was the metaverse. [1] Second Life, released in 2003, then shaped the first phenomenal virtual world, attracting a large number of companies and institutions. The "Oasis" in the 2018 hit movie "Ready Player One" can also be regarded as a precursor of this concept. In brief, the metaverse is an artificial space parallel to the real world, an Internet application and social form of new narrative integration generated by the integration of a variety of new technologies, [2] which can realize the multi-dimensional expansion of human senses. Unlike the immersion path of the mobile Internet era, the metaverse emphasizes a superimposed path, expecting to break the involutionary development of platform forms, build multi-dimensional spacetime, and bridge the boundary between the real and the virtual, so as to truly realize virtual interactive narratives. [3,4]
Judging from the above concepts, the metaverse is the next ideal stage in the development of the Internet. [5] Paul Levinson, a scholar of the media environment school, has said that media evolution is a dynamic and continuous process in which media evolve toward humanization and become more responsive to human needs through the mutual remediation of technologies under the influence of multiple conditions, including human and social environments. [6] At the same time, existing media content is adjusted to suit the new media form. As the times change and classical theories are revisited, the relationship between media technology and content is not exactly "the medium is the message", as McLuhan [7] said; rather, the two should be organically linked. As some scholars have emphasized, discussing technology and content in isolation from each other is a one-sided approach. [8] In short, media technology affects content production and vice versa. Therefore, under the influence of the "metaverse", a product form that integrates multiple technological imaginations, what new demands and expectations will users have for the consumption of short video content? How will content production in the short video industry change as a result? What new creative ideas can the success of the Liu Yexi video offer to short video producers?
In response to these questions, this study aims to explore, from the perspective of the media environment school, how the development of media technology affects users' demand for short video content and what new ideas short video content producers should adopt in an era of rapid iteration of media technology.

A review of the literature on short video content production
With its advantages of being short, simple and fast, mobile short video has gradually become part of people's lives. As it has matured, however, short video has not avoided the "trap" of disorderly development: industry disorder, content misconduct and other problems have emerged. Li Yongjian, director of the Internet Economy Research Office of the Chinese Academy of Social Sciences, has said that after the last round of explosive user growth, high-quality content has become the key to competition among short video platforms.
In terms of available research results, the content production of short video has always been a focus of academia and industry, with the number of published articles growing year by year from 2016 to the present, and the scope of research covering models, characteristics, dilemmas, paths, mechanisms and strategies. In addition to research at the overall level, research on content production in niche areas, such as government, information, mainstream media, beauty and science popularization short videos, is also quite rich.
In terms of specific research content, the existing literature focuses more on the mechanism and characteristics of content production, as well as on the analysis of dilemmas and strategies. Some scholars surveyed short video content production in the era of intelligent communication and summarized its mechanism as threefold: user-oriented, technology-based, and "content is king", [9] arguing that producers increasingly value the visual impact and emotional stimulation of the created content, and that production tends toward cross-media operation and community interaction, [10] gradually highlighting the social record function and aesthetic art value. [11] With regard to the dilemma of content production, scholars generally agree that content homogenization, pan-entertainment, lack of gate-keeping, capital control and technological alienation are common problems that need to be regulated and rectified, while the lack of copyright awareness among producers has become a concern more recently. [12] It is also believed that the content ecology of short videos needs to strengthen the value orientation of content in the future, and that the content itself should be embedded in the chain of people's daily life [10] and deeply explore scene-based communication. At the same time, the personality characteristics in short videos should be strengthened to maximize the display of personality elements. [13] The communication strategy of short videos should be optimized with the logic of "real, fast and heavy" and "concentrated emotion". [14] In addition, strategies such as improving users' media literacy, strengthening the supervision of relevant departments, formulating policy, and improving the supervision mechanisms of platforms are also repeatedly mentioned by scholars.
Overall, enthusiasm for research on short video content production continues to grow in academia and industry. Although the coverage is wide and the results are rich, the research processes and methods are relatively uniform and mostly qualitative, with few studies linking content production to content consumption or combining quantitative and qualitative methods from the perspective of user behavior. The insufficient application of interdisciplinary research methods in the field of mobile short video content production therefore leaves ample research space for this paper.

Research Method
The study as a whole uses a combination of qualitative and quantitative methods. First, user comments are crawled and cleaned, and only valid comments with substantial, analyzable content are retained, so that the subsequent analysis results are sound and usable as a reference. Second, the sentiment of the comment text is analyzed through NLP to obtain the attitude of users towards the video. [15] Third, word frequency statistics are computed on the segmented comments, and the LDA model is used to extract topic clusters from the words in user comments. Finally, based on the quantitative results, a qualitative approach is adopted: the high-frequency words are read and interpreted in relation to their meanings and contexts, the main concerns of users are summarized on the basis of the topic clusters, and users' psychological needs and expectations in the process of media exposure are explored, so as to identify development ideas that can be referred to in future short video content production.

Data collection and pre-processing
The research data in this paper come from TikTok. As the most representative platform in the short video industry at present, TikTok has research value because of its inclusive content, mature business model, large user base and strong influence.
In terms of data scale, this study collected all first-level user comments on the first video of metaverse blogger Liu Yexi. Second-level comments generally discuss the first-level comments themselves and easily deviate from the original video content, so they are not included in the study. The study collected comment data up to March 31, 2022; the data are therefore a static sample and do not take into account subsequent changes in the comments on the platform.
To avoid the influence of redundant information on the results, this study cleaned the comment data by removing, for example, @-mentions of other users, emojis, comments with very few words and illegal characters, and extracted the valuable text in the comments. The cleaned data are presented as an Excel spreadsheet that includes the original content of each comment, the cleaned content, the comment time, the comment length, the number of likes and the number of follow-up comments. The total number of valid comments is 12,209.
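The cleaning rules described above can be illustrated with a minimal sketch. The paper does not give its exact rules, so the regular expressions, the `MIN_LENGTH` threshold and the function names below are assumptions chosen for illustration only, not the procedure actually used in the study.

```python
import re

# Assumed patterns; the study's exact cleaning rules are not specified.
MENTION = re.compile(r"@\S+")
# Keep CJK characters, ASCII letters and digits, whitespace and common
# punctuation; everything else (emojis, stray symbols) is dropped.
DISALLOWED = re.compile(r"[^\u4e00-\u9fffA-Za-z0-9\s,.!?，。！？、]")
MIN_LENGTH = 4  # hypothetical cut-off for "comments with few words"


def clean_comment(text):
    """Return the cleaned comment, or None if too little text remains."""
    text = MENTION.sub(" ", text)          # remove @-mentions
    text = DISALLOWED.sub("", text)        # remove emojis and illegal characters
    text = re.sub(r"\s+", " ", text).strip()
    return text if len(text) >= MIN_LENGTH else None


def clean_corpus(comments):
    """Clean every comment, dropping empty results and exact duplicates."""
    seen, kept = set(), []
    for raw in comments:
        cleaned = clean_comment(raw)
        if cleaned is not None and cleaned not in seen:
            seen.add(cleaned)
            kept.append(cleaned)
    return kept
```

A pattern such as `@\S+` removes everything up to the next whitespace, so a mention written without a trailing space would take the following words with it; a production cleaner would likely need platform-specific handling.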

Operation process
The research process involves three main steps. The first is data collection: a Python program is written, a web proxy server is set up with the Fiddler tool, and user operations are simulated throughout the process to collect the raw comment data in a non-invasive way. The second is data preprocessing: invalid and duplicate data are cleaned first, and content with no analytical value is then deleted to obtain the valid text. The third is data analysis, which contains three parts: sentiment tendency analysis, comment word frequency statistics and topic clustering extraction.

Analysis of user sentiment and attention focus based on comment text
After completing the collection and pre-processing of the user comment text, this paper carries out further analysis to reveal, through visual presentation, the deeper emotions hidden in user comments, to provide a reference for platforms, markets and media, and to make relevant suggestions based on high-frequency keyword clustering.

User sentiment is biased towards positive
Text sentiment analysis refers to the process of analyzing, processing, summarizing and reasoning about subjective text with emotional overtones using natural language processing and text mining techniques, [16] aiming to extract information units that have substantial meaning and to mine deep bipolar attitudes through this descriptive information.
In this study, user sentiment is divided into three levels, with "0" indicating negative sentiment, "1" indicating neutral sentiment, and "2" indicating positive sentiment. All the cleaned comments (12,209 comments) were then categorized and their confidence levels were calculated. The confidence level is used to indicate the accuracy of the statistical value of the sample, and the commonly used confidence level is 95%, which means the error of judgment is controlled within 5%. The statistical results are shown in the nested pie chart (Figure 1). The inner layer of the pie chart represents the proportion of each sentiment tendency among all comments, and the outer layer represents the proportion of each sentiment tendency among the comments (6,587 in total) with a confidence level greater than or equal to 95%.
The statistical graph shows that users' emotional attitude toward the video is mostly positive, accounting for 63.1%. According to calculations in applied mathematics based on systems engineering theory, if the share of people holding a certain viewpoint reaches 61.8% within a certain range, it is considered a quantity that can control the whole. [17] The statistical results thus indicate that most users believe that some part of the video satisfies their needs or motivations for use. As the "uses and gratifications" theory explains, when specific needs and motivations are satisfied in the user's media exposure, their subsequent behavior is influenced by the outcome of that exposure. Most users therefore express their expectations after watching this video by "following" or "commenting", sharing their impressions of the video and interacting with others.
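The two rings of the nested pie chart can be computed with a short aggregation step. The sketch below assumes each classified comment is available as a hypothetical `(label, confidence)` pair, with confidence expressed as a value between 0 and 1; the sentiment classifier itself is not shown, and the toy predictions are invented for illustration rather than taken from the study's data.

```python
from collections import Counter

# Label codes as defined in the text.
LABELS = {0: "negative", 1: "neutral", 2: "positive"}


def sentiment_shares(predictions, min_confidence=0.0):
    """Share of each label among predictions with confidence >= min_confidence."""
    kept = [label for label, confidence in predictions if confidence >= min_confidence]
    counts = Counter(kept)
    total = len(kept)
    return {
        name: (counts[code] / total if total else 0.0)
        for code, name in LABELS.items()
    }


# Toy predictions, invented for illustration only.
toy_predictions = [(2, 0.99), (2, 0.97), (1, 0.60), (0, 0.96), (2, 0.80)]

inner_ring = sentiment_shares(toy_predictions)                       # all comments
outer_ring = sentiment_shares(toy_predictions, min_confidence=0.95)  # high confidence only
```

Reporting both rings side by side makes it visible whether the positive share holds up once low-confidence classifications are excluded.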

High-frequency words present user expectation attitude as a whole
Word frequency statistics were initially widely used in graphology to help gain insight from a large number of texts. Word frequency analysis can describe and predict the trends of the things described by texts [18] and help interpret the behavior of human society. [19] This part is based on the segmented text obtained by splitting the valid data after pre-processing. More than 3,700 valid words were sorted out using word frequency analysis, and only the 30 highest-frequency words are presented here (Table 1). According to the table, the words "cool" and "update" appear most frequently, revealing that most users have a positive attitude toward the short video and high expectations for the subsequent development of the plot. Words such as "ceiling", "rise" and "expect" are also high-frequency expressions reflecting users' attitudes. Next, the keywords "advanced", "blockbuster" and "special effect" show that the video's production technology is particularly attractive to users. In terms of video elements, words such as "virtual", "character", "real people" and "makeup" appear frequently, indicating that the characters and their characteristics are of great interest to users. In addition, the frequent appearance of the term "metaverse" suggests that the tag words in the video copy can influence users' interpretation of the video content, thus creating a connection and interaction with them.
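The counting step behind a table of high-frequency words can be sketched as follows. The sketch assumes the comments have already been segmented into word lists; the segmentation tool, the stopword list and the toy comments are placeholders, since the paper does not specify them.

```python
from collections import Counter

# Hypothetical stopword list; the study's actual list is not given.
STOPWORDS = {"the", "is", "a", "so", "for", "this"}


def top_words(segmented_comments, n=30, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword tokens with their counts."""
    counts = Counter(
        token
        for tokens in segmented_comments
        for token in tokens
        if token not in stopwords
    )
    return counts.most_common(n)


# Toy segmented comments, invented for illustration only.
toy_comments = [
    ["so", "cool", "update"],
    ["cool", "special", "effect"],
    ["update", "cool"],
]

print(top_words(toy_comments, n=2))  # [('cool', 3), ('update', 2)]
```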

User attention focuses on six types of topics
The LDA model assumes that a document can contain multiple topics and that each word in the document is generated from one of those topics. [20] In this study, the segmented comment documents were therefore used as the data source, and the model was implemented and run in a Python environment. Six topics were finally determined after iterative adjustment of the parameters. Since some of the words have multi-layered meanings and complex usage contexts, manual screening was performed on top of the topic model, and 15 keywords were selected for presentation under each topic, resulting in the final topic classification (Table 2).
Table 2 (excerpt). Keywords of topic 6, cultural aesthetics and value orientation: ancient costume, national style, crane, value, mainstream, civilization, Tang Dynasty, ancient style, aesthetics, China, Japan, foreign, tradition, culture, world view.
According to the table, the text of user comments on this short video can be divided into the following six topics. (1) Technical threshold. The high frequency of "cool", "advanced", "ceiling" and other keywords shows that users are deeply impressed by the professional production technology in the short video and consider its special effects comparable to those of a blockbuster. (2) Original plot. The keywords in this topic indicate that users find the plot attractive (for example "interesting", "exciting" and "wonderful") and are looking forward to its sequel. (3) Multi-dimensional sensory experience. Combining the keywords with the specific source texts, we find that users value the overall experience of watching videos: the soundtrack, the lines, and the presentation of scenes, picture quality and style all affect their evaluation. For example, the "Shanhai jing" background music used in this video is loved by most viewers, with typical comments such as "I give you a like for this background music", which shows that content producers should take users' visual and auditory experience into account. (4) Personalized features of the main character. Keywords such as "makeup", "good-looking" and "eyeliner" show that users pay close attention to the characteristics of the character in the video, including appearance, dress, actions and even makeup details. (5) Video copy and label positioning. The keyword distribution in this topic shows that the content of the video copy and its tag words reappear in user comments, guiding people's attention and the direction of the discussion. There are many comments like "I thought it was a real person at first; I didn't expect it to be a virtual idol", which shows that the function of copy and tags is not only to attract attention but also to play a pivotal role in the encoding by the creator and the decoding by the user. (6) Cultural aesthetics and value orientation. This topic highlights traditional cultural elements (such as "crane" and "ancient costume") and value guidance (such as "mainstream", "world view" and "China"). For example, one user comment received a lot of praise: "The author is very detailed, the crane used in the last scene is the traditional Chinese crane, not the Japanese crane, which means that the author is either very good at this or has done study on it."
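To make the topic extraction step more concrete, the sketch below fits LDA with a small collapsed Gibbs sampler written in pure Python. The paper does not say which LDA implementation or hyperparameters were used, so this is an illustrative stand-in rather than the study's code: `alpha`, `beta`, the iteration count, the function names and the toy documents are all assumptions. Each token's topic is resampled with probability proportional to (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta), where n_dk, n_kw and n_k are the document-topic, topic-word and topic totals with the current token removed.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-topic word counts."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # n_dk
    topic_word = [Counter() for _ in range(num_topics)]   # n_kw
    topic_total = [0] * num_topics                        # n_k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        topics = [rng.randrange(num_topics) for _ in doc]
        assignments.append(topics)
        for word, k in zip(doc, topics):
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1
    return topic_word


def topic_keywords(topic_word, n=15):
    """Top-n words per topic, ignoring words whose count dropped to zero."""
    return [
        [word for word, count in counts.most_common(n) if count > 0]
        for counts in topic_word
    ]


# Toy segmented comments, invented for illustration only.
toy_docs = [
    ["special", "effect", "blockbuster", "cool"],
    ["cool", "special", "effect", "advanced"],
    ["crane", "tradition", "culture", "aesthetics"],
    ["culture", "crane", "tang", "tradition"],
]
topics = lda_gibbs(toy_docs, num_topics=2)
keywords = topic_keywords(topics, n=3)
```

On a corpus of realistic size a maintained library implementation would be preferable; the point here is only to show how word-topic counts produce the keyword lists that are then screened manually.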

Short video production and optimization strategy
Based on the above research results, this section sets out new development ideas for content producers in the short video industry, to help them capture the focus of users' attention and meet their needs.

Exploring "technology +" deep integration to break through the homogeneity barrier
In the two-minute video of Liu Yexi, real and unreal scenes alternate, real people and a virtual idol talk to each other, and cyber and ancient styles are seamlessly integrated. The frequent keywords show that users find these demanding uses of "technology + scenes" and "technology + characters" striking and rate them highly. Earlier "popular" short videos had a relatively low technical threshold and most were easy to imitate, whereas on the technical level alone it is not easy to reproduce Liu Yexi's video, which gives it a unique place among many contents. Prof. Xiaohong Wang has mentioned in a speech that "the development of short video is a continuous process of combining with new technologies." At a time when new technologies such as 5G and AI continue to release endogenous power and the metaverse boom is on the rise, cultivating Internet thinking, actively learning new technologies and exploring the deep integration of technology and content production can be said to be the only way for short video producers to break through the homogenization barrier in the future.

Deepening original script creation and multi-dimensional enhancement of video quality
The scarcity of quality content is in essence a lack of innovation. Original content takes a long time to go from conception to final presentation, which seems a poor return on investment in an era of fast-food consumption, so many short video producers prefer to imitate others and chase trending topics. However, once the momentary halo of traffic fades, user retention is not encouraging, and in the long term such producers gradually fall into the deadlock of lifeless content production. In the keyword analysis above, words such as "plot", "lines" and "music" appear frequently, which shows that an original script with conflict and suspense is an important prerequisite for user retention in short videos. Given the limited length of short videos, suitable music and attractive lines can also enhance the overall quality of the video and quickly gain the audience's attention. Therefore, in the "second half" of the short video competition, quality matters more than quantity, and the way to break through is to cultivate original scripts.

Shaping a personified virtual IP image to bridge the boundary between front and backstage
Although there have been successful cases such as Hatsune Miku and Ayayi, it is foreseeable that virtual idols, driven by the metaverse concept, will remain a blue ocean market waiting to be tapped. Fundamentally, virtual idols are a core part of the metaverse, and as Generation Z grows into the mainstream of commercial society, virtual idols with "personality" and a persona that does not collapse will be more relevant to consumers. At the same time, the people behind the virtual image are less directly scrutinized by the public, so they can perform more easily and comfortably behind the scenes, expressing their real inner character through a virtual image that carries the imagination of their outer appearance, and maintaining emotional interaction with users through the expression of their inner authenticity, thus bridging the boundary between the front stage and the backstage described by Goffman. Many people currently question the feasibility of the metaverse, and these questions and concerns are not unreasonable. In any case, the concept has indeed accelerated the arrival of the virtual human era, as evidenced by the appearance of the variety show "2060" and the virtual host "Xiaoyang". Virtual humans give the metaverse an entry point for moving from concept to reality, and it will be an irresistible trend for virtual images to break out of the two-dimensional (anime) subculture and move into the mainstream.

Polishing adaptable copywriting content and clear positioning to achieve business transformation
The copy in short videos is generally simple and concise. Its main purpose is to highlight the positioning and core content of the video and to provide the finishing touch to the video itself. For example, the keywords in the copy appear in the user comments many times. On the one hand, the three clear labels of "virtual idol", "metaverse" and "beauty" enable viewers to quickly capture the kernel of the plot and guide the angle from which they "decode" the content. On the other hand, they lay the groundwork for subsequent business transformation, such as endorsement cooperation with beauty brands and live-streaming e-commerce. In addition, the video copy says "Now, the world I see, you see it too", which also carries a metaphorical meaning. This phrase is spoken not only by the main character Liu Yexi to the boy in the episode, but also to every viewer watching the video on a cell phone, thus conveying to the audience the social form implied by the concept of the metaverse: a world superimposed on reality. Well-crafted copy of this kind adds a great deal to a short video, so content producers should polish their copy and innovate on the basis of the video content.

Contributing to the return of traditional aesthetic elements and incorporating mainstream value into new expressions
Unlike the mainstream two-dimensional (anime-style) virtual idols on the market, Liu Yexi's appearance follows the current national-style trend and integrates many national-style elements. The keywords "crane", "ancient costume" and "Tang Dynasty makeup" all carry the cultural heritage and connotation of the Chinese nation and show the unique wisdom of the Chinese people. This is also an underlying reason for the video's popularity. In the video, Liu Yexi confidently and firmly retorts "Ugly?" when questioned by others, reflecting a strong sense of cultural identity. The expression is powerful yet unobtrusive, and in line with the mainstream values of China's current efforts to build a strong cultural nation and strengthen cultural confidence. This form of expression, which seamlessly blends values with plot and lines, provides a reference for the presentation of values in future short video production.
On the whole, "Liu Yexi" made a successful debut partly because of the rise of the "metaverse" concept and partly because the video fits users' current demands for short video content in multiple respects. In the future, short video producers should learn from the successful experience of Liu Yexi, strive to explore the deep integration of technology and content, understand user emotions, find the right positioning, and contribute more high-quality original content led by mainstream values.