Intelligent Algorithm Evaluation of Incidental English Vocabulary Acquisition in Complex Reading Tasks

Abstract: Vocabulary is the basis of learning a foreign language, and cultivating students' language ability is an effective means of improving their mastery of vocabulary. In English teaching, how to effectively raise students' vocabulary level is a matter of concern. The main purpose of this paper is to explore how intelligent algorithms can be used to analyze and evaluate the effect of incidental English vocabulary acquisition. From the perspective of incidental vocabulary acquisition, this paper provides further evidence that output reading tasks promote students' English learning and offers some support for their application in practice. Across the immediate and delayed tests of the two groups, the learning efficiency of the output group (Class 1) was higher than that of the input group (Class 2): 42.99>39.09>35.66>28.17, 16.78>14.50>14.49>12.22, 26.50>22.95>19.32>15.90. In practical application, this study may provide useful references both for senior high school English teaching and for students' choices in vocabulary learning and reading.


Introduction
Based on the input hypothesis, this paper combined incidental vocabulary acquisition with language skills training, and on this basis discussed the incidental acquisition of vocabulary. English teaching in senior high schools in China attaches great importance to the cultivation of students' reading ability. Students need to acquire information, process information, and analyze and solve problems through English. For senior high school students, reading is not only a basic language ability but also a key link in senior high school English teaching. This paper took high school students as the research object and combined English reading with incidental vocabulary acquisition. It examined the impact of input and output reading tasks on incidental vocabulary acquisition, with the aim of providing an effective approach to senior high school English vocabulary teaching.
Since the start of the 21st century, with the development of the Internet, tablet computers, smartphones and other technologies, approaches to learning English have become increasingly innovative, and the learning of English vocabulary has become increasingly important. Wang B T proposed a pilot study aimed at developing a mobile application to improve undergraduates' English vocabulary learning through Chinese and English descriptions [1]. To help students find more interesting language lessons, especially vocabulary lessons, in an English as a foreign language environment, Ghaemi F studied the impact of delivering short message services through social networks on learners' vocabulary learning in this environment [2]. Sadouki F explored the cross-linguistic influence of French on the English vocabulary learning of Algerian first-language Arabic speakers who used French as their second language and English as their third language. The intention was to determine how knowledge of French affected English and which kinds of vocabulary were involved, which required qualitative analysis [3]. Sukirno M A investigated the use of visual media and teachers' difficulties in developing English vocabulary for hard-of-hearing students [4]. Lolita Y studied how to use computers to interpret English words and examined the validity, effectiveness and attractiveness of computer-assisted language learning in English learning [5]. However, the effects of different reading tasks should also be taken into account for incidental vocabulary acquisition.
There are many studies of English vocabulary learning that use intelligent algorithms. Kurt M's research aimed to promote vocabulary development by using Vine vocabulary videos in English vocabulary learning through intelligent algorithms [6]. To improve the effect of English vocabulary recognition, Duan L proposed a multi-feature fusion adaptive kernel correlation filter tracking algorithm based on a natural language processing algorithm and a corpus system, targeting the problems of the kernel correlation filter algorithm [7]. These algorithms have advanced the research direction to a certain extent, but research on applying intelligent algorithms to the effect of incidental English vocabulary acquisition is still rare.
It can be seen from the experimental results that, for both immediate and delayed incidental vocabulary learning, the output reading task is more effective than the input reading task. The innovation of this paper is to use the "input hypothesis" to investigate the impact on English learning of the different reading tasks assigned to senior high school students. This paper made an empirical analysis using the "incidental vocabulary acquisition hypothesis" and the "input quantity hypothesis".

Intelligent Evaluation System
The intelligent assessment system for incidental English vocabulary acquisition is an intelligent learning system based on a complete knowledge base and a large question bank. The system performs in-depth data mining on the test questions [8][9]; Figure 1 shows the basic block diagram of the system platform. The main function of the system is to evaluate the user's answers automatically by means of a diagnosis engine. It uses the vector space model, matrix singular value decomposition, a dynamic programming algorithm, the minimum edit distance, etc. to evaluate subjective questions automatically. Based on the knowledge points of vocabulary, an evaluation of the user's English knowledge is formed [10].
Scoring standardized test questions is relatively simple, but it is difficult to score and correct subjective questions in English teaching automatically. This is because the user's way of thinking is flexible and the language used is highly variable, which requires the computer to understand natural language well and involves complex operations such as language processing and dynamic programming [11][12].
From the test point of view, English writing is a comprehensive application of English knowledge, including vocabulary spelling, word collocation, grammar, sentence formation, coverage of key points, layout planning, rhetorical style, etc. [13]. The requirements of the English writing syllabus are very clear, and the basic requirements are shown in Figure 2: to the point, accurate and appropriate language, and well organized. Based on the characteristics of the examination syllabus and the practice of English writing, the specific requirements for English writing can be summarized as follows: spelling must be correct and vocabulary must be properly chosen; grammar rules should be mastered and sentence structures used flexibly; and in language communication, attention should be paid to the coherence of the text so that it reflects the theme of the composition.
From the above analysis, we can see that although data processing technologies based on rules and on statistics each have their own advantages, they have obvious shortcomings when applied to composition grading. According to the characteristics of English composition, some useful explorations have been made.

Implementation of Evaluation Algorithm
In the high school English test, the answers to standardized test questions are unique, so it is not difficult to score them by computer. For subjective questions, a dynamic programming algorithm, natural language processing technology and computational language technology were used to conduct a detailed evaluation of the effect of English vocabulary learning. The steps are as follows. The student's answer is decomposed into a group of sentences according to certain rules, and the corresponding question is decomposed into a group of corresponding rules, which are initialized by the rule engine. The cosine vector algorithm is used to calculate the similarity of two sentences, and an initial similarity threshold is set. When the similarity exceeds the threshold, the statement is matched against the filtered rules, including keywords, specific syntax, etc., and the intermediate evaluation results are stored in the corresponding data. The threshold is selected on the basis of results obtained from repeated training on incidental English vocabulary acquisition data [14][15].
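The first two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: the sentence-splitting rule, the raw word-count vectors, the function names (`split_sentences`, `cosine`, `candidate_pairs`) and the threshold value of 0.5 are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def split_sentences(text: str) -> List[str]:
    """Split an answer into sentences at '.', '!' or '?' (an assumed rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def cosine(a: str, b: str) -> float:
    """Cosine similarity of two sentences using raw word counts."""
    va = Counter(re.findall(r"[a-z']+", a.lower()))
    vb = Counter(re.findall(r"[a-z']+", b.lower()))
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
        math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def candidate_pairs(answer: str, references: List[str],
                    threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """Keep only (answer sentence, reference sentence) pairs above the threshold.

    Only these pairs would be passed on to the rule-matching stage.
    """
    pairs = []
    for sentence in split_sentences(answer):
        for reference in references:
            similarity = cosine(sentence, reference)
            if similarity > threshold:
                pairs.append((sentence, reference, similarity))
    return pairs


if __name__ == "__main__":
    answer = "I like reading books. It rains today."
    for pair in candidate_pairs(answer, ["I like reading books"]):
        print(pair)
```

In practice the threshold would be tuned on training data, as the text describes, rather than fixed in advance.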
The calculation is performed from the three aspects shown in Figure 3: minimum edit distance, sentence similarity, and rule optimal solution.
(1) Minimum edit distance
For each word that makes up a sentence, the system first judges whether it is an inflected form of a word in the lexicon. If it is not, the lexicon entry with the shortest edit distance is found, and whether the word is a spelling error is determined within a limit set by the word length. If the spelling is wrong, the word is recorded in the list of wrong vocabulary so that students can find their mistakes in time [16]. The minimum edit distance is defined recursively.
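The spelling check described here can be illustrated with the standard Levenshtein recurrence, in which insertion, deletion and substitution each cost 1. This is a sketch under assumptions: the unit costs, the `max_ratio` length limit and the helper name `closest_word` are hypothetical and are not taken from the system itself.

```python
from typing import List, Optional, Tuple


def min_edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming.

    Recurrence: d(i, j) = min(d(i-1, j) + 1,        # delete a[i-1]
                              d(i, j-1) + 1,        # insert b[j-1]
                              d(i-1, j-1) + cost)   # substitute (cost 0 if equal)
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def closest_word(word: str, lexicon: List[str],
                 max_ratio: float = 0.34) -> Tuple[Optional[str], int]:
    """Return the nearest lexicon entry if it lies within a length-based limit.

    The lexicon is assumed to be non-empty. The limit (at least 1 edit,
    otherwise max_ratio * len(word)) is an illustrative assumption.
    """
    best = min(lexicon, key=lambda w: min_edit_distance(word, w))
    distance = min_edit_distance(word, best)
    limit = max(1, int(len(word) * max_ratio))
    return (best, distance) if distance <= limit else (None, distance)


if __name__ == "__main__":
    print(min_edit_distance("kitten", "sitting"))
    print(closest_word("vocabulery", ["vocabulary", "reading"]))
```

A word whose nearest entry is within the limit can be logged as a probable misspelling of that entry; anything further away is left for other rules to handle.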


(2) Sentence similarity
Text similarity is a very commonly used measure, and rules can be filtered by setting a threshold on it. It is also combined with the structural rules in the corpus to judge accurately the collocation of words, lexical structure, wording structure, coverage of key points, layout, rhetorical style, etc. Therefore, in the diagnosis of English writing problems, correctly calculating the similarity of the relevant sentences is critical.
1) TF/IDF value
Term Frequency (TF): when a query contains the keywords $e_1, e_2, \ldots, e_m$, their word frequencies in a text are $TF_1, TF_2, \ldots, TF_m$. The relevance between the query statement and the text is then:
$$TF_1 + TF_2 + \cdots + TF_m$$
Inverse Document Frequency (IDF): if a keyword $e$ occurs in $H_e$ sentences, then the greater $H_e$ is, the lighter the weight of $e$, and vice versa. It is assumed that $H$ is the total number of all sentences. The most commonly used weight is the "inverse text frequency index", which is expressed as follows:
$$IDF_e = \log\frac{H}{H_e}$$
The weight of each keyword $e_i$ is set to $IDF_i$, and the relevance calculation becomes the weighted sum shown in the following formula:
$$TF_1 \cdot IDF_1 + TF_2 \cdot IDF_2 + \cdots + TF_m \cdot IDF_m$$
2) Cosine vector algorithm
On this basis, the similarity of two texts can be obtained by using the cosine vector algorithm. It is assumed that the text is composed of mutually independent basic words $(o_1, o_2, \ldots, o_n)$, and that each component of a statement's vector is the TF/IDF value of the corresponding word in that statement. The specific calculation is as follows. It is assumed that m is the number of occurrences of the word in the statement, n is the number of statements other than this one in which the word occurs, and N is the total number of statements in the text. Formula 7 combines these quantities into the TF/IDF value of the word. It can be seen from Formula 7 that m increases for words that are used frequently, but the TF/IDF value of such words is not necessarily high. The value therefore reflects, to some extent, both how often the word occurs and its ability to distinguish between sentences [17].
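The TF/IDF weighting can be sketched as follows. The example assumes TF is the relative frequency of a word within its sentence and IDF is the logarithm of the total number of sentences divided by the number of sentences containing the word. The function name `tf_idf_vectors` and the tokenized input format are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf_vectors(sentences: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {word: TF * IDF} mapping per tokenized sentence."""
    total = len(sentences)                 # H: total number of sentences
    containing: Counter = Counter()        # H_e: sentences containing word e
    for words in sentences:
        containing.update(set(words))

    vectors = []
    for words in sentences:
        counts = Counter(words)
        length = len(words)
        vectors.append({
            word: (count / length) * math.log(total / containing[word])
            for word, count in counts.items()
        })
    return vectors


if __name__ == "__main__":
    corpus = [["the", "cat", "sleeps"], ["the", "dog", "barks"]]
    for vector in tf_idf_vectors(corpus):
        print(vector)
```

A word that appears in every sentence, such as "the" in the sample corpus, receives a weight of zero, which matches the observation that frequent words do not necessarily have a high TF/IDF value.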
After the TF/IDF values of the sentences are normalized, the cosine of the space vectors is computed. Let the space vectors of sentences Q and V be $\vec{q}$ and $\vec{v}$. The similarity between the two sentences can be expressed by the cosine of the angle between the two vectors $\vec{q}$ and $\vec{v}$:
$$\cos S = \frac{\vec{q} \cdot \vec{v}}{|\vec{q}|\,|\vec{v}|}$$
Here, the denominator is the product of the lengths of the two vectors $\vec{q}$ and $\vec{v}$, the numerator is the inner product of the two vectors, and S is the angle between the two vectors. If the vectors of statements X and Y are $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$, the cosine of their included angle is as follows:
$$\cos S = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$
If the cosine of the angle between the two vectors is 1, the two sentences coincide completely. The closer the cosine is to 1, the more similar the two sentences are, so they can be classified into one category. The smaller the cosine, the weaker the correlation between the two sentences.
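As a small worked example of the cosine of the angle between two vectors (inner product divided by the product of the lengths), consider two hypothetical three-word weight vectors, chosen purely for illustration:

```latex
\begin{align*}
X &= (1, 2, 0), \qquad Y = (2, 1, 1),\\
X \cdot Y &= 1\cdot 2 + 2\cdot 1 + 0\cdot 1 = 4,\\
|X| &= \sqrt{1^2 + 2^2 + 0^2} = \sqrt{5}, \qquad
|Y| = \sqrt{2^2 + 1^2 + 1^2} = \sqrt{6},\\
\cos S &= \frac{4}{\sqrt{5}\,\sqrt{6}} = \frac{4}{\sqrt{30}} \approx 0.73 .
\end{align*}
```

A value of about 0.73 indicates substantial but incomplete overlap. Whether such a pair is passed on to rule matching depends on the chosen threshold.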
3) Singular value decomposition of matrix
The cosine vector method is used to compute the similarity of texts. In theory, this method is sound, but its drawbacks are also evident. If the papers are long or numerous, the amount of calculation becomes so large that the cosine method is no longer practical. At present, a computer can compare at most about 1000 papers per second, and the similarity can only be obtained through repeated pairwise calculation. A large matrix S can be used to express the relationship between 1 million papers and 500000 words. In this matrix, each row corresponds to one paper and each column corresponds to one word:
$$S = (s_{ij})_{M \times N}, \quad M = 1000000,\ N = 500000 \qquad (10)$$
Singular value decomposition breaks this large matrix down into three small matrices whose product reproduces it, as shown in the following formula, where B denotes the middle matrix:
$$S_{1000000 \times 500000} = X_{1000000 \times 100}\; B_{100 \times 100}\; Y_{100 \times 500000} \qquad (11)$$
Each row of the first matrix X represents a class of words with related meanings. Each non-zero element represents the TF/IDF value of a word, and the size of the value is related to the degree of association of the word. In the last matrix Y, each column represents a group of papers on the same topic. The matrix in the middle represents the association between the classes of words and the papers.
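The benefit of the factorization can be checked with simple arithmetic on the matrix sizes. The calculation below assumes the dimensions stated in the text (1 million papers, 500000 words) and an inner dimension of 100. The three factors are written X, B and Y, where B is simply a label assumed here for the middle matrix.

```latex
\begin{align*}
\text{entries of } S &= 10^{6} \times 5\times 10^{5} = 5 \times 10^{11},\\
\text{entries of } X,\ B,\ Y &= 10^{6}\times 100 \;+\; 100 \times 100 \;+\; 100 \times 5\times 10^{5}\\
&= 10^{8} + 10^{4} + 5\times 10^{7} = 1.5001 \times 10^{8}.
\end{align*}
```

Under these assumptions the three factors together hold roughly 3000 times fewer entries than the original matrix, which is why the decomposition makes large-scale comparison tractable.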
(3) Rule optimal solution
After the rules are initialized, the resulting mapping can be viewed as a two-level tree structure in which each key is a group and each value is a set of mapping structures. After a student's paper is divided into sentences, each sentence is compared with the rules in the map; the rule that yields the highest score is selected, and that rule is then removed. This is repeated until all the rules are used up, and the accumulated score is the final score of the composition. This is a typical dynamic programming problem of finding an optimal solution. It can be roughly treated as a 0-1 knapsack problem, which is an NP-complete (non-deterministic polynomial-complete) problem in computational theory. When the number of rules is limited, the knapsack problem becomes that of maximizing the sum of the scores of all rules. In the objective function, a 0-1 variable represents the match between a rule o and a statement; when it is 0, the rule o fails to match the sentence. Generally, the recursive backtracking method is used. However, this method searches the whole solution space, so the total number of combinations is $2^m$. Therefore, as the number of rules m increases, the solution space grows on the order of $2^m$. When m reaches a certain value, a genetic algorithm can be used to solve the problem instead.
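The recursive backtracking search mentioned here can be sketched as follows. This is a simplified illustration, not the paper's formulation: it assumes a hypothetical score table `scores[o][s]` giving the score of rule `o` against sentence `s` (0 meaning no match), lets each rule and each sentence be used at most once, and searches exhaustively for the largest total.

```python
from typing import List


def best_total_score(scores: List[List[float]]) -> float:
    """Exhaustive backtracking over rule-to-sentence matches.

    scores[o][s] is the (assumed) score of rule o matched to sentence s.
    Each rule is either left unmatched or matched to one unused sentence.
    """
    n_rules = len(scores)
    n_sentences = len(scores[0]) if scores else 0
    best = 0.0

    def backtrack(rule: int, used: int, total: float) -> None:
        nonlocal best
        if rule == n_rules:
            best = max(best, total)
            return
        # Option 1: leave this rule unmatched.
        backtrack(rule + 1, used, total)
        # Option 2: match it to any unused sentence with a positive score.
        for s in range(n_sentences):
            if not used & (1 << s) and scores[rule][s] > 0:
                backtrack(rule + 1, used | (1 << s), total + scores[rule][s])

    backtrack(0, 0, 0.0)
    return best


if __name__ == "__main__":
    table = [[0.9, 0.8],
             [0.85, 0.0]]
    print(best_total_score(table))
```

In the sample table, always taking the highest-scoring rule first would pair rule 0 with sentence 0 and leave rule 1 unmatched, whereas the exhaustive search finds the better pairing. The cost of that search grows exponentially with the number of rules, which is the motivation for switching to a heuristic such as a genetic algorithm when there are many rules.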

Investigation and Design of English Vocabulary Based on Intelligent Algorithm
Intelligent algorithms are used in this paper to help analyze the effect of incidental English vocabulary acquisition. Senior three students in a senior high school are taken as the experimental subjects. Both classes are parallel liberal arts classes. There are 50 students in Class 1 of Senior Three, including 21 boys and 29 girls, and 50 students in Class 2 of Senior Three, including 23 boys and 27 girls, with an average age of 18. Class 1 is the experimental group and Class 2 is the control group. Before the data were collected, all the subjects had five to six years of English learning experience, had passed the middle school examination, and had formed a basic concept of vocabulary learning. The students' English scores were entered into Statistical Product Service Solutions (SPSS) software to test their comprehensive English ability and vocabulary ability. The average score of Class 1 is 83.42, with a highest score of 120 and a lowest score of 46. The average score of Class 2 is 82.26, with a highest score of 117 and a lowest score of 53, as shown in Figure 4 [18]. The average score of Class 1 is thus slightly higher than that of Class 2. To verify whether the two classes differ in English ability, an independent-samples t-test is applied to the students' scores. The results are shown in Table 1: Sig. (two-tailed) = 0.84 > 0.05. Therefore, although the average score of the experimental group is slightly higher than that of the control group, there is no significant difference in academic performance between the two classes.
A test of the subjects' vocabulary size shows that most of them have a vocabulary of about 2000 words, some have about 3000 words, and a few have fewer than 2000 words.
To check the design of this study, a pilot test was conducted on 10 subjects on February 28, 2022. The pilot found that, regardless of their English reading ability, students acquired vocabulary incidentally during extensive reading, as shown by the immediate and delayed tests. However, the correlation between reading ability and vocabulary acquisition was not significant. In addition, the output group performed better than the input group in both the immediate test and the delayed test. To verify these findings, this paper selects two parallel liberal arts classes as the experimental subjects for the main investigation.
The subjects were divided into two groups. Class 1, Grade 3, was the experimental group. Its reading materials were those of the output reading task, that is, reading materials containing multiple annotations; after reading, the students made sentences with the new words and translated them. Class 2, Grade 3, was the control group. Its reading materials were those of the input reading task, which also contained multiple annotations; after reading, the students answered multiple-choice reading comprehension questions.
The study consisted of three reading texts with different tasks, which included 50 target words. The first reading was conducted in the first class on the morning of March 7, 2022, and lasted 20 minutes (a time determined by the results of the pilot test). So as not to affect classroom teaching and to ensure that students took the task seriously, the researchers did not enter the classroom; the teacher only distributed the materials, collected them, and handed out the test papers. The teacher was asked to present the activity as an exercise and not to announce any test in advance. After the students of the two classes finished the task, the materials were collected and an unannounced vocabulary test was handed out. The test papers were collected after seven minutes (a time determined by the pilot test). The test gave the English spelling of each word, and the subjects were asked to write its Chinese meaning.
The delayed test was conducted one week later, on March 14, 2022. No reading materials were provided; the delayed test paper on the target vocabulary was issued directly and collected within ten minutes.
The second and third readings followed the same procedure as the first. Their reading sessions and immediate tests were conducted on March 21, 2022 and March 28, 2022, and the corresponding delayed tests on April 6 and April 13, 2022.
The same criteria were used for scoring every test. The data were collected and entered into SPSS for analysis. After the delayed test for the third reading, 100 questionnaires were collected, all of which were valid.

Investigation Results and Evaluation
To examine the relationship between English reading ability and incidental vocabulary learning, this paper collects the subjects' reading comprehension scores in the Senior High School English Proficiency Test and divides them into five levels. The subjects' English reading ability is shown in Figure 5. As can be seen from Figure 5, the students' reading ability is approximately normally distributed. Most students fall between Level II and Level III, accounting for 94.00%.
Figure 6 shows the correlation between English reading level and immediate and delayed incidental vocabulary acquisition. Figure 6 (a) shows the correlation between average English reading scores and immediate incidental vocabulary learning. Reading level had no significant impact on immediate incidental vocabulary learning. To give a fuller picture of the relationship between English reading ability and incidental vocabulary learning, the relationship between English reading and delayed incidental vocabulary learning is plotted in Figure 6 (b).
Figure 6 (b) shows the correlation between average English reading scores and delayed incidental vocabulary learning. The results show that reading level has no significant impact on delayed incidental vocabulary learning, and no clear regularity is observed. Therefore, this paper tentatively concludes that students' English reading ability is not related to their incidental vocabulary acquisition.
To measure the words learned by students after completing the different reading tasks, vocabulary tests were conducted immediately after each task. The test papers were scored according to a unified standard. The results of the three immediate vocabulary tests are shown in Figure 7. Figure 7 (a) shows the average score and standard deviation of the immediate incidental vocabulary acquisition test in the first reading task. The average score of the input group was 35.66, with a standard deviation of 10.73; the average score of the output group was 42.99, with a standard deviation of 12.68. Figure 7 (b) shows the average score and standard deviation of the immediate test in the second reading task. The average score of the input group was 14.50, with a standard deviation of 5.83; the average score of the output group was 16.78, with a standard deviation of 5.11. Figure 7 (c) shows the average score and standard deviation of the immediate test in the third reading task. The average score of the input group was 19.32, with a standard deviation of 5.73; the average score of the output group was 26.50, with a standard deviation of about 7.29. It can be seen from Figure 7 that both reading tasks enable senior three students to acquire words incidentally while reading, and that the learning effect of the output task group is better than that of the input task group. The independent-samples t-tests of the immediate vocabulary tests for the two reading tasks (Table 2) showed: Sig. (two-tailed) = 0.01 < 0.05, Sig. (two-tailed) = 0.04 < 0.05, Sig. (two-tailed) = 0.00 < 0.05. The results show that there is a significant difference between output reading tasks and input reading tasks in immediate incidental vocabulary learning, and that the learning efficiency of output reading tasks is higher than that of input reading tasks.
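The t statistic behind such a comparison can be recomputed from the reported means and standard deviations. The sketch below is only a cross-check under assumptions. It assumes that all 50 students in each class took each test and uses the unequal-variance (Welch) form of the statistic, which for equal group sizes coincides with the pooled form. It does not reproduce the SPSS output and does not compute p-values; the function name `welch_t` is hypothetical.

```python
import math


def welch_t(mean_a: float, sd_a: float, n_a: int,
            mean_b: float, sd_b: float, n_b: int) -> float:
    """t statistic for the difference of two independent sample means."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# (output mean, output SD, input mean, input SD) as reported for Figure 7
immediate = {
    "first reading": (42.99, 12.68, 35.66, 10.73),
    "second reading": (16.78, 5.11, 14.50, 5.83),
    "third reading": (26.50, 7.29, 19.32, 5.73),
}

if __name__ == "__main__":
    n = 50  # assumed group size, taken from the class sizes
    for label, (m_out, sd_out, m_in, sd_in) in immediate.items():
        t = welch_t(m_out, sd_out, n, m_in, sd_in, n)
        print(f"{label}: t = {t:.2f}")
```

A positive statistic means the output group scored higher. Significance would still have to be read from the t distribution with the appropriate degrees of freedom.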
Using the delayed tests, this paper studies the influence of the different types of reading task on the retention of incidentally acquired vocabulary. The results are shown in Figure 8. Figure 8 (a) shows that in the first reading, the average score and standard deviation of the input group are 28.17 and 8.70 respectively, and those of the output group are 39.09 and 11.70 respectively. Figure 8 (b) shows that in the second reading, the average score and standard deviation of the input group are 12.22 and 5.47 respectively, and those of the output group are 14.49 and 4.54 respectively. Figure 8 (c) shows that in the third reading, the average score and standard deviation of the input group are 15.90 and 4.47 respectively, and those of the output group are 22.95 and 5.36 respectively. The results show that one week later, the incidental vocabulary learned through the two reading tasks was well retained, and the output group still performed better than the input group.
Table 3 shows the independent-samples t-tests of the delayed vocabulary tests for the two reading tasks: Sig. (two-tailed) = 0.00 < 0.05, Sig. (two-tailed) = 0.02 < 0.05, Sig. (two-tailed) = 0.00 < 0.05. The results show that there is a significant difference between the input group and the output group in delayed incidental vocabulary acquisition. To compare the roles of the two reading tasks in English learning, Table 4 summarizes the results of the immediate and delayed tests of the two groups. In general, the learning effect of the output group (Class 1) is better than that of the input group (Class 2) (42.99>39.09>35.66>28.17; 16.78>14.50>14.49>12.22; 26.50>22.95>19.32>15.90). In addition, the delayed test scores of both groups are lower than their immediate test scores: the average scores of the input and output groups decreased by 7.49 and 3.90 respectively in the first reading, by 2.28 and 2.29 in the second reading, and by 3.42 and 3.55 in the third reading. This is related to forgetting in long-term memory: as time passes, part of what is stored in long-term memory is lost. In the first reading, the decline of the output group (3.90) is much smaller than that of the input group (7.49). In the second and third readings, however, the decline of the output group (2.29 and 3.55) is slightly larger than that of the input group (2.28 and 3.42). Therefore, in general, the input group retains incidentally acquired vocabulary less well than the output group, but there are some exceptions.
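The declines from the immediate test to the delayed test can be reproduced directly from the reported group means. The short sketch below only restates that arithmetic; the dictionary layout and the helper name `retention_drop` are illustrative choices.

```python
# Reported mean scores as (immediate, delayed) for each group and reading.
means = {
    "first reading": {"input": (35.66, 28.17), "output": (42.99, 39.09)},
    "second reading": {"input": (14.50, 12.22), "output": (16.78, 14.49)},
    "third reading": {"input": (19.32, 15.90), "output": (26.50, 22.95)},
}


def retention_drop(immediate: float, delayed: float) -> float:
    """Decrease in mean score between the immediate and delayed tests."""
    return round(immediate - delayed, 2)


drops = {
    reading: {group: retention_drop(*pair) for group, pair in groups.items()}
    for reading, groups in means.items()
}

if __name__ == "__main__":
    for reading, by_group in drops.items():
        smaller = min(by_group, key=by_group.get)
        print(f"{reading}: input -{by_group['input']:.2f}, "
              f"output -{by_group['output']:.2f} (smaller decline: {smaller})")
```

Only in the first reading is the output group's decline clearly smaller. In the second and third readings the two groups' declines differ by at most 0.13 points.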

Figure 1: Basic framework of the system

Figure 3: Evaluation algorithm steps

Figure 4: The simulated score of the first college entrance examination

Figure 6: The relationship between English reading level and immediate and delayed incidental vocabulary acquisition

Figure 7: Statistical analysis of instant vocabulary test

Figure 8: Statistical analysis of delayed vocabulary test
It is assumed that the student's answer is W and the standard answer is H, and the similarity between the two sentences can be expressed by the angle between the corresponding vectors. The smaller the angle, the greater the similarity between the two sentences. It is assumed that the basic vocabulary of the standard answer is $o_1, o_2, \ldots, o_n$.

Table 1: Independent sample T test of two groups of subjects

Table 2: Independent sample T-test of instant vocabulary tests for two reading tasks

Table 3: Independent sample T-test of delayed vocabulary test for two reading tasks

Table 4: Analysis of instant test and delay test