Problems in the Optimization Work of Speech-Text Auto-Recognition and Relevant Possible Solutions
DOI: 10.23977/jaip.2024.070310 | Downloads: 26 | Views: 913
Author(s)
Xiwen Qin 1
Affiliation(s)
1 University of Shanghai for Science and Technology, Shanghai, 200093, China
Corresponding Author
Xiwen Qin
ABSTRACT
This thesis examines problems that arise in the annotation of language audio recordings and proposes possible solutions based on an analysis of those problems. Part One introduces the industry and Part Two sets out the research methods. Part Three examines the issues actually encountered in ASR optimization work, using real-world investigation data to identify them and their impact on the effectiveness of ASR. Part Four examines solutions to the identified problems, one of which, the Cohen's Kappa metric, was successfully applied in an experiment. Part Five studies the practical application of these methods: it first considers the generation and optimization problems from a psycholinguistic perspective, then identifies methods and plans that could improve the accuracy and efficiency of the annotation process. By analyzing current challenges and exploring potential advancements, the thesis aims to give readers a comprehensive understanding of the current state of audio annotation and its future direction, and to offer insights that support more robust and efficient annotation practices and, ultimately, better-performing ASR systems.
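As an illustration of the Cohen's Kappa metric mentioned above, the following minimal sketch computes agreement between two annotators as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name, the label set ("speech", "noise"), and the sample annotations are hypothetical and are not taken from the paper's experiment.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal proportions, summed over labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if p_e == 1.0:
        # Both annotators used one identical label for every item.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations of ten audio segments (illustrative only).
annotator_a = ["speech"] * 6 + ["noise"] * 4
annotator_b = ["speech"] * 5 + ["noise"] * 4 + ["speech"]

kappa = cohens_kappa(annotator_a, annotator_b)
print(round(kappa, 3))  # 0.583
```

In this made-up example the annotators agree on 8 of 10 segments (p_o = 0.8) while chance agreement is p_e = 0.52, so kappa is about 0.58, noticeably lower than the raw agreement rate.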
KEYWORDS
ASR; audio annotation; speech-to-text; psycholinguistics; Artificial Neural Network
CITE THIS PAPER
Xiwen Qin, Problems in the Optimization Work of Speech-Text Auto-Recognition and Relevant Possible Solutions. Journal of Artificial Intelligence Practice (2024) Vol. 7: 83-94. DOI: http://dx.doi.org/10.23977/jaip.2024.070310.