
The Training Process and Methods for LLMs Using an Own Knowledge Base


DOI: 10.23977/jaip.2024.070306

Author(s)

Sheng Zhiyuan 1

Affiliation(s)

1 Wuhan Foreign Languages School, Wuhan, Hubei, China

Corresponding Author

Sheng Zhiyuan

ABSTRACT

This paper explores the development of frameworks and training methods for large language models (LLMs), focusing on the importance of self-built data (own data, or an own knowledge base), the specific processes of model pre-training and fine-tuning, and model performance evaluation and deployment. By analysing the advantages and disadvantages of mainstream large language models (such as GPT-4, BERT, LLaMA, and Mistral), we illustrate their strengths and limitations in natural language processing tasks. The paper particularly emphasises the critical role of self-built data in enhancing a model's professionalism and accuracy, and discusses methods for data collection and processing. We detail the steps of model pre-training and their impact on model performance, explore the necessity and implementation of model fine-tuning, and validate the effectiveness of the proposed framework and training method through performance evaluation metrics and actual deployment results.
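
As an illustration of the fine-tuning stage described in the abstract, the following is a minimal sketch of adapting a causal language model to a self-built knowledge base using the Hugging Face Transformers library. The base model name, the file own_kb.jsonl, and all hyperparameters are illustrative assumptions, not the settings used in the paper.

# Minimal sketch: fine-tune a causal LM on a self-built knowledge base.
# Assumes the knowledge base has been exported to "own_kb.jsonl" (hypothetical
# name) with one {"text": ...} record per document.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"     # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token   # causal LMs often lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load the self-built knowledge base and tokenize it.
dataset = load_dataset("json", data_files="own_kb.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True,
                        remove_columns=dataset.column_names)

# mlm=False selects standard causal (next-token) language-modelling labels.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="llm-own-kb",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    learning_rate=2e-5,
    logging_steps=50,
)

Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()

In practice the same pipeline can follow a continued pre-training step on the raw knowledge-base text before instruction-style fine-tuning, which mirrors the pre-training/fine-tuning split discussed in the paper.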

KEYWORDS

Large Language Models, LLMs, Own Data, Own Knowledge Base, Pre-Training, Fine-Tuning, Performance Evaluation, NLP, CV

CITE THIS PAPER

Sheng Zhiyuan, The Training Process and Methods for LLMs Using an Own Knowledge Base. Journal of Artificial Intelligence Practice (2024) Vol. 7: 41-47. DOI: http://dx.doi.org/10.23977/jaip.2024.070306.



