A Transferable Retrieval-Augmented Generation Framework for Vertical-Domain Question Answering: From Academic Competitions to Cultural Tourism and Financial Technology
DOI: 10.23977/acss.2026.100104
Author(s)
Yongye Huang 1
Affiliation(s)
1 School of Mathematics and Statistics, Hanshan Normal University, Chaozhou, Guangdong, China
Corresponding Author
Yongye Huang
ABSTRACT
Vertical-domain question answering often relies on domain-specific retrieval pipelines and prompt designs, which limits robustness when a system is transferred across heterogeneous domains. This paper presents a transferable Retrieval-Augmented Generation (RAG) framework; RAG integrates external knowledge retrieval with large language model generation to produce grounded answers. The framework targets cross-domain transfer from academic competition problem solving to cultural tourism services and financial technology applications by unifying query normalization, hybrid retrieval, and citation-consistent generation. Specifically, a domain router predicts an inference policy that adaptively configures sparse retrieval, dense retrieval, and neural re-ranking, while a query rewriting module converts user questions into a structured canonical form to reduce domain shift. Retrieved evidence is then standardized through evidence canonicalization, giving downstream generation a consistent input schema. To improve reliability, the generation module incorporates evidence alignment and post-generation verification, which reduce unsupported statements and improve citation correctness. A transfer-oriented training strategy combines contrastive retrieval learning, lightweight domain adaptation, and domain-invariant regularization, enabling effective adaptation under limited target-domain supervision. Experiments across three representative scenarios show that the framework improves answer accuracy, evidence recall, and citation consistency under both in-domain evaluation and few-shot transfer settings, indicating strong transferability and practical potential for deployable vertical-domain question answering systems.
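As a rough illustration of the routing and hybrid retrieval idea summarized in the abstract, the sketch below shows how a per-domain inference policy might weight sparse and dense retrieval scores before fusion. It is not the paper's implementation: the names `InferencePolicy`, `POLICIES`, `route`, `min_max`, and `hybrid_rank`, the weight values, and the min-max weighted-sum fusion are all assumptions made for illustration. The learned router is replaced by a table lookup, and neural re-ranking is omitted.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class InferencePolicy:
    """Hypothetical per-domain retrieval settings (not taken from the paper)."""
    sparse_weight: float
    dense_weight: float
    top_k: int


# Illustrative values only; a real router would predict these.
POLICIES = {
    "competition": InferencePolicy(sparse_weight=0.3, dense_weight=0.7, top_k=2),
    "tourism": InferencePolicy(sparse_weight=0.5, dense_weight=0.5, top_k=2),
    "fintech": InferencePolicy(sparse_weight=0.7, dense_weight=0.3, top_k=2),
}


def route(domain: str) -> InferencePolicy:
    """Stand-in for a learned domain router: a lookup with a balanced fallback."""
    return POLICIES.get(domain, InferencePolicy(0.5, 0.5, 2))


def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so sparse and dense scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(
    sparse: dict[str, float],
    dense: dict[str, float],
    policy: InferencePolicy,
) -> list[tuple[str, float]]:
    """Fuse normalized sparse and dense scores with the policy's weights."""
    s, d = min_max(sparse), min_max(dense)
    fused = {
        doc: policy.sparse_weight * s.get(doc, 0.0)
        + policy.dense_weight * d.get(doc, 0.0)
        for doc in set(s) | set(d)
    }
    # Sort by descending fused score, breaking ties by document id.
    ranked = sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[: policy.top_k]


if __name__ == "__main__":
    sparse_scores = {"a": 10.0, "b": 5.0, "c": 0.0}  # e.g. lexical match scores
    dense_scores = {"a": 0.1, "b": 0.9, "c": 0.5}    # e.g. embedding similarities
    for domain in ("fintech", "competition"):
        print(domain, hybrid_rank(sparse_scores, dense_scores, route(domain)))
```

With these made-up scores, the sparse-heavy policy and the dense-heavy policy return different top documents for the same query, which is the kind of domain-dependent behavior that a router-selected policy is meant to produce.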
KEYWORDS
Retrieval-Augmented Generation; Transfer Learning; Vertical-Domain Question Answering; Hybrid Retrieval; Domain Routing; Evidence Canonicalization; Citation Consistency
CITE THIS PAPER
Yongye Huang. A Transferable Retrieval-Augmented Generation Framework for Vertical-Domain Question Answering: From Academic Competitions to Cultural Tourism and Financial Technology. Advances in Computer, Signals and Systems (2026) Vol. 10: 27-38. DOI: http://dx.doi.org/10.23977/acss.2026.100104.