A Review of Hyperparameter Tuning Methods for Reinforcement Learning: Taking DQN, PPO, and A3C Algorithms as Examples
DOI: 10.23977/cpcs.2026.100101
Author(s)
Hongyuan Liu 1
Affiliation(s)
1 School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan, China
Corresponding Author
Hongyuan Liu
ABSTRACT
The performance of reinforcement learning algorithms depends heavily on hyperparameter configuration: poorly chosen settings can cause training to fail to converge, converge slowly, or yield weak policies. This paper studies three mainstream deep reinforcement learning algorithms, DQN (Deep Q-Network), PPO (Proximal Policy Optimization), and A3C (Asynchronous Advantage Actor-Critic). It systematically reviews each algorithm's key hyperparameters and the mechanisms through which they act, and distills general tuning rules and algorithm-specific strategies from the literature. To verify the effectiveness of these tuning methods, small-scale comparative experiments in classic Gym environments test the convergence behavior and final policy quality of different hyperparameter configurations. From the experimental results and literature analysis, a practical parameter-setting guide for beginners is extracted to lower the barrier to applying reinforcement learning algorithms. The results show that careful hyperparameter tuning can improve convergence speed by more than 30% and final performance by about 15%, with the learning rate, exploration-related parameters, and regularization coefficients having the most significant impact on algorithm performance.
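The learning-rate sensitivity described above can be illustrated with a minimal sketch. This is not the paper's actual experimental setup (which uses deep RL in Gym environments); it is a hypothetical toy: tabular Q-learning with epsilon-greedy exploration on a small chain MDP, run at several learning rates. All constants (chain length, decay schedule, episode budget) are illustrative choices.

```python
import random

def run_q_learning(lr, epsilon_decay, episodes=300, n_states=6, seed=0):
    """Tabular Q-learning on a toy chain: the agent starts in state 0,
    action 1 moves right, action 0 moves left, and reaching the rightmost
    state ends the episode with reward 1. Returns True if the greedy
    policy learned after training reaches the goal."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    gamma, eps = 0.95, 1.0
    for _ in range(episodes):
        s = 0
        for _ in range(2 * n_states):          # step cap per episode
            # epsilon-greedy action selection with per-episode decay
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = 1 if q[s][1] > q[s][0] else 0
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # TD update; lr controls how fast value estimates move
            q[s][a] += lr * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
            if r == 1.0:
                break
        eps *= epsilon_decay
    # roll out the greedy policy once to check whether it solves the chain
    s, steps = 0, 0
    while s != n_states - 1 and steps < 2 * n_states:
        a = 1 if q[s][1] > q[s][0] else 0
        s = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        steps += 1
    return s == n_states - 1

for lr in (0.5, 0.1, 0.01):
    print(f"lr={lr}: greedy policy solves chain -> "
          f"{run_q_learning(lr, epsilon_decay=0.99)}")
```

Even on this toy problem, a too-small learning rate can leave value estimates too weak to order the actions within the episode budget, mirroring the slow-convergence failure mode the abstract describes.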
KEYWORDS
Reinforcement Learning; Hyperparameter Tuning; DQN; PPO; A3C; Parameter Setting Guide
CITE THIS PAPER
Hongyuan Liu. A Review of Hyperparameter Tuning Methods for Reinforcement Learning: Taking DQN, PPO, and A3C Algorithms as Examples. Computing, Performance and Communication Systems (2026) Vol. 10: 1-8. DOI: http://dx.doi.org/10.23977/cpcs.2026.100101.