Developing Self-Optimising 5G Communication Frameworks Using a Deep Reinforcement Learning Approach
DOI:
https://doi.org/10.29304/jqcsm.2025.17.32409
Keywords:
Deep Reinforcement Learning, PPO Algorithm
Abstract
This study introduces a self-optimising fifth-generation (5G) communication architecture that uses Deep Reinforcement Learning (DRL) to meet the growing demands of ultra-reliable low-latency communication (URLLC) and large-scale device connectivity. A DRL agent built on the Proximal Policy Optimization (PPO) algorithm autonomously orchestrates resource allocation in a simulated environment of 30,000 nodes, with the dual objectives of minimising end-to-end latency and improving overall network effectiveness. The agent learns to adjust resource assignment in response to traffic fluctuations and interference patterns, rather than relying on traditional static heuristic strategies. In simulation, average latency falls by 51%, from 88.7 ms to 43.6 ms, a level the study presents as compliant with URLLC requirements. The architecture also delivers a 39.9% increase in throughput and a 49.3% rise in the Quality of Service (QoS) satisfaction rate. A comparative evaluation shows that the framework outperforms conventional baselines, supporting its viability for large-scale, intelligent, self-optimising 5G and forthcoming 6G networks.
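For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that the algorithm maximises, together with a check of the latency reduction quoted in the abstract. It is not the study's implementation: the function names, the batch layout, and the clipping range of 0.2 are assumptions chosen for illustration, and the abstract does not state the hyperparameters actually used.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximised).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps=0.2 is an assumed,
    commonly used default, not a value reported by the study.
    """
    total = 0.0
    for nlp, olp, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nlp - olp)  # probability ratio pi_new / pi_old
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


# Latency figures quoted in the abstract: 88.7 ms before, 43.6 ms after.
latency_drop = relative_reduction(88.7, 43.6)
print(f"latency reduction: {latency_drop:.1%}")
```

The clipping keeps each policy update close to the policy that collected the data, which is one reason PPO is often chosen for environments with noisy, shifting dynamics such as fluctuating traffic and interference.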
License
Copyright (c) 2025 Shaymaa Shaalan Jawad

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.








