A Reinforcement Learning–Driven Scheduler for Minimizing Uplink Delay in 5G Networks
DOI:
https://doi.org/10.29304/jqcsm.2025.17.42560
Keywords:
5G, uplink scheduling, reinforcement learning, SARSA, delay minimization, radio resource management.
Abstract
Uplink scheduling is a core challenge in 5G New Radio (NR), where diverse services—enhanced Mobile Broadband (eMBB), ultra-Reliable Low-Latency Communications (URLLC), and massive Machine-Type Communications (mMTC)—compete for shared spectrum under stringent delay and reliability constraints. Traditional policies (Round Robin, Best-CQI, Proportional Fairness) are simple and effective in limited regimes, but they expose well-known drawbacks in the uplink: RR preserves opportunity fairness yet struggles to suppress backlog under load; Best-CQI maximizes instantaneous rate but can starve cell-edge users; and PF lacks the agility to re-prioritize when traffic mixes or deadlines shift, leading to elevated tail latencies and delay-budget violations. This paper proposes an adaptive reinforcement-learning (RL) scheduler based on on-policy SARSA to minimize uplink delay while maintaining efficient spectrum use. The state encodes per-UE buffer status reports (BSR) and achievable rates (quantized for tractability), actions select a UE per resource-block group (RBG) at each slot, and a delay-aware reward (negative sum of BSRs) directly penalizes aggregate backlog. We implement the scheduler in a slot-driven 5G NR simulator with asynchronous HARQ and compare against RR, Best-CQI, and backpressure. Beyond average BSR, we evaluate end-to-end (E2E) delay, 95th/99th-percentile latency, URLLC delay-violation ratio (DVR), eMBB throughput, and mMTC delivery ratio. To address scalability and realism, we extend experiments from 4 UEs to 16–64 UEs and include mixed eMBB/URLLC/mMTC traffic. Results show that SARSA nearly matches RR on mean and tail delay while substantially reducing URLLC DVR relative to Best-CQI; under mixed traffic it preserves URLLC reliability close to RR yet improves eMBB throughput via opportunistic allocations. Stability analyses under high offered load and fast fading indicate bounded queues and improved robustness compared with Best-CQI and backpressure. 
These findings demonstrate a practical path to learning-enhanced, delay-conscious uplink scheduling within standards-conformant 5G stacks.
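The state, action, and reward design described in the abstract can be illustrated with a minimal tabular SARSA sketch. This is not the authors' simulator: the toy queue model, the single RBG per slot, the quantization levels, the arrival probability, and all function and parameter names (`step`, `epsilon_greedy`, `sarsa_update`, `train`) are assumptions made for illustration, and HARQ is not modeled. Only the structure follows the abstract: the state holds quantized per-UE BSR and achievable rate, the action picks one UE, and the reward is the negative sum of BSRs.

```python
import random
from collections import defaultdict

N_UES = 4          # assumed toy cell size
BSR_LEVELS = 3     # assumed quantized buffer levels: 0..2
RATE_LEVELS = 2    # assumed quantized achievable-rate levels: 1..2


def step(bsr, rates, action, rng, arrival_prob=0.3):
    """Serve one UE on a single RBG, add arrivals, then redraw rates."""
    bsr = list(bsr)
    # The scheduled UE drains its buffer by its quantized achievable rate.
    bsr[action] = max(0, bsr[action] - rates[action])
    # Assumed Bernoulli arrivals; buffers saturate at the top level.
    for ue in range(N_UES):
        if rng.random() < arrival_prob:
            bsr[ue] = min(BSR_LEVELS - 1, bsr[ue] + 1)
    next_rates = tuple(rng.randint(1, RATE_LEVELS) for _ in range(N_UES))
    reward = -sum(bsr)  # delay-aware reward: negative aggregate backlog
    return tuple(bsr), next_rates, reward


def epsilon_greedy(q, state, eps, rng):
    """Pick a random UE with probability eps, else the greedy one."""
    if rng.random() < eps:
        return rng.randrange(N_UES)
    values = [q[(state, a)] for a in range(N_UES)]
    return values.index(max(values))


def sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy update: bootstrap from the action actually taken next."""
    td_target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])


def train(slots=20000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Run SARSA slot by slot and return the Q-table and mean backlog."""
    rng = random.Random(seed)
    q = defaultdict(float)
    bsr = (0,) * N_UES
    rates = tuple(rng.randint(1, RATE_LEVELS) for _ in range(N_UES))
    state = (bsr, rates)
    action = epsilon_greedy(q, state, eps, rng)
    total_backlog = 0
    for _ in range(slots):
        bsr, rates, reward = step(bsr, rates, action, rng)
        next_state = (bsr, rates)
        next_action = epsilon_greedy(q, next_state, eps, rng)
        sarsa_update(q, state, action, reward,
                     next_state, next_action, alpha, gamma)
        state, action = next_state, next_action
        total_backlog += -reward
    return q, total_backlog / slots


if __name__ == "__main__":
    q_table, mean_backlog = train()
    print(f"visited state-action pairs: {len(q_table)}")
    print(f"mean aggregate BSR per slot: {mean_backlog:.3f}")
```

The update bootstraps from the next action the policy actually takes, which is what makes SARSA on-policy; replacing `q[(s_next, a_next)]` with a maximum over actions would turn it into Q-learning. Scaling this table to the 16 to 64 UEs mentioned in the abstract would need coarser quantization or function approximation, since the state count grows exponentially with the number of UEs.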
License
Copyright (c) 2025 Ali Haider Abbas

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.








