Q-Learning based Dynamic Electromagnetic Spectrum Management: Design, Modeling, and Performance Evaluation

Yubo Xie; Qiwu Wu; Tao Tong; Jianyong Weng

doi:10.6919/ICJE.202512_11(12).0024

Keywords:

Dynamic Electromagnetic Spectrum Management; Improved Q-Learning Algorithm; Markov Decision Process (MDP); Spectrum Parameter Optimization; OFDM; Clustered Devices.

Abstract

With the rapid development of wireless communication technologies and the wide application of clustered devices (such as UAV swarms), demand for electromagnetic spectrum resources has surged. Traditional static spectrum management struggles to adapt to dynamic, complex electromagnetic environments, which leads to low spectrum utilization, frequent inter-device interference, and difficulty meeting real-time communication needs. To address these issues, this study proposes a dynamic electromagnetic spectrum management method based on an improved Q-Learning algorithm. A Markov Decision Process (MDP) model is constructed in which the state space is defined by real-time spectrum channel occupancy, user equipment signal-to-interference-plus-noise ratio (SINR), and device spatial coordinates; the action space comprises channel selection and transmit power adjustment; and the reward function balances spectrum utilization, interference reduction, and latency minimization. The Q-Learning algorithm is enhanced with dynamic learning rate adjustment and prioritized experience replay to optimize key parameters (channel switching latency, SINR threshold, and transmit power stability). Comparative experiments against traditional static spectrum allocation show that the proposed method improves spectrum utilization by 32.5%, reduces the interference rate by 41.2%, and shortens switching latency by 28.8% on average. In addition, integrating Markov decision logic into OFDM subcarrier allocation and power control improves adaptability to dynamic spectrum conditions, raising spectral efficiency by 15.7% compared with traditional OFDM.
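The abstract does not give the update rule, the reward weights, or the replay scheme, so the following is only a minimal tabular sketch of how the described pieces could fit together. It is not the authors' implementation. Every name and constant here is an assumption made for illustration: `N_CHANNELS`, `POWER_LEVELS`, `GAMMA`, the reward weights, the visit-count decay used for the learning rate, and the proportional priority scheme. A state is assumed to be a hashable tuple of (channel occupancy, discretized SINR, grid cell), and an action is a (channel, power level) pair.

```python
import random
from collections import defaultdict

# All constants below are hypothetical; the abstract does not specify them.
N_CHANNELS = 4
POWER_LEVELS = (0, 1, 2)  # discrete transmit power indices
ACTIONS = [(c, p) for c in range(N_CHANNELS) for p in POWER_LEVELS]
GAMMA = 0.9  # discount factor


def reward(utilization, interference, latency, w=(1.0, 1.0, 0.5)):
    """Weighted reward: favor utilization, penalize interference and latency."""
    return w[0] * utilization - w[1] * interference - w[2] * latency


def learning_rate(visits, alpha0=0.5, decay=0.01):
    """Assumed dynamic learning rate: shrinks as a state-action pair is revisited."""
    return alpha0 / (1.0 + decay * visits)


class PrioritizedReplay:
    """Small proportional replay buffer; priority is |TD error| at insertion."""

    def __init__(self, capacity=1000, eps=1e-3):
        self.capacity, self.eps = capacity, eps
        self.items, self.priorities = [], []

    def add(self, transition, td_error):
        if len(self.items) >= self.capacity:  # drop the oldest transition
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(abs(td_error) + self.eps)

    def sample(self, k, rng):
        # Sample with replacement, proportionally to stored priority.
        return rng.choices(self.items, weights=self.priorities, k=k)


class QAgent:
    def __init__(self, seed=0):
        self.q = defaultdict(float)     # Q[(state, action)]
        self.visits = defaultdict(int)  # visit counts per (state, action)
        self.replay = PrioritizedReplay()
        self.rng = random.Random(seed)

    def act(self, state, epsilon=0.1):
        """Epsilon-greedy choice over (channel, power) actions."""
        if self.rng.random() < epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s_next):
        """One Q-Learning step; returns the TD error before the update."""
        best_next = max(self.q[(s_next, b)] for b in ACTIONS)
        delta = r + GAMMA * best_next - self.q[(s, a)]
        self.q[(s, a)] += learning_rate(self.visits[(s, a)]) * delta
        self.visits[(s, a)] += 1
        return delta

    def learn(self, s, a, r, s_next, replay_batch=4):
        """Online update, then replay a few high-priority past transitions."""
        delta = self.update(s, a, r, s_next)
        self.replay.add((s, a, r, s_next), delta)
        for transition in self.replay.sample(replay_batch, self.rng):
            self.update(*transition)


# Illustrative single step with made-up measurements.
agent = QAgent()
state = ((0, 1, 0, 0), 2, (3, 5))  # (occupancy, SINR bin, grid cell)
action = agent.act(state)
agent.learn(state, action, reward(0.8, 0.1, 0.2), state)
```

This sketch never refreshes stored priorities after a replayed update and omits importance-sampling correction, both of which a fuller prioritized replay scheme would normally include. The continuous quantities named in the abstract (SINR, spatial coordinates) would also need discretization or function approximation before a table like this applies.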
