ZHANG Wenxian, DU Yongwen
(School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)
Abstract: Mobile edge computing (MEC) is a promising paradigm for improving the computation experience of mobile devices: it provides computing capabilities in close proximity within a sliced radio access network that supports both traditional communication and MEC services. However, this kind of intensive computing problem is a high-dimensional NP-hard problem, and some machine learning methods cannot solve it effectively. In this paper, a Markov decision process model is established to find a good task offloading scheme that maximizes the long-term utility performance, so that the best offloading decision can be made according to the task queue state, the energy queue state, and the channel quality between mobile users and the base station (BS). To mitigate the curse of dimensionality in the state space, an edge computing optimized offloading (ECOO) algorithm based on a candidate network is proposed, applying the deep deterministic policy gradient (DDPG) method. Simulation experiments show that the ECOO algorithm outperforms some deep reinforcement learning algorithms in terms of energy consumption and time delay, and is therefore well suited to high-dimensional problems.
Key words: multi-user mobile edge computing; task offloading; deep reinforcement learning
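The offloading decision summarized in the abstract can be illustrated with a simplified sketch. This is not the paper's ECOO/DDPG algorithm: the system constants, the delay/energy cost weights, and the grid search that stands in for a learned actor network are all illustrative assumptions.

```python
import numpy as np

# Hypothetical system constants (illustrative, not from the paper)
F_LOCAL = 1e9      # local CPU frequency, cycles/s
F_EDGE = 10e9      # edge server CPU frequency, cycles/s
KAPPA = 1e-27      # effective switched capacitance of the local CPU
P_TX = 0.5         # transmit power, W
BANDWIDTH = 1e6    # uplink channel bandwidth, Hz
NOISE = 1e-13      # receiver noise power, W

def utility(task_bits, cycles_per_bit, channel_gain, frac,
            w_delay=0.5, w_energy=0.5):
    """Negative weighted cost of offloading a fraction `frac` of a task
    to the edge server; the remaining (1 - frac) is computed locally."""
    rate = BANDWIDTH * np.log2(1.0 + P_TX * channel_gain / NOISE)  # Shannon rate
    off_bits = frac * task_bits
    loc_bits = (1.0 - frac) * task_bits
    t_local = loc_bits * cycles_per_bit / F_LOCAL
    t_off = off_bits / rate + off_bits * cycles_per_bit / F_EDGE
    delay = max(t_local, t_off)  # local and edge parts run in parallel
    energy = (KAPPA * F_LOCAL**2 * loc_bits * cycles_per_bit  # local computing
              + P_TX * off_bits / rate)                       # uplink transmission
    return -(w_delay * delay + w_energy * energy)

def best_offload_fraction(task_bits, cycles_per_bit, channel_gain,
                          n_candidates=101):
    """Evaluate a grid of candidate offloading fractions and keep the best
    (a crude stand-in for the candidate-network/actor of the paper)."""
    cands = np.linspace(0.0, 1.0, n_candidates)
    utils = [utility(task_bits, cycles_per_bit, channel_gain, f) for f in cands]
    return float(cands[int(np.argmax(utils))])
```

Under these toy parameters, a good channel favors offloading most of the task, while a poor channel pushes the decision back toward local execution, matching the intuition that the offloading policy must depend on channel quality.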
References
[1]Xing M, Xi C. Cisco visual networking index: Global mobile data traffic forecast update, 2016-2021 white paper. (2017-05-21)[2019-05-03]. http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/mobile-whitepaper-c11-520862.html.
[2]You C S, Huang K, Chae H, et al. Energy efficient mobile cloud computing powered by wireless energy transfer. IEEE Journal on Selected Areas in Communications, 2016, 34(5): 1757-1771.
[3]Narendra P M, Fukunaga K. A branch and bound algorithm for feature subset selection. IEEE Transactions on Computers, 1977, 26(9): 917-922.
[4]Bertsekas D P. Dynamic programming and optimal control. Athena Scientific, Belmont, MA, 1995, 1(2): 23-31.
[5]Bi S, Zhang Y J A. Computation rate maximization for wireless powered mobile-edge computing with binary computation offloading. IEEE Transactions on Wireless Communications, 2018, 17(6): 4177-4190.
[6]Tran T X, Pompili D. Joint task offloading and resource allocation for multi-server mobile-edge computing networks. IEEE Transactions on Vehicular Technology, 2019, 68(1): 856-868.
[7]Guo S, Xiao B, Yang Y, et al. Energy-efficient dynamic offloading and resource scheduling in mobile cloud computing. IEEE Infocom 2016-IEEE Conference on Computer Communications. IEEE, 2016.
[8]Dinh T Q, Tang J, La Q D, et al. Offloading in mobile edge computing: Task allocation and computational frequency scaling. IEEE Transactions on Communications, 2017, 65(8): 3571-3584.
[9]Sutton R S, Barto A G. Reinforcement learning: An introduction. MIT Press, Cambridge, MA, 1998: 35-40.
[10]Mnih V, Kavukcuoglu K, Silver D, Rusu A, et al. Human-level control through deep reinforcement learning. Nature, 2015, 518(7540): 529-533.
[11]Wang C, Liang C, Yu F, et al. Computation offloading and resource allocation in wireless cellular networks with mobile edge computing. IEEE Transactions on Wireless Communications, 2017, 16(8): 4924-4938.
[12]Chen X, Jiao L, Li W, et al. Efficient multi-user computation offloading for mobile-edge cloud computing. IEEE/ACM Transactions on Networking, 2016, 24(5): 2795-2808.
[13]Zhang H, Liu H, Cheng J, et al. Downlink energy efficiency of power allocation and wireless backhaul bandwidth allocation in heterogeneous small cell networks. IEEE Transactions on Communications, 2018, 66(4): 1705-1716.
[14]Li M, Yang S, Zhang Z, et al. Joint subcarrier and power allocation for OFDMA based mobile edge computing system. IEEE International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC). IEEE, 2017.
[15]Zhang H, Liu H, Cheng J, et al. Downlink energy efficiency of power allocation and wireless backhaul bandwidth allocation in heterogeneous small cell networks. IEEE Transactions on Communications, 2018, 66(4): 1705-1716.
[16]You C, Huang K, Chae H, et al. Energy-efficient resource allocation for mobile-edge computation offloading. IEEE Transactions on Wireless Communications, 2017, 16(3): 1397-1411.
[17]Sardellitti S, Scutari G, Barbarossa S, et al. Joint optimization of radio and computational resources for multicell mobile-edge computing. IEEE Transactions on Signal and Information Processing over Networks, 2015, 1(2): 89-103.
[18]Mao Y, Zhang J, Letaief K B, et al. Dynamic computation offloading for mobile-edge computing with energy harvesting devices. IEEE Journal on Selected Areas in Communications, 2016, 34(12): 3590-3605.
[19]Mao Y, Zhang J, Song S, et al. Stochastic joint radio and computational resource management for multi-user mobile-edge computing systems. IEEE Transactions on Wireless Communications, 2017, 16(9): 5994-6009.
[20]Suraweera H A, Tsiftsis T A, Karagiannidis G K, et al. Effect of feedback delay on amplify-and-forward relay networks with beamforming. IEEE Transactions on Vehicular Technology, 2011, 60(3): 1265-1271.
[21]Zhang Y, Liu H, Jiao H, et al. To offload or not to offload: An efficient code partition algorithm for mobile cloud computing. IEEE 1st International Conference on Cloud Networking, Paris, France. 2012: 100-107.
[22]Kwak J, Kim Y, Lee J, et al. Dream: Dynamic resource and task allocation for energy minimization in mobile cloud systems. IEEE Journal on Selected Areas in Communications, 2015, 33(12): 2510-2523.
[23]Miettinen A P, Nurminen J K. Energy efficiency of mobile clients in cloud computing. HotCloud, 2010, 10(1): 4.
[24]Burd T D, Brodersen R W. Processor design for portable systems. Kluwer Academic Publishers, 1996, 13(2-3): 203-221.
[25]Xu Z X, Cao L, Chen, et al. Deep reinforcement learning with SARSA and Q-learning: A hybrid approach. IEICE Transactions on Information and Systems, 2018, E101d(9): 2315-2322.
[26]Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous control with deep reinforcement learning. International Conference on Learning Representations (ICLR), 2016.
[27]Shortle J F, Thompson J M, Gross D, et al. Fundamentals of queueing theory. John Wiley & Sons, 2018, 399: 21-30.
[28]Adelman D, Mersereau A J. Relaxations of weakly coupled stochastic dynamic programs. Operations Research, 2008, 56(3): 712-727.
[29]Lepetit L, Strobel F. Bank insolvency risk and time-varying Z-score measures. Journal of International Financial Markets, Institutions and Money, 2013, 25: 73-87.
[30]Kingma D P, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv: 1412.6980, 2014.
[31]Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous control with deep reinforcement learning. International Conference on Learning Representations (ICLR), Caribe Hilton, San Juan, 2016: 12-25.
[32]Uhlenbeck G E, Ornstein L S. On the theory of the Brownian motion. Physical Review, 1930, 36(5): 823.
Deep reinforcement learning-based optimization of lightweight task offloading for multi-user mobile edge computing
ZHANG Wenxian, DU Yongwen
(School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)
Abstract: Mobile edge computing (MEC) is a promising paradigm for improving the computation experience of mobile devices. It provides computing capabilities in close proximity within a sliced radio access network that supports both traditional communication and MEC services. However, this intensive computing problem is a high-dimensional NP-hard problem, and some machine learning methods cannot solve it effectively. To address this, this paper models the optimal computation offloading problem as a Markov decision process with the goal of maximizing the long-term utility performance, so that offloading decisions are made according to the task queue state, the energy queue state, and the channel quality between mobile users and the base station (BS). To mitigate the curse of dimensionality in the state space, an edge computing optimized offloading (ECOO) algorithm based on a candidate network is proposed, applying the deep deterministic policy gradient method and yielding a novel learning algorithm for stochastic task offloading. Simulation experiments show that the ECOO algorithm outperforms some deep reinforcement learning algorithms in terms of energy consumption and time delay, and performs better on high-dimensional problems.
Key words: multi-user mobile edge computing; task offloading; deep reinforcement learning
Citation format: ZHANG Wenxian, DU Yongwen. Deep reinforcement learning-based optimization of lightweight task offloading for multi-user mobile edge computing. Journal of Measurement Science and Instrumentation, 2021, 12(4): 489-500. DOI: 10.3969/j.issn.1674-8042.2021.04.013