Imperial College London

Professor Deniz Gunduz

Faculty of Engineering, Department of Electrical and Electronic Engineering

Professor in Information Processing
 
 
 

Contact

 

+44 (0)20 7594 6218
d.gunduz
Website

 
 

Assistant

 

Ms Joan O'Brien +44 (0)20 7594 6316

 

Location

 

1016, Electrical Engineering, South Kensington Campus



Publications

Citation

BibTeX format

@inproceedings{Somuyiwa:2018:10.1109/ISWCS.2018.8491205,
author = {Somuyiwa, SO and Gunduz, D and György, A},
doi = {10.1109/ISWCS.2018.8491205},
title = {Reinforcement Learning for Proactive Caching of Contents with Different Demand Probabilities},
url = {http://dx.doi.org/10.1109/ISWCS.2018.8491205},
year = {2018}
}

RIS format (EndNote, RefMan)

TY  - CPAPER
AB - A mobile user randomly accessing a dynamic content library over a wireless channel is considered. At each time instant, a random number of contents are added to the library and each content remains relevant to the user for a random period of time. Contents are classified into finitely many classes such that whenever the user accesses the system, he requests each content randomly with a class-specific demand probability. Contents are downloaded to the user equipment (UE) through a wireless link whose quality also varies randomly with time. The UE has a cache memory of finite capacity, which can be used to proactively store contents before they are requested by the user. Any time contents are downloaded, the system incurs a cost (energy, bandwidth, etc.) that depends on the channel state at the time of download, and scales linearly with the number of contents downloaded. Our goal is to minimize the expected long-term average cost. The problem is modeled as a Markov decision process, and the optimal policy is shown to exhibit a threshold structure; however, since finding the optimal policy is computationally infeasible, parametric approximations to the optimal policy are considered, whose parameters are optimized using the policy gradient method. Numerical simulations show that the performance gain of the resulting scheme over traditional reactive content delivery is significant, and increases with the cache capacity. Comparisons with two performance lower bounds, one computed based on infinite cache capacity and another based on non-causal knowledge of the user access times and content requests, demonstrate that our scheme can perform close to the theoretical optimum.
AU - Somuyiwa,SO
AU - Gunduz,D
AU - György,A
DO - 10.1109/ISWCS.2018.8491205
PY - 2018///
SN - 2154-0217
TI - Reinforcement Learning for Proactive Caching of Contents with Different Demand Probabilities
UR - http://dx.doi.org/10.1109/ISWCS.2018.8491205
ER -
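
The abstract above describes a threshold-structured MDP policy whose parametric approximation is trained with the policy gradient method. As a rough, self-contained illustration of that idea only (not the authors' model or code), the sketch below trains a logistic, threshold-like prefetch policy with episodic REINFORCE on a drastically simplified setting: a single content class, a capacity-one cache, and i.i.d. channel costs. All names (P_REQ, COSTS, run_episode, etc.) and the environment dynamics are assumptions made for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy environment, far simpler than the paper's model:
# - one content class, requested each slot with probability P_REQ
# - per-download channel cost drawn i.i.d. from COSTS each slot
# - cache of capacity one: the agent may prefetch the next content early
P_REQ = 0.3
COSTS = np.array([1.0, 2.0, 4.0])
HORIZON = 200        # slots per episode
EPISODES = 3000
LR = 0.05

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_episode(theta):
    """Roll out one episode; return its total cost and the REINFORCE score."""
    cached = False
    total_cost = 0.0
    grad = np.zeros_like(theta)
    for _ in range(HORIZON):
        cost = rng.choice(COSTS)
        if not cached:
            # Logistic (soft-threshold) prefetch policy: features are a bias
            # and the negated channel cost, so cheap channels favor prefetching.
            feats = np.array([1.0, -cost])
            p_fetch = sigmoid(theta @ feats)
            fetch = rng.random() < p_fetch
            # d log pi / d theta for a Bernoulli-logistic policy
            grad += (fetch - p_fetch) * feats
            if fetch:
                total_cost += cost
                cached = True
        if rng.random() < P_REQ:        # user requests the content
            if not cached:
                total_cost += cost      # reactive download at the current cost
            cached = False              # content consumed
    return total_cost, grad

theta = np.zeros(2)
baseline = 0.0
for ep in range(EPISODES):
    cost, grad = run_episode(theta)
    baseline += 0.01 * (cost - baseline)                # running baseline
    theta -= LR * (cost - baseline) * grad / HORIZON    # descend expected cost
print("learned parameters:", theta)

Subtracting a running baseline from the episode cost is the standard variance-reduction step in REINFORCE; the learned weight on the negated channel cost makes the trained policy behave like a soft threshold on the current download cost, loosely mirroring the threshold structure the paper proves for the optimal policy.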