Preprint / Working Paper, Year: 2021

(Close to) Optimal Policies for Finite Horizon Restless Bandits

Abstract

Most restless Markovian bandit problems over an infinite horizon can be solved quasi-optimally: the famous Whittle index policy is known to become asymptotically optimal exponentially fast as the number of arms grows, at least under certain conditions (including the problem being indexable). For restless Markovian bandit problems over a finite horizon, no such optimal policy is known. In this paper, we define a new policy, based on a linear programming relaxation of the finite-horizon problem (called the LP-filling policy), that is asymptotically optimal without requiring any condition. Furthermore, we show that for regular problems (defined in the paper) the LP-filling policy becomes an index policy (called the LP-regular policy) and becomes optimal exponentially fast in the number of arms. We also introduce the LP-update policy, which significantly improves on the LP-filling policy for large time horizons. We provide numerical studies showing that our LP policies outperform previous solutions.
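As a rough illustration of the kind of relaxation the abstract refers to, the sketch below sets up a linear program over per-arm occupation measures for a finite-horizon restless bandit. This is not the authors' code: the function name `solve_lp_relaxation`, the equality form of the budget constraint, and the variable layout are assumptions made only for illustration.

```python
# Minimal sketch (assumed, not from the paper) of a finite-horizon LP relaxation
# for a restless bandit with S states, two actions (0 = rest, 1 = activate),
# horizon T, and a budget of an alpha fraction of arms activated per step.
# y[t, s, a] = expected fraction of arms in state s taking action a at time t.

import numpy as np
from scipy.optimize import linprog

def solve_lp_relaxation(P0, P1, r, mu0, alpha, T):
    """P0, P1: (S, S) transition matrices under rest / activation.
    r: (S, 2) rewards r[s, a]; mu0: (S,) initial distribution of one arm.
    Returns the optimal occupation measure y with shape (T, S, 2)."""
    S = len(mu0)
    P = [np.asarray(P0), np.asarray(P1)]
    n_var = T * S * 2
    idx = lambda t, s, a: (t * S + s) * 2 + a  # flat index for y[t, s, a]

    # Objective: maximize total expected reward <=> minimize its negative.
    c = np.zeros(n_var)
    for t in range(T):
        for s in range(S):
            for a in (0, 1):
                c[idx(t, s, a)] = -r[s, a]

    A_eq, b_eq = [], []

    # Initial condition: sum_a y[0, s, a] = mu0[s].
    for s in range(S):
        row = np.zeros(n_var)
        row[idx(0, s, 0)] = row[idx(0, s, 1)] = 1.0
        A_eq.append(row); b_eq.append(mu0[s])

    # Flow conservation: sum_a y[t+1, s', a] = sum_{s, a} y[t, s, a] * P^a[s, s'].
    for t in range(T - 1):
        for s2 in range(S):
            row = np.zeros(n_var)
            row[idx(t + 1, s2, 0)] = row[idx(t + 1, s2, 1)] = 1.0
            for s in range(S):
                for a in (0, 1):
                    row[idx(t, s, a)] -= P[a][s, s2]
            A_eq.append(row); b_eq.append(0.0)

    # Budget (assumed here to hold with equality): expected activated fraction = alpha.
    for t in range(T):
        row = np.zeros(n_var)
        for s in range(S):
            row[idx(t, s, 1)] = 1.0
        A_eq.append(row); b_eq.append(alpha)

    res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0, None)] * n_var, method="highs")
    return res.x.reshape(T, S, 2)
```

In the paper, a policy such as LP-filling is then built from the optimal occupation measure of a relaxation of this kind; the precise construction is given in the full text.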
Main file: LP_paper.pdf (555.5 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03262307 , version 1 (18-06-2021)
hal-03262307 , version 2 (11-05-2022)
hal-03262307 , version 3 (13-09-2022)
hal-03262307 , version 4 (21-12-2023)

Identifiers

Cite

Nicolas Gast, Bruno Gaujal, Chen Yan. (Close to) Optimal Policies for Finite Horizon Restless Bandits. 2021. ⟨hal-03262307v1⟩