1. Time
September 26, 2025 (Friday), 15:30
2. Format
Lecture Hall 101, Economics and Management Building (经管楼)
3. Speaker
Xianyi Wu (吴贤毅), Professor of Statistics, East China Normal University
4. Title
An introduction to Reinforcement Learning and Empirical Gittins index strategies with ε-explorations
5. Abstract
This talk consists of two parts. The first is a concise introduction to reinforcement learning (RL). The second concerns Gittins index strategies in a special class of RL models, the Markovian multi-armed bandit (MAB) model. For MABs with Markovian rewards, the optimal policy pulls the arm with the highest Gittins index. When the distributions are unknown, an empirical ε-Gittins index policy is proposed that combines ε-exploration with empirical Gittins indices computed via the Largest-Remaining-Index algorithm on estimated models. The theoretical results include the convergence of the empirical indices to the true indices, and the convergence of the expected discounted returns of the empirical ε-Gittins policy to those of the oracle Gittins policy.
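For readers unfamiliar with the ingredients named in the abstract, the following is a minimal Python sketch of how they might fit together. It is not the speaker's implementation. It assumes finite-state arms, a discount factor `beta` in [0, 1), the index normalized as expected discounted reward per unit of expected discounted time, and a largest-index-first recursion in the style of Varaiya, Walrand and Buyukkoc standing in for the Largest-Remaining-Index algorithm. The function names (`solve_discounted`, `gittins_indices`, `estimate_model`, `choose_arm`), the uniform random exploration step, and the treatment of unvisited states are all illustrative assumptions.

```python
import random


def solve_discounted(Q, c, beta, tol=1e-12, max_iter=100_000):
    """Solve x = c + beta * Q x by fixed-point iteration (needs 0 <= beta < 1)."""
    n = len(c)
    x = [0.0] * n
    for _ in range(max_iter):
        new = [c[i] + beta * sum(Q[i][j] * x[j] for j in range(n))
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x


def gittins_indices(P, r, beta):
    """Gittins indices of one arm with transition matrix P and rewards r.

    States are ranked from the largest index downwards. At each step the
    chain is only allowed to continue through states already ranked.
    """
    n = len(r)
    index = [None] * n
    ranked = set()
    while len(ranked) < n:
        # Keep only transitions into already-ranked (continuation) states.
        Q = [[P[i][j] if j in ranked else 0.0 for j in range(n)]
             for i in range(n)]
        d = solve_discounted(Q, r, beta)          # discounted reward until stopping
        b = solve_discounted(Q, [1.0] * n, beta)  # discounted time until stopping
        best = max((i for i in range(n) if i not in ranked),
                   key=lambda i: d[i] / b[i])
        index[best] = d[best] / b[best]
        ranked.add(best)
    return index


def estimate_model(transitions, n_states):
    """Empirical (P_hat, r_hat) from observed (state, reward, next_state) triples.

    Assumption: an unvisited state gets a uniform transition row and reward 0.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    reward_sum = [0.0] * n_states
    visits = [0] * n_states
    for s, rew, s_next in transitions:
        counts[s][s_next] += 1
        reward_sum[s] += rew
        visits[s] += 1
    P_hat = [[counts[s][j] / visits[s] if visits[s] else 1.0 / n_states
              for j in range(n_states)] for s in range(n_states)]
    r_hat = [reward_sum[s] / visits[s] if visits[s] else 0.0
             for s in range(n_states)]
    return P_hat, r_hat


def choose_arm(arm_indices, arm_states, eps, rng=random):
    """With probability eps pull a uniformly random arm (assumed exploration
    rule); otherwise pull the arm whose current state has the largest index."""
    if rng.random() < eps:
        return rng.randrange(len(arm_states))
    return max(range(len(arm_states)),
               key=lambda a: arm_indices[a][arm_states[a]])
```

In use, one would re-estimate each arm's model from its observed transitions, recompute the empirical indices with `gittins_indices(P_hat, r_hat, beta)`, and then call `choose_arm` with the arms' current states.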
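6. About the Speaker
Xianyi Wu (吴贤毅) is a professor and doctoral supervisor at the School of Statistics, East China Normal University. Professor Wu's research and teaching cover statistics, machine learning and artificial intelligence, non-life actuarial science, and stochastic scheduling. Professor Wu has published more than 80 papers in mainstream international journals, as well as two monographs and one textbook with publishers in China and abroad, on topics including variable selection, multiple comparisons, reinforcement learning, stochastic reserving, and long-term care insurance for the elderly.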