Reinforcement Learning

The course is not on the list. Without time-table.

Code	Completion	Credits	Range	Language
B4M36PSU	Z,ZK	6	2P+2C	Czech

Course guarantor:

Lecturer:

Tutor:

Supervisor:

Department of Computer Science

Synopsis:

Requirements:

Syllabus of lectures:

1. Motivation (successes, AGI, human feedback, history)

2. Multi-armed bandit problems (stochastic, contextual)

3. Solving MDPs 1: (Bellman equations, Value iteration)

4. Solving MDPs 2: (Contraction, Policy iteration)

5. Temporal difference learning 1: (TD(0), Sarsa, Q-learning)

6. Temporal difference learning 2: (n-step, Double-Q, DQN)

7. Policy gradient methods 1: (Tabular)

8. Policy gradient methods 2: (Variance reduction, Neural)

9. Combining learning and planning (AlphaZero, MuZero)

10. Exploration in RL

11. Multi-agent RL (cooperative vs. adversarial)

12. Applications: Advertising, RLHF, Robotics

13. Neuroscience and RL

Syllabus of tutorials:

Study Objective:

Study materials:

Online lecture notes (not slides) will be available as the primary material.

Recommended literature:

Reinforcement Learning: An Introduction, second edition, Richard S. Sutton, Andrew G. Barto, 2018.

Deep Reinforcement Learning Hands-On: A practical and easy-to-follow guide to RL from Q-learning and DQNs to PPO and RLHF, Maxim Lapan, 2020.

Reinforcement Learning and Stochastic Optimization: A Unified Framework for Sequential Decisions, Warren B. Powell, 2022.

Note:

Further information:

No time-table has been prepared for this course

The course is a part of the following study plans: