Reinforcement Learning
| Code | Completion | Credits | Range | Language |
|---|---|---|---|---|
| B4M36PSU | Z,ZK | 6 | 2P+2C | Czech |
- Course guarantor:
- Lecturer:
- Tutor:
- Supervisor:
- Department of Computer Science
- Synopsis:
- Requirements:
- Syllabus of lectures:
1. Motivation (successes, AGI, human feedback, history)
2. Multi-armed bandit problems (stochastic, contextual)
3. Solving MDPs 1: (Bellman equations, Value iteration)
4. Solving MDPs 2: (Contraction, Policy iteration)
5. Temporal difference learning 1: (TD(0), Sarsa, Q-learning)
6. Temporal difference learning 2: (n-step, Double-Q, DQN)
7. Policy gradient methods 1: (Tabular)
8. Policy gradient methods 2: (Variance reduction, Neural)
9. Combining learning and planning (AlphaZero, MuZero)
10. Exploration in RL
11. Multi-agent RL (cooperative vs. adversarial)
12. Applications: Advertising, RLHF, Robotics
13. Neuroscience and RL
- Syllabus of tutorials:
- Study Objective:
- Study materials:
Online lecture notes (not slides) will be available as the primary material.
Recommended literature:
Reinforcement Learning: An Introduction, second edition, Richard S. Sutton, Andrew G. Barto, 2018.
Deep Reinforcement Learning Hands-On: A practical and easy-to-follow guide to RL from Q-learning and DQNs to PPO and RLHF, Maxim Lapan, 2020.
Reinforcement Learning and Stochastic Optimization: A Unified Framework for Sequential Decisions, Warren B. Powell, 2022.
- Note:
- Further information:
- No timetable has been prepared for this course
- The course is a part of the following study plans:
- Open Informatics - Artificial Intelligence (PS)
- Open Informatics - Computer Vision (compulsory elective course)
- Open Informatics - Data Science (elective course)