CZECH TECHNICAL UNIVERSITY IN PRAGUE
STUDY PLANS
2019/2020

Data Mining

Code: BI-VZD
Completion: Z,ZK
Credits: 4
Range: 2P+2C
Language: Czech
Lecturer:
Daniel Vašata, Karel Klouda
Tutor:
Daniel Vašata, Klára Hájková, Karel Klouda
Supervisor:
Department of Applied Mathematics
Synopsis:

Students are introduced to the basic methods of discovering knowledge in data. In particular, they learn the basic techniques of data preprocessing, multidimensional data visualization, statistical techniques of data transformation, and the fundamental principles of knowledge discovery methods. Students will understand the relationship between model bias and variance and know the fundamentals of assessing model quality. Data mining software is used extensively in the module. Students will be able to apply basic data mining tools to common problems (classification, regression, clustering).

Requirements:

Knowledge of calculus, linear algebra, and probability theory is assumed.

Syllabus of lectures:

1. Introduction to the field and applications

2. Decision trees; training, validation, and test sets

3. Ensemble methods (random forest, AdaBoost)

4. Hierarchical clustering, k-means algorithm

5. kNN (k-nearest neighbours)

6. Naive Bayes

7. Linear regression

8. Logistic regression

9. Ridge regression and regularisation

10. Dimensionality reduction

11. Neural networks

12. Natural language processing

Syllabus of tutorials:

1. Jupyter notebooks and machine learning packages

2. Decision trees, hyperparameter tuning

3. Ensemble methods (random forest, AdaBoost)

4. Hierarchical clustering, k-means algorithm

5. kNN (k-nearest neighbours), cross-validation

6. Naive Bayes classifier

7. Linear regression

8. Logistic regression

9. Ridge regression

10. Dimensionality reduction

11. Neural networks

12. Natural language processing

Study Objective:

The module aims to introduce students to a rapidly developing field: knowledge discovery in data.

Study materials:

1. Data Mining: Practical Machine Learning Tools and Techniques, I. H. Witten, E. Frank, M. A. Hall, Elsevier, 2011, ISBN 978-0080890364.

2. Deep Learning, I. Goodfellow, Y. Bengio, A. Courville, MIT Press, 2016, ISBN 978-0262035613.

3. Machine Learning: A Probabilistic Perspective, K. P. Murphy, MIT Press, 2012, ISBN 978-0262018029.

Further information:
https://courses.fit.cvut.cz/BI-VZD/
Time-table for winter semester 2019/2020:
Thu
14:30–16:00, lecture parallel 1, room T9:107, Dejvice, lecture hall (Posluchárna), Klouda K., Vašata D.
16:15–17:45, lecture parallel 1, parallel no. 101, room T9:348, Dejvice, NBFIT PC classroom, Hájková K.
18:00–19:30, lecture parallel 1, parallel no. 102, room T9:348, Dejvice, NBFIT PC classroom, Hájková K.
16:15–17:45, lecture parallel 1, parallel no. 103, room T9:350, Dejvice, NBFIT PC classroom, Klouda K., Vašata D.
Fri
09:15–10:45, lecture parallel 1, parallel no. 104, room TH:A-1142, Thákurova 7 (FSv, building A), Apple lab, Klouda K., Vašata D.
Time-table for summer semester 2019/2020:
Time-table is not available yet
The course is a part of the following study plans:
Data valid to 2019-12-09
For updated information see http://bilakniha.cvut.cz/en/predmet1126006.html