CZECH TECHNICAL UNIVERSITY IN PRAGUE
STUDY PLANS
2019/2020

Practical Data Mining

The course is not on the list. Without time-table.
Code: MI-PDM
Completion: Z,ZK
Credits: 5
Range: 2P+1C
Language: Czech
Lecturer:
Tutor:
Supervisor:
Department of Applied Mathematics
Synopsis:

Students are introduced to the basic methods of discovering knowledge in data. In particular, they learn the basic techniques of data preprocessing, data visualization, and statistical data transformation, as well as the fundamental principles of knowledge discovery methods. Students will understand the relationship between model bias and variance and know the fundamentals of assessing model quality. Data mining software is used extensively throughout the course. Students will be able to apply basic data mining tools to common problems (classification, regression, clustering).

Requirements:

Fundamentals of algebra, statistics, and programming.

Syllabus of lectures:

1) Introduction and motivation

2) Decision trees

3) Clustering (K-means, hierarchical clustering)

4) K-NN

5) Naive Bayes

6) Linear regression

7) Logistic regression

8) Dimensionality reduction (SVD, PCA)

9) NLP (natural language processing)

Up to four lectures will be given by external speakers from industry.

Syllabus of tutorials:

1) Jupyter Notebook and the pandas, NumPy, and scikit-learn packages

2) Data visualization

3) Decision trees

4) Clustering

5) Linear regression

6) PCA

Study Objective:
Study materials:

1. Larose, D. T. Discovering Knowledge in Data: An Introduction to Data Mining. Wiley-Interscience, 2004.

2. Hastie, T., Tibshirani, R., Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2011.

Note:
Further information:
https://courses.fit.cvut.cz/MI-PDM/
No time-table has been prepared for this course
The course is a part of the following study plans:
Data valid to 2019-10-18
For updated information see http://bilakniha.cvut.cz/en/predmet5720106.html