Practical Data Mining

The course is not on the list. Without time-table.

Code	Completion	Credits	Range	Language
MI-PDM	Z,ZK	5	2P+1C	Czech

Course guarantor:

Lecturer:

Tutor:

Supervisor:

Department of Applied Mathematics

Synopsis:

Students are introduced to the basic methods of discovering knowledge in data. In particular, they learn the basic techniques of data preprocessing, data visualization, and statistical data transformation, as well as the fundamental principles of knowledge discovery methods. Students will understand the relationship between model bias and variance and know the fundamentals of assessing model quality. Data mining software is used extensively in the module. Students will be able to apply basic data mining tools to common problems (classification, regression, clustering).

Requirements:

Fundamentals of algebra, statistics, and programming

Syllabus of lectures:

1) Introduction and motivation

2) Decision trees

3) Clustering (K-means, hierarchical clustering)

4) K-NN

5) Naive Bayes

6) Linear regression

7) Logistic regression

8) Dimensionality reduction (SVD, PCA)

9) NLP (natural language processing)

Up to four lectures will be given by external speakers from industry.

Syllabus of tutorials:

1) Jupyter Notebook and the pandas, NumPy, and scikit-learn packages

2) Data visualization

3) Decision trees

4) Clustering

5) Linear regression

6) PCA

Study Objective:

Study materials:

1. Larose, D. T. Discovering Knowledge in Data: An Introduction to Data Mining. Wiley-Interscience, 2004.

2. Hastie, T., Tibshirani, R., Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2011.

Note:

Further information:

https://courses.fit.cvut.cz/MI-PDM/

No time-table has been prepared for this course

The course is a part of the following study plans:

Master branch Web and Software Engineering, spec. Info. Systems and Management, in Czech, 2016-2019 (compulsory course of the branch)