Practical Data Mining
Code | Completion | Credits | Range | Language |
---|---|---|---|---|
MI-PDM | Z,ZK | 5 | 2P+1C | Czech |
- Course guarantor:
- Lecturer:
- Tutor:
- Supervisor:
- Department of Applied Mathematics
- Synopsis:
- Students are introduced to the basic methods of knowledge discovery in data. In particular, they learn the basic techniques of data preprocessing, data visualization, and statistical data transformation, together with the fundamental principles of knowledge discovery methods. Students will understand the relationship between model bias and variance and know the fundamentals of assessing model quality. Data mining software is used extensively in the module. Students will be able to apply basic data mining tools to common problems (classification, regression, clustering).
- Requirements:
- Fundamentals of algebra, statistics, and programming
- Syllabus of lectures:
1) Introduction and motivation
2) Decision trees
3) Clustering (K-means, hierarchical clustering)
4) K-NN
5) Naive Bayes
6) Linear regression
7) Logistic regression
8) Dimensionality reduction (SVD, PCA)
9) NLP (natural language processing)
Up to four lectures will be given by external speakers from industry.
- Syllabus of tutorials:
1) Jupyter Notebook and the pandas, numpy, and scikit-learn packages
2) Data visualization
3) Decision trees
4) Clustering
5) Linear regression
6) PCA
- Study Objective:
- Study materials:
1. Larose, D. T. Discovering Knowledge in Data: An Introduction to Data Mining. Wiley-Interscience, 2004.
2. Hastie, T., Tibshirani, R., Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2011.
- Note:
- Further information:
- https://courses.fit.cvut.cz/MI-PDM/
- No timetable has been prepared for this course.
- The course is a part of the following study plans:
- Master branch Web and Software Engineering, specialization Information Systems and Management, in Czech, 2016-2019 (compulsory course of the branch)