Machine Learning and Data Analysis
Code  Completion  Credits  Range  Language 

M33SAD  Z,ZK  6  2P+2C  Czech 
 Lecturer:
 Tutor:
 Supervisor:
 Department of Cybernetics
 Synopsis:

The course explains machine learning methods that help gain insight into data by automatically discovering interpretable data models, such as graph-based and rule-based models. The course also addresses a theoretical framework explaining why and when the presented algorithms can in principle be expected to work.
The lectures are given in English.
 Requirements:

Topics contained in course A4B33RPZ.
For details see http://cw.felk.cvut.cz/doku.php/courses/m33sad/start
 Syllabus of lectures:

1. Course introduction. Cluster analysis: foundations (k-means, hierarchical and EM clustering).
2. Cluster analysis: advanced methods (spectral clustering).
3. Cluster analysis: special methods (conceptual and semi-supervised clustering, co-clustering).
4. Frequent itemset mining: the Apriori algorithm, association rules.
5. Frequent sequence mining. Episode rules. Sequence models.
6. Frequent subtrees and subgraphs.
7. Dimensionality reduction.
8. Computational learning theory: introduction, PAC learning.
9. Computational learning theory (cont'd).
10. PAC-learning logic forms.
11. Learning in predicate logic.
12. Infinite Concept Spaces.
13. Empirical testing of hypotheses.
14. Wrapping up (if 14 lectures).
 Syllabus of tutorials:

1. Entry test (prerequisite course RPZ). SW tools for machine learning (RapidMiner, WEKA).
2. Data preprocessing, missing and outlying values, clustering.
3. Hierarchical clustering, principal component analysis.
4. Spectral clustering.
5. Frequent itemset mining, association rules.
6. Frequent sequence/subgraph mining.
7. Test (first half of the course). Learning curves.
8. Underfitting and overfitting, ensemble classification, error estimates, cross-validation.
9. Model selection and assessment, ROC analysis.
10. Project work.
11. Project work.
12. Inductive logic programming: the Aleph system.
13. Statistical relational learning: the Alchemy system.
14. Credits.
 Study Objective:

Learn the principles of selected methods of data analysis and classifier learning, and elements of learning theory.
 Study materials:

T. Mitchell: Machine Learning, McGraw Hill, 1997
P. Langley: Elements of Machine Learning, Morgan Kaufmann, 1996
T. Hastie et al.: The Elements of Statistical Learning, Springer, 2001
 Note:
 Further information:
 http://cw.felk.cvut.cz/doku.php/courses/a4m33sad/start
 No timetable has been prepared for this course
 The course is a part of the following study plans: