Knowledge Mining and Exploitation
Code | Completion | Credits | Range | Language |
---|---|---|---|---|
X33ZDV | Z,ZK | 4 | 2+2s | Czech |
- Lecturer:
- Filip Železný (gar.), Jiří Kléma
- Tutor:
- Filip Železný (gar.), Matěj Holec, Jan Hrdlička, Jiří Kléma, Ondřej Kuželka, Andrea Szabóová
- Supervisor:
- Department of Cybernetics
- Synopsis:
-
The course concentrates on methods of machine learning, data mining and knowledge representation. The techniques will mainly be demonstrated in bioinformatics applications.
- Requirements:
-
The compulsory course X33RZO, which runs in parallel with X33ZDV, is a co-requisite. To obtain the class credit, students must attend as specified by the general study regulations and submit all four assignments. A student may earn at most 50 points in the labs (see the labs description) and a further 50 points at the exam. The grading is as follows:
80-100: 1
70-79: 2
60-69: 3
0-59: 4
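The grading scale above can be sketched as a small function. This is only an illustration: the name `final_grade` is hypothetical, and the sketch assumes the final score is the plain sum of lab points and exam points, which the text implies but does not state outright.

```python
def final_grade(lab_points: int, exam_points: int) -> int:
    """Map lab and exam points to the 1-4 grade scale listed above.

    Assumption: the total is the simple sum of both parts.
    """
    if not 0 <= lab_points <= 50 or not 0 <= exam_points <= 50:
        raise ValueError("labs and exam are each worth 0 to 50 points")
    total = lab_points + exam_points
    if total >= 80:
        return 1
    if total >= 70:
        return 2
    if total >= 60:
        return 3
    return 4
```

For example, 42 lab points and 35 exam points give a total of 77, so `final_grade(42, 35)` returns grade 2.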
- Syllabus of lectures:
-
Lectures
1. Introduction to machine learning and data mining
2. Graphical probabilistic models
3. Graphical probabilistic models II
4. Markov models and grammars
5. Probabilistic relational models
6. Hierarchical clustering, association discovery, the Apriori algorithm, frequent subgraph mining
7. Association discovery in relational data
8. Classification and PAC-learnability
9. Decision rules and trees (PAC-learnability and heuristic approaches)
10. Model assessment and validation techniques
11. Relational decision rules, inductive logic programming
12. Relational decision trees
13. Text and web mining
14. Semantic web and ontologies
- Syllabus of tutorials:
-
1. Introduction, organization, relevant Matlab software demos (graphical models, hierarchical clustering)
2. The Weka data mining tool demo
3. The Aleph tool for inductive logic programming demo
4. Assignment 1
5. Individual work on assignment 1
6. Presentation of completed assignment 1, assignment 2
7. Individual work on assignment 2
8. Mid-term test [10 p]
9. Individual work on assignment 2
10. Presentation of completed assignment 2, assignment 3
11. Individual work on assignment 3
12. Presentation of completed assignment 3, assignment 4
13. Individual work on assignment 4
14. Presentation of completed assignment 4, credits
Assignment 1: Matlab - learning of, or inference from, a Markov model or a Bayesian network [10 p]
Assignment 2: Implementation of one of the following: the Apriori algorithm, hierarchical clustering, or frequent subgraph discovery [10 p]
Assignment 3: Weka - an experiment design for model selection [10 p]
Assignment 4: Aleph - an ILP experiment design [10 p]
Late submissions: a 3-point deduction for a delay of up to 1 week; for longer delays, the maximum score is 1 point.
- Study Objective:
- Study materials:
-
- T. Mitchell: Machine Learning, McGraw Hill, 1997.
- Hastie T., Tibshirani R., Friedman J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer 2001
- Dzeroski S., Lavrac N.: Relational Data Mining, Springer 2001
- Note:
- Further information:
- No time-table has been prepared for this course
- The course is a part of the following study plans:
-
- Biomedical Engineering - structured studies (compulsory elective course)