CZECH TECHNICAL UNIVERSITY IN PRAGUE
STUDY PLANS
2011/2012

Knowledge Mining and Exploitation

The course is not on the list. Without time-table.
Code: X33ZDV
Completion: Z,ZK
Credits: 4
Range: 2+2s
Language: Czech
Lecturer:
Filip Železný (gar.), Jiří Kléma
Tutor:
Filip Železný (gar.), Matěj Holec, Jan Hrdlička, Jiří Kléma, Ondřej Kuželka, Andrea Szabóová
Supervisor:
Department of Cybernetics
Synopsis:

The course concentrates on methods of machine learning, data mining and knowledge representation. The techniques will mainly be demonstrated in bioinformatics applications.

Requirements:

The obligatory class X33RZO, which runs in parallel with X33ZDV, is a co-requisite for the latter. Class credit requires attendance as specified by the general study regulations and submission of all four assignments. A student may earn at most 50 points in the labs (see the labs description) and a further 50 points in the exam. Grades are assigned from the point total as follows:

80-100: 1

70-79: 2

60-69: 3

0-59: 4
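
The grading scale above can be sketched in a few lines of Python. This is only an illustration: the function name `final_grade` and its parameters are hypothetical, and it assumes that the lab and exam points are whole numbers that are simply added together.

```python
def final_grade(lab_points: int, exam_points: int) -> int:
    """Map lab and exam points to a grade (1 = best, 4 = fail).

    Assumption: the two components are summed with equal weight,
    each capped at 50 points as stated in the requirements.
    """
    if not 0 <= lab_points <= 50:
        raise ValueError("lab points must be between 0 and 50")
    if not 0 <= exam_points <= 50:
        raise ValueError("exam points must be between 0 and 50")

    total = lab_points + exam_points
    if total >= 80:
        return 1
    if total >= 70:
        return 2
    if total >= 60:
        return 3
    return 4


if __name__ == "__main__":
    # 42 lab points plus 31 exam points give a total of 73.
    print(final_grade(42, 31))
```

For example, a total of 73 points falls in the 70-79 band, so the sketch returns grade 2.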

Syllabus of lectures:

1. Introduction to machine learning and data mining

2. Graphical probabilistic models

3. Graphical probabilistic models II

4. Markov models and grammars

5. Probabilistic relational models

6. Hierarchical clustering, association discovery, the Apriori algorithm, frequent subgraph mining

7. Association discovery in relational data

8. Classification and PAC-learnability

9. Decision rules and trees (PAC learnability and heuristic approaches)

10. Model assessment and validation techniques

11. Relational decision rules, inductive logic programming

12. Relational decision trees

13. Text and web mining

14. Semantic web and ontologies

Syllabus of tutorials:

1. Introduction, organization, relevant Matlab software demos (graphical models, hierarchical clustering)

2. The Weka data mining tool demo

3. The Aleph tool for inductive logic programming demo

4. Assignment 1

5. Individual work on assignment 1

6. Presentation of completed assignment 1, assignment 2

7. Individual work on assignment 2

8. Mid-term test [10 p]

9. Individual work on assignment 2

10. Presentation of completed assignment 2, assignment 3

11. Individual work on assignment 3

12. Presentation of completed assignment 3, assignment 4

13. Individual work on assignment 4

14. Presentation of completed assignment 4, credits

Assignment 1: Matlab - learning of, or inference from, a Markov model or a Bayesian network [10 p]

Assignment 2: Implementation of one of the following: the Apriori algorithm, hierarchical clustering, frequent subgraph discovery [10 p]

Assignment 3: Weka - an experiment design for model selection [10 p]

Assignment 4: Aleph - an ILP experiment design [10 p]

Late submissions: 3 points are deducted for a delay of up to 1 week; for longer delays the maximum score is 1 point.
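
The late-submission rule can be read as a small function. The sketch below is one possible interpretation, not the official rule: the name `late_adjusted_score` is hypothetical, and it assumes that "1 week" means up to and including 7 days, that a score never drops below zero, and that each assignment is worth at most 10 points.

```python
def late_adjusted_score(raw_points: int, days_late: int) -> int:
    """Apply the late-submission penalty to one assignment score.

    Assumptions (not stated in the syllabus): a delay of 1..7 days
    counts as "up to 1 week", and the score is floored at zero.
    """
    if not 0 <= raw_points <= 10:
        raise ValueError("an assignment is worth between 0 and 10 points")

    if days_late <= 0:
        # Submitted on time: no penalty.
        return raw_points
    if days_late <= 7:
        # Up to one week late: deduct 3 points, never below zero.
        return max(0, raw_points - 3)
    # More than one week late: the score is capped at 1 point.
    return min(raw_points, 1)


if __name__ == "__main__":
    print(late_adjusted_score(10, 3))   # within the first week
    print(late_adjusted_score(10, 8))   # more than a week late
```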

Study Objective:
Study materials:

- Mitchell T.: Machine Learning. McGraw Hill, 1997.

- Hastie T., Tibshirani R., Friedman J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2001.

- Dzeroski S., Lavrac N.: Relational Data Mining. Springer, 2001.

Note:
Further information:
No time-table has been prepared for this course
The course is a part of the following study plans:
Generated on 2012-7-9
For updated information see http://bilakniha.cvut.cz/en/predmet11629104.html