- Marcel Jiřina (guarantor)
- Daniel Vašata
- Department of Applied Mathematics
Students learn to prepare raw data for further processing and analysis. They learn which algorithms can be used to extract parameters from various data sources, such as images, texts, and time series, and acquire the skills to apply these theoretical concepts to a specific problem in individual projects, e.g., parameter extraction from image data or from the Internet.
Fundamentals of statistics, FCD course in data mining.
The recommended prerequisite is BIE-VZD.
- Syllabus of lectures:
1. Data exploration, exploratory analysis techniques, visualization of raw data.
2. Descriptive statistics.
3. Methods to determine the relevance of features.
4. Problems with data: dimensionality, noise, outliers, inconsistency, missing values, non-numeric data.
5. Data cleaning, transformation, imputing, discretization, binning.
6. Reduction of data dimension.
7. Reduction of data volume, class balancing.
8. Feature extraction from text.
9. Feature extraction from documents, web. Preprocessing of structured data.
10. Feature extraction from time series.
11. Feature extraction from images.
12. Data preparation case studies.
13. Automation of data preprocessing.
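As an illustration of lecture topics 4 and 5 (missing values, imputing, binning), the following is a minimal Python sketch. It is not taken from the course materials: the function names `impute_median` and `equal_width_bins` and the sample data are hypothetical, and median imputation with equal-width binning is only one of several possible choices.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value the index (0..n_bins-1) of its equal-width bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into one bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical raw measurements with two missing entries.
raw = [2.0, None, 4.0, 10.0, None, 6.0]
cleaned = impute_median(raw)
bins = equal_width_bins(cleaned, 4)
print(cleaned)  # [2.0, 5.0, 4.0, 10.0, 5.0, 6.0]
print(bins)     # [0, 1, 1, 3, 1, 2]
```

The median is used here because it is less sensitive to outliers than the mean. Equal-width bins are simple but can be unbalanced on skewed data, in which case equal-frequency binning may be preferable.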
- Syllabus of tutorials:
1. Assignment of course projects.
3. Presentation of course projects.
- Study Objective:
Data preprocessing is crucial for successful data processing and takes a lot of time, usually more than the data processing itself. Knowledge of algorithms for extracting parameters from various data sources is a fundamental part of knowledge engineering.
- Study materials:
1. Pyle, D. ''Data Preparation for Data Mining''. Morgan Kaufmann, 1999. ISBN 1558605290.
2. Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L. A. ''Feature Extraction: Foundations and Applications (Studies in Fuzziness and Soft Computing)''. Springer, 2006. ISBN 3540354875.
- Further information:
- Time-table for winter semester 2019/2020:
Thákurova 7 (FSv, building A)
NBFIT PC classroom
- Time-table for summer semester 2019/2020:
- Time-table is not available yet
- The course is a part of the following study plans:
- Knowledge Engineering, Presented in Czech, Version 2016 and 2017 (compulsory course of the specialization)
- Computer Security, Presented in Czech, Version 2016 to 2019 (elective course)
- Computer Systems and Networks, Presented in Czech, Version 2016 to 2019 (elective course)
- Design and Programming of Embedded Systems, in Czech, Version 2016 to 2019 (elective course)
- Specialization Web and Software Engineering, in Czech, Version 2016 to 2019 (elective course)
- Specialization Software Engineering, in Czech, Version 2016 to 2019 (elective course)
- Specialization Web Engineering, Presented in Czech, Version 2016 to 2019 (elective course)
- Master Informatics, Presented in Czech, Version 2016 to 2019 (VO)
- Specialization System Programming, Presented in Czech, Version 2016 to 2019 (elective course)
- Specialization Computer Science, Presented in Czech, Version 2016-2017 (elective course)
- Knowledge Engineering, Presented in Czech, Version 2018 to 2019 (compulsory course of the specialization)