Data Preprocessing
Code | Completion | Credits | Scope | Language of instruction |
---|---|---|---|---|
MIE-PDD | Z,ZK | 4 | 2+1 |
- Lecturer:
- Pavel Kordík (guarantor)
- Tutor:
- Pavel Kordík (guarantor)
- Course guaranteed by:
- Department of Theoretical Computer Science
- Synopsis:
- Students learn to prepare raw data for further processing and analysis. They learn which algorithms can be used to extract parameters from various data sources, such as images, texts, and time series, and acquire the skills to apply these theoretical concepts to a specific problem in individual projects, e.g. parameter extraction from image data or from the Internet.
- Requirements:
- Fundamentals of statistics, FCD course in data mining.
- Lecture syllabus:
1. Data exploration, exploratory analysis techniques, visualization of raw data.
2. Descriptive statistics.
3. Methods to determine the relevance of features.
4. Problems with data: dimensionality, noise, outliers, inconsistency, missing values, non-numeric data.
5. Data cleaning, transformation, imputing, discretization, binning.
6. Reduction of data dimension.
7. Reduction of data volume, class balancing.
8. Feature extraction from text.
9. Feature extraction from documents, web. Preprocessing of structured data.
10. Feature extraction from time series.
11. Feature extraction from images.
12. Data preparation case studies.
13. Automation of data preprocessing.
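As an informal illustration of two topics from the syllabus above (imputing missing values and binning, item 5), the following minimal sketch uses only the Python standard library. It is not taken from the course materials: the function names `impute_median` and `equal_width_bins`, the use of `None` to mark missing values, and the sample data are all assumptions made for this example.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins):
    """Map each value to the index (0..n_bins-1) of its equal-width bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical sample with two missing measurements.
raw = [2.0, None, 4.0, 10.0, None, 6.0]
cleaned = impute_median(raw)
bins = equal_width_bins(cleaned, 4)
print(cleaned)  # [2.0, 5.0, 4.0, 10.0, 5.0, 6.0]
print(bins)     # [0, 1, 1, 3, 1, 2]
```

Median imputation is chosen here only because it is simple and less sensitive to outliers than the mean; other strategies (mean, model-based imputation, equal-frequency binning) may suit a given data set better.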
- Exercise syllabus:
1. Assignment of course projects.
2. Consultations.
3. Presentation of course projects.
- Study objectives:
- Data preprocessing is crucial for successful data processing and is time-consuming, usually taking more time than the data processing itself. Knowledge of algorithms for extracting parameters from various data sources is a fundamental part of knowledge engineering.
- Study materials:
1. Pyle, D. "Data Preparation for Data Mining". Morgan Kaufmann, 1999. ISBN 1558605290.
2. Guyon, I., Gunn, S., Nikravesh, M., Zadeh, L. A. "Feature Extraction: Foundations and Applications (Studies in Fuzziness and Soft Computing)". Springer, 2006. ISBN 3540354875.
- Note:
- Scope = lectures + proseminars + exercises: 2p+1c. Lecturer: Ing. Pavel Kordík, Ph.D.
- Timetable for the winter semester 2011/2012:
- The timetable is not ready yet
- Timetable for the summer semester 2011/2012:
- The timetable is not ready yet
- The course is part of the following study plans:
- Knowledge Engineering, Presented in English, Version for Students who Enrolled in 2010 and 2011 (compulsory course of the specialization)
- Master Informatics, Presented in English - Version for Students who Enrolled in 2010 (VO)
- Master Informatics, Presented in English - Version for Students who Enrolled in 2011 (VO)
- Master Informatics, Presented in English - Version for Students who Enrolled in 2012 (VO)
- Knowledge Engineering, Presented in English - Version for Students who Enrolled in 2012 (compulsory course of the specialization)