Data processing
Code   Completion  Credits  Range  Language
14PDE  Z,ZK        6        2P+4C  English
 Course guarantor:
 Lecturer:
 Tutor:
 Supervisor:
 Department of Applied Informatics in Transportation
 Synopsis:

Students learn tools for data processing and analysis and work through practical examples of the most common data processing options, including advanced options for presenting analysis results. In the advanced methods part, students also perform a specific analysis using Bayesian networks. Students then independently analyze data from existing open systems.
 Requirements:

Ability to think logically; knowledge of the basics of algorithm design and of at least one programming language at a level appropriate to the year of study at a technical university.
 Syllabus of lectures:

Part 1 introduces data processing tools and is divided into 3 blocks:
Block 1: Introduction to R - environment, concepts, basics, simple examples, basic libraries, examples and usage (students install R)
Block 2: Applied R - applied examples from practice, map library, retrieving data from different sources (GIS, RDBMS, CSV, etc.) and modifying it
Block 3: Advanced R - interactive presentation module (shiny), other modules by agreement
Part 2 deals with a specific model for data processing, Bayesian networks, and is also divided into 3 blocks:
Block 1: Basics of Bayesian networks, specialized software for Bayesian networks, modeling, basics of graph theory and probability.
Block 2: Preparing data for subsequent use in Bayesian networks, plotting the first Bayesian network, algorithms for network learning, parameters, inference; linking with GeNIe.
Block 3: Performing inference in Bayesian networks.
 Syllabus of tutorials:

Part 1 introduces data processing tools and is divided into 3 blocks:
Block 1: Introduction to R - environment, concepts, basics, simple examples, basic libraries, examples and usage (students install R)
Block 2: Applied R - applied examples from practice, map library, retrieving data from different sources (GIS, RDBMS, CSV, etc.) and modifying it
Block 3: Advanced R - interactive presentation module (shiny), other modules by agreement
Part 2 deals with a specific model for data processing, Bayesian networks, and is also divided into 3 blocks:
Block 1: Basics of Bayesian networks, specialized software for Bayesian networks, modeling, basics of graph theory and probability.
Block 2: Preparing data for subsequent use in Bayesian networks, plotting the first Bayesian network, algorithms for network learning, parameters, inference; linking with GeNIe.
Block 3: Performing inference in Bayesian networks.
 Study Objective:

The aim of the course is primarily to familiarize students with tools for data processing and analysis and to let them try out the most common options used in data processing, including advanced options for presenting analysis results.
 Study materials:

Jan Rauch, Milan Šimůnek: Dobývání znalostí z databází, LISpMiner a GUHA. Praha: Oeconomica VŠE, 2014.
Petr Berka: Dobývání znalostí z databází. Praha: Academia, 2003.
Irena Holubová, Karel Minařík, David Novák, Jiří Kosek: Big Data a NoSQL databáze.
Arun K. Somani, Ganesh Chandra Deka: Big Data Analytics. CRC Press, 2017.
 Note:
 Further information:
 No timetable has been prepared for this course
 The course is a part of the following study plans:

 Follow-up master's PRE program IS joint degree 22/23 (new accreditation) (compulsory course)
 Follow-up master's PRE program IS in EN 23/24 (compulsory course)
 Follow-up master's PRE program IS in EN 24/25 (compulsory course)
 Follow-up master's PRE program IS joint degree 24/25 (new accreditation) (compulsory course)