Applied Data Analysis
Code | Completion | Credits | Range | Language |
---|---|---|---|---|
18AAD | Z | 3 | 1P+1C | Czech |
- Course guarantor:
- Jaromír Kukal
- Lecturer:
- Tomáš Hubínek, Jaromír Kukal, Karel Šimánek
- Tutor:
- Tomáš Hubínek, Jaromír Kukal, Karel Šimánek
- Supervisor:
- Department of Software Engineering
- Synopsis:
-
A practice-oriented course covering Big Data, neural networks, parallel computing, graph analysis, cloud technologies, deployment, and the development of software or IoT solutions.
- Requirements:
- Syllabus of lectures:
-
1. Big Data & Data Science: Use of modern tools for data acquisition, processing, and evaluation. Data storage and its use (SQL, NoSQL, object databases, full-text databases). Data storage formats and their suitability.
2. Frameworks for parallel calculations (MapReduce, Spark). Stream data processing. Commercial and free tools, examples of use.
3. Neural networks: Use of neural networks in practice. A recap of their development and the necessary theoretical background. Overview of available tools for their optimization/inference.
4. Use of GPU for acceleration and parallelization of calculations. Comparison with classical machine learning tools. Demonstration of image processing, NLP, or time series prediction.
5. Analysis of graphs: Analytical calculations over graphs on real examples. Recapitulation of graph theory and basic algorithms. PageRank, search for influencers, statistics over deep organizational structures.
6. Use of the Pregel framework and the GraphX tool. Use of graph databases and languages for efficient querying (GraphQL/Cypher). Graph visualization (e.g., in Power BI, Neo4j, etc.).
7. Data Processing and AI in the Cloud: Comparison of cloud and on-premises solutions. PaaS and IaaS. Infrastructure as Code. Analytical tools in the cloud (databases, frameworks, interfaces). Hands-on use of the cloud to solve real problems.
8. Deployment of data and AI solutions in corporate IT: Deployment of data solutions in production. Project methodologies (Scrum/Agile/Waterfall). Approaches to managing projects and teams (DevOps/DataOps).
9. DevOps Support Tools. Configuration Management – how and where to maintain code, manage versions, and track and fix defects. Release Management – automated deployment of infrastructure and code.
10. Internet of Things: IoT and its use in practice. How to build a data solution based on IoT sensors. IoT solution providers.
- Syllabus of tutorials:
- Study Objective:
-
The course aims to show students how to apply their theoretical foundations to real problems in practice and how (data) applications are developed.
- Study materials:
-
1. Hadoop: The Definitive Guide, Author: Tom White
2. Deep Learning, Authors: Ian Goodfellow, Yoshua Bengio, Aaron Courville
3. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, Author: Aurélien Géron
4. Effective DevOps: Building a Culture of Collaboration, Affinity, and Tooling at Scale, Authors: Jennifer Davis, Ryn Daniels
- Note:
- Timetable for winter semester 2024/2025:
- Timetable is not available yet
- Timetable for summer semester 2024/2025:
- Timetable is not available yet
- The course is a part of the following study plans:
-
- Applied Mathematical Stochastic Methods (Aplikované matematicko-stochastické metody; elective course)
- Applications of Informatics in Natural Sciences (Aplikace informatiky v přírodních vědách; elective course)