Data science
Code | Completion | Credits | Range |
---|---|---|---|
01DAS | KZ | 3 | 1P+2C |
- Course guarantor:
- Jiří Franc
- Lecturer:
- Jiří Franc
- Tutor:
- Jiří Franc
- Supervisor:
- Department of Mathematics
- Synopsis:
-
The practical application of mathematical modeling, statistics, and machine learning involves a wide range of tasks, from data collection and preparation to the design of an appropriate method and its division into units for development and deployment to production. Last but not least, teamwork and the management of a modern data project are crucial. The current standard tools will be presented in lectures. These procedures will then be applied in exercises, with an emphasis on team collaboration and project planning. At the end of the course, students will present their results to the other teams.
- Requirements:
- Syllabus of lectures:
-
1. Introduction to collaboration tools and tools for data project management. Effective division of work and responsibilities in a team.
2. Introduction to development environments for data preparation and statistical modeling, data profiling.
3. Application of the most widely used machine learning methods to real data (decision trees, random forests, neural networks, clustering).
4. Design and evaluation of a machine learning model. Splitting the data sample into training and test sets, cross-validation methods, error metrics in the context of the task.
5. Data preparation, handling incomplete data, feature engineering.
6. Managing extensive practical tasks, from data cleansing through model design and selection to validation and application.
7. Presentation and defense of the selected and implemented solution in front of other students.
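Item 4 of the syllabus names training/test splitting and cross-validation. As a rough illustration, k-fold cross-validation could be sketched as follows in plain Python, using only the standard library. The function names, the constant-mean baseline model, and the toy data are illustrative assumptions and are not taken from the course materials.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate_mean_model(y, k=5, seed=0):
    """Return the per-fold mean squared error of a constant (mean) predictor."""
    folds = k_fold_indices(len(y), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the i-th, which is held out for testing.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        prediction = mean(y[j] for j in train_idx)  # "fit" the baseline model
        mse = mean((y[j] - prediction) ** 2 for j in test_idx)
        scores.append(mse)
    return scores


# Hypothetical toy target values, for illustration only.
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
scores = cross_validate_mean_model(y, k=5)
print("per-fold MSE:", scores)
print("mean CV MSE:", mean(scores))
```

Each sample is used for testing exactly once, so the averaged score estimates the error on unseen data. In practice a library routine would likely replace the hand-written split, and the error metric would be chosen to suit the task.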
- Syllabus of tutorials:
- Study Objective:
- Study materials:
-
Key references:
[1] J. VanderPlas: Python Data Science Handbook: Essential Tools for Working with Data, O'Reilly (2016).
[2] G. James, D. Witten, T. Hastie, R. Tibshirani: An Introduction to Statistical Learning, Springer, 8th printing (2017).
Recommended references:
[3] T. Hastie, R. Tibshirani, J. Friedman: The Elements of Statistical Learning - Data Mining, Inference, and Prediction, Springer, 12th printing (2017).
- Note:
- Time-table for winter semester 2024/2025:
- Time-table is not available yet
- Time-table for summer semester 2024/2025:
- Time-table is not available yet
- The course is a part of the following study plans:
-
- Applied Mathematical Stochastic Methods (compulsory elective course)
- Applications of Informatics in Natural Sciences (elective course)
- Nuclear and Particle Physics (elective course)