CZECH TECHNICAL UNIVERSITY IN PRAGUE
STUDY PLANS
2023/2024

Informatics 4

The course is not on the list. Without time-table.
Code: 155GIT4
Completion: ZK (exam)
Credits: 5
Range: 3P (3 lecture hours per week)
Language: Czech
Course guarantor:
Lecturer:
Tutor:
Supervisor:
Department of Geomatics
Synopsis:

The course introduces students to techniques for handling large amounts of data. It starts with data preprocessing using command-line tools before the data is imported into a database. The focus is on relational databases, NoSQL databases, ElasticSearch, R, and the cloud.

Requirements:

Informatika 2 and Informatika 3

Syllabus of lectures:

1. Big data - evolution and basic concepts

2. Data preprocessing by command line tools

3. Data preprocessing by command line tools 2

4. Relational SQL databases - indexes, partitioning, performance tuning, ACID

5. NoSQL databases - concepts

6. NoSQL database - Apache Cassandra

7. NoSQL databases - graph databases (Neo4j), document-oriented databases

8. Cloud basics

9. Installing a NoSQL database in the cloud - hands-on redundancy, CAP theorem

10. Apache ecosystem I: Hadoop, HBase, Spark, Pig

11. ElasticSearch

12. Statistical language R

13. Statistical language R - in connection with Apache Spark

Syllabus of tutorials:

1. Big data - evolution and basic concepts

2. Data preprocessing by command line tools

3. Data preprocessing by command line tools 2

4. Relational SQL databases - indexes, partitioning, performance tuning, ACID

5. NoSQL databases - concepts

6. NoSQL database - Apache Cassandra

7. NoSQL databases - graph databases (Neo4j), document-oriented databases

8. Cloud basics

9. Installing a NoSQL database in the cloud - hands-on redundancy, CAP theorem

10. Apache ecosystem I: Hadoop, HBase, Spark, Pig

11. ElasticSearch

12. Statistical language R

13. Statistical language R - in connection with Apache Spark

Study Objective:

The aim is to familiarize students with techniques and tools for processing large amounts of data. Students will also gain a good understanding of how NoSQL databases work.

Study materials:

Apache Cassandra/Hadoop/HBase/Spark/Pig - http://www.apache.org/

Neo4j - https://neo4j.com/

ElasticSearch - https://www.elastic.co/

Language R - https://www.r-project.org

Note:
Further information:
https://geo.fsv.cvut.cz/gwiki/155YIN4_Informatika_4
No time-table has been prepared for this course
The course is a part of the following study plans:
Data valid to 2024-03-16
Updates to the above information can be found at https://bilakniha.cvut.cz/en/predmet4249206.html