CZECH TECHNICAL UNIVERSITY IN PRAGUE
STUDY PLANS
2023/2024

DB Technologies for Big Data

Code: BI-BIG.21
Completion: KZ
Credits: 5
Range: 2P+2C
Language: Czech
Course guarantor:
Monika Borkovcová, Josef Gattermayer
Lecturer:
Monika Borkovcová, Josef Gattermayer
Tutor:
Monika Borkovcová, Josef Gattermayer, Jan Matoušek
Supervisor:
Department of Software Engineering
Synopsis:

Students will be introduced to the field of Big Data processing, where non-relational (NoSQL) database engines are typically used today. The course is practically oriented: after completing it, students should be able to choose suitable tools (mostly open source) and techniques, and to design and implement a simple, reproducible method of data processing (data collection, transformation/aggregation, presentation). Students will become acquainted with various architectures for processing and storing big data. The theoretical foundation and the presentation of individual technologies will be supplemented with specific case studies.

Requirements:

Basic knowledge of relational databases and of working with the command line.

Syllabus of lectures:

1. Introduction to Big Data processing, definition of the Big Data concept, CAP theorem.

2. Case study.

3. [2] Column-oriented database engines (Cassandra).

5. Document-oriented database engines (MongoDB).

6. [2] Platforms for Big Data processing based on maintaining data in a file system (Hadoop).

8. [2] Platforms for Big Data processing based on maintaining data in main memory (Spark).

10. Indexing of unstructured and semistructured data (ElasticSearch, Solr).

11. Tools for data visualization and presentation (Kibana).

12. [2] Case studies.

Syllabus of tutorials:

1. Introduction to the laboratory environment

2. Introduction to working with Cassandra Cluster

3. Hadoop MapReduce

4. Cassandra UseCase 1 - Part 1

5. Cassandra UseCase 1 - Part 2

6. Cassandra UseCase 2 - Part 1 (Hive / Pig Use)

7. Cassandra UseCase 2 - Part 2

8. Cassandra UseCase 3 - Part 1 (Use Solr)

9. Cassandra UseCase 3 - Part 2

10. Cassandra UseCase 4 - Part 1 (Complex solution)

11. Cassandra UseCase 4 - Part 2

12. Submission of semester work, credit

13. Reserve

Study materials:

Zikopoulos, Paul, and Chris Eaton. Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data. McGraw-Hill Osborne Media, 2011.

Further information:
https://courses.fit.cvut.cz/BI-BIG/
Time-table for winter semester 2023/2024:
Mon, Tue, Wed: no entries listed
Thu:
09:15–10:45, room T9:345 (Dejvice, NBFIT BOU ucebna), Borkovcová M., Matoušek J. (lecture parallel 1, parallel nr. 101)
11:00–12:30, room T9:345 (Dejvice, NBFIT BOU ucebna), Borkovcová M., Matoušek J. (lecture parallel 1, parallel nr. 102)
12:45–14:15, room T9:345 (Dejvice, NBFIT BOU ucebna), Borkovcová M., Matoušek J. (lecture parallel 1, parallel nr. 103)
14:30–16:00, room T9:345 (Dejvice, NBFIT BOU ucebna), Borkovcová M., Matoušek J. (lecture parallel 1, parallel nr. 104)
16:15–17:45, room TH:A-s134 (Thákurova 7, FSv building, As134), Borkovcová M. (lecture parallel 1)
Fri: no entries listed
Time-table for summer semester 2023/2024:
Time-table is not available yet
The course is a part of the following study plans:
Data valid to 2023-08-30
For updates to the above information, see https://bilakniha.cvut.cz/en/predmet6608206.html