Informatics 4
Code | Completion | Credits | Range | Language |
---|---|---|---|---|
155YIN4 | Z,ZK | 4 | 2P+2C | Czech |
- Course guarantor:
- Jan Pytel
- Lecturer:
- Tutor:
- Supervisor:
- Department of Geomatics
- Synopsis:
-
This course introduces students to techniques for handling large amounts of data. It starts with preprocessing data using command-line tools before importing it into a database. The main topics are relational databases, NoSQL databases, Elasticsearch, the R language, and cloud computing.
- Requirements:
-
Informatics 2 and Informatics 3
- Syllabus of lectures:
-
1. Big data - evolution and basic concepts
2. Data preprocessing by command line tools
3. Data preprocessing by command line tools 2
4. Relational SQL databases - indexes, partitioning, performance tuning, ACID
5. NoSQL database - concepts
6. NoSQL database - Apache Cassandra
7. NoSQL database - graph databases (Neo4j), document-oriented databases
8. Cloud basics
9. Installation of a NoSQL database in the cloud - hands-on redundancy, CAP theorem
10. Apache ecosystem I: Hadoop, HBase, Spark, Pig
11. Elasticsearch
12. Statistical language R
13. Statistical language R - in connection with Apache Spark
- Syllabus of tutorials:
-
1. Big data - evolution and basic concepts
2. Data preprocessing by command line tools
3. Data preprocessing by command line tools 2
4. Relational SQL databases - indexes, partitioning, performance tuning, ACID
5. NoSQL database - concepts
6. NoSQL database - Apache Cassandra
7. NoSQL database - graph databases (Neo4j), document-oriented databases
8. Cloud basics
9. Installation of a NoSQL database in the cloud - hands-on redundancy, CAP theorem
10. Apache ecosystem I: Hadoop, HBase, Spark, Pig
11. Elasticsearch
12. Statistical language R
13. Statistical language R - in connection with Apache Spark
- Study Objective:
-
The aim is to familiarize students with techniques and tools for processing large amounts of data. Students will also gain a good understanding of how NoSQL databases work.
- Study materials:
-
Apache Cassandra/Hadoop/HBase/Spark/Pig - http://www.apache.org/
Neo4j - https://neo4j.com/
Elasticsearch - https://www.elastic.co/
Language R - https://www.r-project.org
- Note:
- Further information:
- https://geo.fsv.cvut.cz/gwiki/155YIN4_Informatika_4
- No time-table has been prepared for this course
- The course is a part of the following study plans:
-
- Geodesy and Cartography, specialization Geomatics (compulsory elective course)