CZECH TECHNICAL UNIVERSITY IN PRAGUE
STUDY PLANS
2023/2024
NOTICE: Study plans for the following academic year are available.

Web Data Mining

Code: NI-DDW
Completion: Z,ZK
Credits: 5
Range: 2P+1C
Language: Czech
Course guarantor:
Jaroslav Kuchař
Lecturer:
Jaroslav Kuchař
Tutor:
Milan Dojčinovski, Jaroslav Kuchař
Supervisor:
Department of Software Engineering
Synopsis:

Students will learn the latest methods and technologies for web data acquisition and analysis, and for utilizing the discovered knowledge. They will gain an overview of Web mining techniques for Web crawling, Web structure analysis, Web usage analysis, Web content mining, and information extraction. They will also gain an overview of the most recent developments in the social web and recommender systems.

Requirements:

Basic knowledge of Web architecture (HTTP, HTML, URI), programming skills (e.g. Java, JavaScript), graph theory, and basic algorithms.

Syllabus of lectures:

1. Key web data mining principles.

2. Web content mining approaches (formats, restrictions, ethical aspects).

3. Web content mining tools.

4. Accessing and extracting specific web content (deep web).

5. Main text mining concepts.

6. Practical applications of text mining.

7. Social network structure and content analysis (2).

8. Web graph, web structure mining.

9. Web usage mining: data collecting.

10. Web usage mining: data analysis, web analytics.

11. Recommender systems and personalization.

12. Data stream mining: algorithms and applications.

Syllabus of tutorials:

1. Basics of data acquisition and processing

2. Text preprocessing, text mining applications

3. Acquisition and analysis of graph-based data

4. User data analysis

5. Basics of recommendation systems

6. Project presentation and assessment

Study Objective:

Provide students with an overview of web mining technologies and equip them to apply a selection of these technologies in practice.

Study materials:

1. Liu, B. „Web Data Mining“, Springer-Verlag Berlin Heidelberg, 2011. ISBN 978-3-642-19459-7.

2. Charu C. Aggarwal. „Machine Learning for Text“, Springer, 2018. ISBN 9783319735313.

3. Easley, D., Kleinberg, J. „Networks, Crowds, and Markets: Reasoning About a Highly Connected World“, Cambridge

4. Russell, M. A. „Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More (3rd Edition)“, O'Reilly Media, 2019. ISBN 978-1491985045.

5. Charu C. Aggarwal. „Recommender Systems: The Textbook“, Springer, 2016. ISBN 9783319296579.

Note:
Further information:
https://courses.fit.cvut.cz/NI-DDW/
Time-table for winter semester 2023/2024:
Time-table is not available yet
Time-table for summer semester 2023/2024:
Mon 09:15–10:45: Kuchař J., room T9:349, Dejvice, NBFIT PC classroom (lecture parallel 1)
Mon 12:45–14:15, odd weeks: Kuchař J., room T9:349, Dejvice, NBFIT PC classroom (lecture parallel 1, parallel no. 101)
Tue–Fri: no sessions listed
The course is a part of the following study plans:
Data valid to 2024-05-29
Updates to the information above can be found at https://bilakniha.cvut.cz/en/predmet6119706.html