CZECH TECHNICAL UNIVERSITY IN PRAGUE
STUDY PLANS
2023/2024
NOTICE: Study plans for the next academic year are available.

Computers and Natural Language 2

The course is not on the list. Without time-table.
Code: 01POPJ2
Completion: Z
Credits: 2
Range: 0+2
Language: Czech
Course guarantor:
Lecturer:
Tutor:
Supervisor:
Department of Mathematics
Synopsis:

The goal of the course is to get acquainted with the broad topic of machine translation (MT). Machine translation is a challenging task that can serve as a good example for modeling of systems as complex as natural languages. We cover several rather different approaches to the task as well as issues related to automatic and manual evaluation of translation quality.

Requirements:
Syllabus of lectures:

1. Metrics of machine translation quality (both manual and automatic).
2. Translation and language models, generic log-linear model. Search space of partial hypotheses. Phrase-based translation.
3. Parallel texts, alignment and extraction of "translation dictionaries" from parallel data.
4. Morphological preprocessing, factored phrase-based translation.
5. Model optimization (minimum error rate training).
6. Constituency trees in MT, parsing-based MT.
7. Dependency trees in MT.
8. Deep-syntactic trees in MT.
9. Presentation of students' experiments.

Syllabus of tutorials:
Study Objective:

Knowledge of approaches to machine translation (statistical phrase-based and hierarchical, tree-based models, deep-syntactic machine translation), log-linear model and model optimization, search space of partial hypotheses, methods of manual and automatic MT evaluation.

Ability to apply one of the covered methods to real data. Ability to design an experiment and make use of large open-source tools to carry out the experiment. Ability to discuss the results and present them both in written and oral form. Ability to cooperate in a small team.

Study materials:

Key references:

Philipp Koehn: Statistical Machine Translation. Cambridge University Press, 2009. ISBN: 978-0521874151.

Note:
Further information:
No time-table has been prepared for this course
The course is a part of the following study plans:
Data valid to 2024-04-18
Updates to the information above can be found at https://bilakniha.cvut.cz/en/predmet23047005.html