Automata in Text Pattern Matching
Code | Completion | Credits | Range | Language |
---|---|---|---|---|
MI-AVY | Z,ZK | 4 | 2P+1C | Czech |
- Lecturer:
- Ondřej Guth (guarantor), Tomáš Pecka, Štěpán Plachý, Radomír Polách, Jan Trávníček, Jan Žďárek
- Tutor:
- Ondřej Guth (guarantor), Tomáš Pecka, Štěpán Plachý, Radomír Polách, Jan Trávníček, Jan Žďárek
- Supervisor:
- Department of Theoretical Computer Science
- Synopsis:
-
Students learn algorithms for searching in texts using finite automata and in trees using tree automata. They become acquainted with the searching problems taxonomy and learn the principles of automata constructions for solving these problems. They will be able to apply the gained knowledge in the design of applications required text pattern matching (such as data streaming, DNA sequencing, etc.) and in trees.
- Requirements:
-
Basics of formal language theory and finite automata.
- Syllabus of lectures:
-
1. Finite automata, basic operations with finite automata. Taxonomy of pattern matching problems for exact and approximate matching. Forward pattern matching, models of searching algorithms. Nondeterministic search automata.
2. Simulation of nondeterministic search automata, bit parallelism, dynamic programming.
3. Deterministic search finite automata and their complexity.
4. Construction of prefix and suffix automata. Construction of factor automata, factor and suffix oracles. Computation of borders and periods of text. Searching exact and approximate repetitions in text.
5. Searching other string regularities and other indexing automata applications.
6. Finite automata and generalised strings: indexing, regularities.
7. Backward pattern matching.
8. Classes of deterministic and nondeterministic pushdown automata. Determinisation of pushdown automata.
9. Tree automata.
10. Tree pattern matching & indexing, non-linear tree patterns.
11. Tree regular expressions.
- Syllabus of tutorials:
- Study Objective:
-
The module deals with automata models of algorithms for text and tree searching. The main topics are text pattern matching and repetitions. In both cases, both exact and inexact pattern matchings are considered. The fundamental formal tools for description of the algorithms are the finite automaton for text searching and tree automaton for tree searching. The knowledge gained in this module can be applied in analysis and design of algorithms for text and tree pattern matching.
- Study materials:
-
1. Melichar, B., Holub, J., Polcar, T. ''Text searching algorithms''. Volume I and II, Lecture notes. Prague, CTU, 2008.
2. Melichar, B., et al. ''Text searching algorithms''. Seminars. Prague, CTU, 2008.
- Note:
- Further information:
- https://courses.fit.cvut.cz/MI-AVY
- Time-table for winter semester 2020/2021:
- Time-table is not available yet
- Time-table for summer semester 2020/2021:
- Time-table is not available yet
- The course is a part of the following study plans:
-
- Knowledge Engineering, in Czech, Presented in Czech, Version 2016 and and 2017 (elective course)
- Computer Systems and Networks, Presented in Czech, Version 2016 to 2019 (elective course)
- Design and Programming of Embedded Systems, in Czech, Version 2016 to 2019 (elective course)
- Specialization Web and Software Engineering, in Czech, Version 2016 to 2019 (elective course)
- Specialization Software Engineering, in Czech, Version 2016 to 2019 (elective course)
- Specialization Web Engineering, Presented in Czech, Version 2016 to 2019 (elective course)
- Master Informatics, Presented in Czech, Version 2016 to 2019 (VO)
- Specialization System Programming, Presented in Czech, Version 2016 to 2019 (elective course)
- Specialization Computer Science, Presented in Czech, Version 2016-2017 (compulsory course of the branch)
- Knowledge Engineering, in Czech, Presented in Czech, Version 2018 to 2019 (elective course)