Logo ČVUT
Loading...
CZECH TECHNICAL UNIVERSITY IN PRAGUE
STUDY PLANS
2011/2012

Text Algorithms

The course is not on the list Without time-table
Code Completion Credits Range
XD36TAL Z,ZK 4 14+4s
Lecturer:
Tutor:
Supervisor:
Department of Computer Science and Engineering
Synopsis:

Text is the simplest and natural representation of information in different

areas. Text is a linear sequence of symbols from some alphabet. The

manipulation of text is used in many application areas: processing of text in natural and formal languages, study of sequences in molecular biology, music analysis, etc. The basic problems covered by this lecture is string matching and repetitions finding. In both cases, exact and approximate matching is taken into account. These problems are of the main interest in lectures and seminars. The main formal system used for the description of respective algorithms are finite automata. They appears as a very useful tool not only for understanding but for solving of many problems as well.

Requirements:
Syllabus of lectures:

1. Fultext systems, basic notions, classification of searching problems.

2. Forward searching, models of searching algorithms.

3. Nondeterministic searching automata.

4. Simulation of nondeterministic searching automata, bit parallelism, dynamic programming.

5. Prefix and suffix automata.

6. Factor automata, factor oracle, suffix oracle.

7. Borders and periods in text.

8. Repetitions in text, exact and approximate.

9. Simulation of nondeterministic searching automata, fail function, MP and KMP algorithms.

10. Simulation of nondeterministic searching automata, fail function, AC algorithms.

11. Backward matching of one pattern, BM algorithm.

12. Backward match of set of patterns, CW algorithm.

13. Actual results in text searching algorithms (for example searching in two-dimensional text).

14. Actual results in text searching algorithms (for example searching in compressed text).

Syllabus of tutorials:

1. Finite automata, basic operations with finite automata.

2. Formulation of basic searching problems for exact and approximate matching.

3. Nondeterministic finite automata as models for text searching.

4. Deterministic searching automata and their complexity.

5. Simulation of searching automata, bit parallelism, dynamic programming.

6. Construction of prefix and suffix automata.

7. Construction of factor automata, factor and suffix oracle automata.

8. Looking for borders and periods in text.

9. Looking for exact and approximate repetitions in text.

10. Simulation of searching automata, MP and KMP algorithms.

11. Simulation of searching automata, AC algorithm.

12. Backward pattern matching, BM algorithm.

13. Backward pattern matching, CW algorithm.

14. Assessment.

Study Objective:
Study materials:

1. Melichar, B., Holub, J., Polcar, T.: Text searching algorithms. Volume I,

II. Tutorial for lectures.

2. Melichar, B. a kol.: Text searching algorithms. Seminars. Collection of examples for seminars.

Note:
Further information:
No time-table has been prepared for this course
The course is a part of the following study plans:
Generated on 2012-7-9
For updated information see http://bilakniha.cvut.cz/en/predmet12041604.html