Finite-state machines : the WFST project
Finite-state machines
Finite-state machines are simple computing models that have a finite memory or a finite number of states.
These models were introduced very early at the dawn of computer science.
Important applications include :
- conception of logical devices used in electronics
- telecommunication codes
- data compression
- pattern matching algorithms (genome sequencing, search engines)
- natural language processing (dictionary, speech recognition and synthesis)
WFST : a Finite-state template library in C++
During my graduation project, a small extension has been implemented
on top of ASTL library, the Automata Standard Template Library, by Vincent Le Maout, Marne-la-Vallée University, France. This extension was meant to be used in speech synthesis
but can be applied to all the fields above.
These finite-state machines have been implemented in C++ (EGCS, Linux):
- FSA, finite-state automata (deterministic and non-deterministic with epsilon)
- WFSA, weighted finite-state automata
- FST, finite-state transducers
- WFST, weighted finite-state transducers
We have written once and generically a series of operations
as template algorithms that can compile for almost each type of machines.
These operations include :
- compilation in a text format compatible with ATT's FSM library text file format.
- union, concatenation, closure
- intersection, completion, complementation, difference
- inversion, composition
- determinization, minimization (for automata and underlying automata)
- connection, epsilon-removal, reversal
- best path (Dijkstra)
- local grammar computation and application
- I/O operations : conversion to ATT GraphViz dot format.
For each library operations, an equivalent Unix filter has been designed so
that tests and experiments can be easily carried out by using standard shell constructions.
A generic program using these filters can be used to produce (weighted) regular relations from their expression.
See the user documentation.
Download
Please read carefully the README shipped with each package.
In order for all the examples to run, you will need a standard Linux configuration including tr, sed, perl 5.0, ATT GraphViz (visualization) and possibly FSM library as each example is supplied with
its FSM counterpart.
If you have noticed a bug in the library or incorrectness in an algorithm or for any other comments, feel free to contact me at adant@usa.net. Arnaud Adant 2000.