class: center, middle

# CS-3110: Formal Languages and Automata

## Deterministic Finite Automata

### Chapter 3.4

---
class: center, middle

# Automata

---

# Automata

* We want a formal ("mechanical") way to check if a word is in a language: an **automaton** (plural: automata)
* Given a word, our automaton will say "yes" ("**accept**": the word is in the language) or "no" ("**reject**": the word is not in the language)
* Ideally, we could define such an automaton for *any* language (e.g. "the language of all programs that contain no bugs")
* Automata for some languages are simpler than others

---

# Accepting Words

Say we have a regular expression:

$$ aa^\ast bb^\ast $$

How would you write a program that checks if a string is in the language defined by this expression?

---
class: medium

# Automata

* We will actually start from the other direction: by defining a very restricted type of automaton
* Our automata will read a word character by character, starting from the left
* When all characters are read, the automaton decides whether to accept or reject
* Another way to think about it: the automaton stores some information (it has a "state"), and each input character can change that state
* The state of the automaton at the end of the word tells us whether to accept or reject

---
class: mmedium

# Finite State Automata

* Let's start with the simplest type of automaton
* The automaton has a finite number of **states**
* At any time, **exactly one** state is active, starting with a defined **start state**
* The automaton reads the word character by character
* Each character, together with the active state, determines which state the automaton should **transition** to next
* If, at the end of the word, the active state is one of a set of **accepting/final states**, the automaton returns "yes"/"accept", otherwise "no"/"reject"

---

# Transition Graphs
---
class: medium

# Transition Graphs

* The automaton starts in an initial state
* We keep track of which state we are currently in
* When we read a symbol, we follow the arrow from our current state
* If (and only if) we are in a final state at the end of the word, we **accept** that word
---

# Finite State Automata

We also call this formalism **Deterministic Finite Automata**

* Deterministic: There is no ambiguity about what to do
* Finite: We only have a finite number of **states**
* Automaton: It runs "automatically" (we can write an algorithm)

Next week, we will see the "non-deterministic" variant

---

# Definition of a Deterministic Finite Automaton

A Deterministic Finite Automaton consists of:

* A finite set of states `\(Q\)`
* A finite alphabet `\(\Sigma\)`
* A transition function `\(\delta: Q \times \Sigma \to Q\)`
* An initial state `\(q_0 \in Q\)`
* A set of accepting states `\(F \subseteq Q\)`

---

# Conversion between the two forms

Let's look at our automaton again and define `\((Q, \Sigma, \delta, q_0, F)\)`
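
As a sketch of what such a definition can look like, assume a hypothetical four-state automaton for the earlier expression `\(aa^\ast bb^\ast\)`. The state names are chosen for illustration only, and the automaton in the diagram may differ:

* `\(Q = \{q_0, q_1, q_2, q_3\}\)`, where `\(q_3\)` is a "dead" state that can never lead to acceptance
* `\(\Sigma = \{a, b\}\)`
* `\(\delta(q_0, a) = q_1\)` and `\(\delta(q_0, b) = q_3\)`
* `\(\delta(q_1, a) = q_1\)` and `\(\delta(q_1, b) = q_2\)`
* `\(\delta(q_2, a) = q_3\)` and `\(\delta(q_2, b) = q_2\)`
* `\(\delta(q_3, a) = q_3\)` and `\(\delta(q_3, b) = q_3\)`
* The initial state is `\(q_0\)`
* `\(F = \{q_2\}\)`

Every arrow in a transition graph corresponds to one value of `\(\delta\)`, and because `\(\delta\)` is a function, every state needs exactly one outgoing arrow per symbol.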
---

# An Example

Let's design an automaton that recognizes words over the alphabet {a, b, c} such that:

* Every "a" is followed by at least one "b"
* Every "c" is preceded by at least one "b"
* The last letter of the word is a "b" or a "c"

---

# Extended Transition Function

* We defined our transitions as a function `\(\delta\)`
* However, we also want to talk about processing entire words
* For this, it is helpful to define an extended transition function:

$$
\delta^{\ast}: Q\times \Sigma^{\ast} \to Q\\\\
\delta^{\ast}(q,\varepsilon) = q \\\\
\delta^{\ast}(q, w\cdot{}a) = \delta(\delta^{\ast}(q, w), a)
$$

---

# Languages

* Each Deterministic Finite Automaton can be used to recognize words from some language
* The language that the automaton "understands" (accepts) is the set of all words for which we end in an accepting state:

$$ L(M) = \lbrace w \in \Sigma^{\ast} \mid \delta^{\ast}(q_0,w) \in F \rbrace $$
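
The recursive definition of `\(\delta^{\ast}\)` and the membership test for `\(L(M)\)` can be written down almost literally. The following is a minimal sketch, assuming a hypothetical DFA for `\(aa^\ast bb^\ast\)`; the class name, the state numbering and the dead state are illustrative assumptions, not part of the definitions above.

```java
import java.util.Set;

// Sketch only: a hypothetical DFA for aa*bb* over the alphabet {a, b}.
// Assumed states: 0 = start, 1 = reading a's, 2 = reading b's (accepting), 3 = dead state.
class ExtendedTransitionSketch {
    static final int START = 0;
    static final Set<Integer> ACCEPTING = Set.of(2);

    // delta(q, a): a single step of the automaton.
    static int delta(int q, char a) {
        if (a != 'a' && a != 'b') {
            throw new IllegalArgumentException("symbol not in alphabet: " + a);
        }
        switch (q) {
            case 0: return a == 'a' ? 1 : 3;
            case 1: return a == 'a' ? 1 : 2;
            case 2: return a == 'b' ? 2 : 3;
            default: return 3; // the dead state is never left
        }
    }

    // deltaStar(q, w): follows the two defining equations of the extended transition function.
    static int deltaStar(int q, String w) {
        if (w.isEmpty()) {
            return q; // delta*(q, epsilon) = q
        }
        String prefix = w.substring(0, w.length() - 1);
        char last = w.charAt(w.length() - 1);
        return delta(deltaStar(q, prefix), last); // delta*(q, w.a) = delta(delta*(q, w), a)
    }

    // w is in L(M) exactly when delta*(q0, w) lands in an accepting state.
    static boolean inLanguage(String w) {
        return ACCEPTING.contains(deltaStar(START, w));
    }
}
```

For example, `ExtendedTransitionSketch.inLanguage("aabbb")` follows the states 0, 1, 1, 2, 2, 2 and accepts, while `inLanguage("aba")` ends in the dead state and rejects. The recursion mirrors the definition; a loop over the characters computes the same function without building prefixes.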
---

# Regular Languages

* Any language for which there exists a Deterministic Finite Automaton that accepts that language is called "regular"
* These are exactly the languages we can define using regular expressions!
* These automata are easy to implement and efficient (how would you implement one?)
* However, translating a regular expression directly to a Deterministic Finite Automaton is tricky

---
class: medium

# Limitations

* Our automata can recognize any regular language
* However, we only ever use the current state and the next input symbol to decide what to do next
* Our automata do not have (arbitrary) memory, which limits what they can do
* Most notable limitation: They can't check whether two symbols occur equally often, which rules out matching parentheses, HTML tags, etc.

---

# Implementation

How would you implement a DFA?

```Java
class DFA {
    boolean check(String word) {
        // Start every run in the initial state, so check() can be called repeatedly
        int state = 0;
        for (int i = 0; i < word.length(); ++i) {
            state = transition_function(state, word.charAt(i));
        }
        return is_accepting(state);
    }
}
```

---

# Implementation

```Java
// Both methods belong inside class DFA
int transition_function(int state, char next) {
    if (state == 0 && next == '0') return 1;
    if (state == 0 && next == '1') return 0;
    if (state == 1 && next == '0') return 1;
    if (state == 1 && next == '1') return 0;
    // Java requires a result on every path: reject symbols outside {0, 1}
    throw new IllegalArgumentException("symbol not in alphabet: " + next);
}

boolean is_accepting(int state) {
    return state == 1;
}
```
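
---

# Implementation

The chain of `if` statements is one way to encode `\(\delta\)`. Another option is a lookup table indexed by state and symbol. The following is a minimal sketch of that idea for the same two-state automaton, which accepts exactly the binary words ending in '0'; the class name and the array layout are assumptions made for illustration.

```Java
// Sketch only: table-driven variant of the two-state automaton over {0, 1}.
class TableDfaSketch {
    // TRANSITIONS[state][symbol]: row = current state, column = input digit (0 or 1)
    static final int[][] TRANSITIONS = {
        {1, 0}, // from state 0: '0' -> 1, '1' -> 0
        {1, 0}, // from state 1: '0' -> 1, '1' -> 0
    };
    // ACCEPTING[state] is true for accepting states
    static final boolean[] ACCEPTING = {false, true};

    static boolean check(String word) {
        int state = 0; // initial state
        for (int i = 0; i < word.length(); ++i) {
            int symbol = word.charAt(i) - '0';
            if (symbol != 0 && symbol != 1) {
                throw new IllegalArgumentException("symbol not in alphabet: " + word.charAt(i));
            }
            state = TRANSITIONS[state][symbol];
        }
        return ACCEPTING[state];
    }
}
```

With a table, changing the automaton means changing data rather than code, and each input character costs one array lookup.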