Compiler-NFA to DFA

NFA to DFA: the subset construction method

For a given regular expression, both an NFA and a DFA can recognize exactly the strings it describes by moving from the start state to a final state. The difference is that an NFA may offer many possible paths for a specific input, while a DFA offers exactly one. This makes a DFA much easier to turn into code.

The core idea of the conversion is to treat each combination of NFA states that the NFA could be in at the same time as a single DFA state. For example, with two NFA states q1 and q2:

{q1} → d1
{q2} → d2
{q1,q2} → d3
∅ → d4

So an NFA with 2 states can produce up to 4 DFA states.

If the NFA has 3 states {q1,q2,q3}, there can be up to 8 DFA states: {q1} {q2} {q3} {q1,q2} {q1,q3} {q2,q3} {q1,q2,q3} ∅
These eight DFA states are exactly the subsets of {q1,q2,q3}, which is why the method is called subset construction.

Summary: an NFA with n states can produce up to 2^n DFA states.
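To make the 2^n bound concrete, here is a minimal Python sketch that enumerates every subset of a set of NFA states. The helper name all_subsets is made up for this illustration; it is not part of any library.

```python
from itertools import combinations

def all_subsets(states):
    """Return every subset of `states` as a frozenset, smallest first."""
    ordered = sorted(states)
    return [
        frozenset(combo)
        for size in range(len(ordered) + 1)
        for combo in combinations(ordered, size)
    ]

nfa_states = {"q1", "q2", "q3"}
subsets = all_subsets(nfa_states)

# Each subset is a candidate DFA state; the empty set is the dead state.
for subset in subsets:
    print(sorted(subset) if subset else "(empty set)")

print(len(subsets))  # 8, i.e. 2**3
```

Each printed subset corresponds to one candidate DFA state. A practical conversion usually does not build all of them up front.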

Next, look at the specific conversion process

First, q1 is the final state in the NFA, so every DFA state whose set contains q1 (here q1 itself and the ε-closure of q1) is a final state in the DFA.
Next, notice that q1 in the above figure has only outgoing edges and no incoming edges. A state with no incoming edges that is not the start state can never be reached, so it needs to be eliminated.
If the start state of the NFA is q1, then the start state of the DFA is the ε-closure of q1.
⚠️ Note: the ε-closure (E-closure) of q1 is q1 itself plus every state that q1 can reach using only ε (empty) transitions. Here q1 can also reach q2 this way, so the ε-closure of q1 is {q1,q2}.
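The closure computation can be sketched in a few lines of Python. This is only an illustration: the function name epsilon_closure and the dictionary layout are assumptions, and the only ε-edge used is the one from q1 to q2 described in the text (the figure itself is not reproduced here).

```python
def epsilon_closure(states, eps):
    """Return all states reachable from `states` using only epsilon moves.

    `eps` maps a state to the set of states one epsilon move away.
    """
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in eps.get(state, ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)

# Assumed from the text: q1 has an epsilon move to q2 and nothing else.
eps = {"q1": {"q2"}}

print(sorted(epsilon_closure({"q1"}, eps)))  # ['q1', 'q2']
print(sorted(epsilon_closure({"q2"}, eps)))  # ['q2']
```

A state always belongs to its own closure, which is why q2 alone yields just {q2}.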

Eventually the following DFA is formed

 

Then simply rename the states:

This NFA has only two states, but the first step of the standard subset construction method is still to draw all 2^2 states. We say "at most 2^n" because some states may be eliminated during the process.
An NFA with 10 states gives at most 2^10 = 1024 DFA states, but in general you do not need that many. If you are lucky, the DFA conversion may need only a few states.
The worst-case complexity is therefore exponential. This is acceptable because the conversion is not performed while a program is being compiled. It is performed once, when the compiler is built, and does not have to be repeated.

Here is another example:
a(b ∪ c)*
NFA:

diagram:

DFA state   NFA states               x=a   x=b   x=c
d0          {q0}                     d1    d∅    d∅
d1          {q1,q2,q3,q4,q6,q9}      d∅    d2    d3
d∅          {}                       d∅    d∅    d∅
d2          {q3,q4,q5,q6,q8,q9}      d∅    d2    d3
d3          {q3,q4,q6,q7,q8,q9}      d∅    d2    d3

The start state is the ε-closure of q0, which is just q0 itself, so it is marked as d0. On a, q0 moves to q1, and the ε-closure of q1 is a new set, which is named d1. On b and on c, q0 has no transition, so both lead to the empty set d∅.

Because d1 and d∅ are new, the table needs further rows to record where they go.

d1 is the ε-closure of q1, which is the set {q1,q2,q3,q4,q6,q9}. No state in this set has a transition on a, so a leads to d∅. On b, q4 moves to q5, and the ε-closure of q5 is a new set, which is named d2. Similarly, on c, q6 moves to q7, and the ε-closure of q7 is a new set, which is named d3.

From d∅, every input leads back to d∅.

d2 is the ε-closure of q5. No state in it has a transition on a, so a leads to d∅. On b, q4 moves to q5, whose ε-closure is d2 itself. On c, q6 moves to q7, whose ε-closure is d3.

d3 is the ε-closure of q7. No state in it has a transition on a, so a leads to d∅. On b, q4 moves to q5, whose ε-closure is d2. On c, q6 moves to q7, whose ε-closure is d3 itself.

At this point no new d set has appeared, so the table does not need to be extended any further. The current table fully defines the DFA.
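The table-filling procedure can be sketched as a worklist loop in Python. The NFA transitions below are an assumption: the NFA diagram is not reproduced here, so they were reconstructed to be consistent with the state sets in the table. The names NFA, closure, move and subset_construction are likewise chosen just for this sketch.

```python
from collections import deque

# Assumed transitions, reconstructed from the table's state sets.
# The empty string "" stands for an epsilon move.
NFA = {
    ("q0", "a"): {"q1"},
    ("q1", ""): {"q2"},
    ("q2", ""): {"q3", "q9"},
    ("q3", ""): {"q4", "q6"},
    ("q4", "b"): {"q5"},
    ("q6", "c"): {"q7"},
    ("q5", ""): {"q8"},
    ("q7", ""): {"q8"},
    ("q8", ""): {"q3", "q9"},
}
ALPHABET = "abc"

def closure(states):
    """Epsilon closure of a set of NFA states."""
    result, stack = set(states), list(states)
    while stack:
        for nxt in NFA.get((stack.pop(), ""), ()):
            if nxt not in result:
                result.add(nxt)
                stack.append(nxt)
    return frozenset(result)

def move(states, symbol):
    """NFA states reachable from `states` by consuming `symbol` once."""
    return {t for s in states for t in NFA.get((s, symbol), ())}

def subset_construction(start):
    """Build only the DFA states that are reachable from the start set."""
    start_set = closure({start})
    table = {}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current in table:
            continue  # this row is already filled in
        table[current] = {}
        for symbol in ALPHABET:
            target = closure(move(current, symbol))
            table[current][symbol] = target
            if target not in table:
                queue.append(target)  # a new set needs its own row
    return start_set, table

start_set, table = subset_construction("q0")

# Name the sets in discovery order; the empty set is the dead state.
names, counter = {}, 0
for subset in table:
    if subset:
        names[subset] = f"d{counter}"
        counter += 1
    else:
        names[subset] = "d_empty"

for subset, row in table.items():
    print(names[subset], sorted(subset), [names[row[x]] for x in ALPHABET])

print(len(table))  # 5 rows, far fewer than 2**10
```

Because the loop only creates rows for sets it actually reaches, this 10-state NFA produces five DFA states rather than 1024.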

DFA:

 

The start state here is the ε-closure of q0, which is q0 itself, that is, d0. The final state of the NFA is q9, so every d set that contains q9 is a final state of the DFA: d1, d2 and d3.
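As a quick check, the finished table can be run directly as a DFA. The Python sketch below copies the transitions from the table and uses d1, d2 and d3 as final states. The name "dE" is a made-up ASCII label for d∅, and rejecting characters outside {a, b, c} is a choice made for this sketch.

```python
# Transition table copied from the walkthrough.
# "dE" is a hypothetical ASCII name for the empty (dead) state.
DFA = {
    "d0": {"a": "d1", "b": "dE", "c": "dE"},
    "d1": {"a": "dE", "b": "d2", "c": "d3"},
    "dE": {"a": "dE", "b": "dE", "c": "dE"},
    "d2": {"a": "dE", "b": "d2", "c": "d3"},
    "d3": {"a": "dE", "b": "d2", "c": "d3"},
}
START = "d0"
FINAL = {"d1", "d2", "d3"}  # every set that contains q9

def accepts(text):
    """Follow exactly one path through the DFA and report acceptance."""
    state = START
    for ch in text:
        if ch not in DFA[state]:
            return False  # symbol outside the alphabet: reject
        state = DFA[state][ch]
    return state in FINAL

for word in ["a", "abcb", "", "ba", "aa"]:
    print(repr(word), accepts(word))
```

Only "a" and "abcb" are accepted, which matches a(b ∪ c)*: a single a followed by any mix of b and c.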

Origin blog.csdn.net/weixin_43754049/article/details/126213249