Implementing a lambda interpreter in Java

Lambda Interpreter

A λ-calculus interpreter written in Java, by Pkun


We take a top-down approach. The input is a lambda expression; for convenience we write λ as \, so the input looks like this:
(\x.\y.(x y)) (\p.p) (\q.q)
How do we evaluate it? Let us first work it out by hand. It is easy to see that we should substitute the two abstractions at the back into the abstraction at the front, one after the other. For a computer, this can be broken down into:

  • Identify the three abstractions
  • Substitute the later abstractions, in order, into the first abstraction

Let's review the lambda calculus we learned earlier:

t1 t2   # Application
 
\x. t1  # Abstraction
 
x       # Identifier

Generalizing, the first step can be stated as

  • The computer must be able to recognize every kind of lambda term, namely Application, Abstraction and Identifier.

So how do we recognize all the terms? Each of the three has its own characteristic: an Abstraction contains a lambda, an Application has two parts, and an Identifier is a single name. Since the input is a string, we can scan it character by character from start to finish and read out every term it contains.
One thing to note is that Applications and Abstractions can be nested inside each other, and an Identifier may appear on either side of an Application, so the result should be a tree. We can therefore let the three classes inherit from a common class Node and traverse the tree from Node to Node, which avoids the trouble of type conversions caused by nesting.

We need an abstract syntax tree with three kinds of nodes: abstractions, applications and atoms.
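
The three node kinds could be sketched roughly as follows. This is only a minimal illustration: the class names (AstSketch, Node, Abstraction, Application, Identifier) and the toString format are my own assumptions, since the post only says that three classes inherit from a common node class.

```java
// A minimal sketch of the abstract syntax tree. All names here are
// hypothetical; the original post does not show its classes.
class AstSketch {
    /** Common base type, so that any term can be nested inside any other. */
    static abstract class Node { }

    /** \param. body */
    static final class Abstraction extends Node {
        final String param;
        final Node body;
        Abstraction(String param, Node body) { this.param = param; this.body = body; }
        @Override public String toString() { return "(\\" + param + "." + body + ")"; }
    }

    /** lhs rhs */
    static final class Application extends Node {
        final Node lhs;
        final Node rhs;
        Application(Node lhs, Node rhs) { this.lhs = lhs; this.rhs = rhs; }
        @Override public String toString() { return "(" + lhs + " " + rhs + ")"; }
    }

    /** A single variable name. */
    static final class Identifier extends Node {
        final String name;
        Identifier(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    public static void main(String[] args) {
        // Build (\x.x) y by hand and print it fully parenthesized.
        Node id = new Abstraction("x", new Identifier("x"));
        Node app = new Application(id, new Identifier("y"));
        System.out.println(app); // prints ((\x.x) y)
    }
}
```

Because every field has the type Node, an Abstraction can hold an Application, which can hold another Abstraction, and so on, without any casts when building the tree.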

To scan the string we simply enumerate cases: whenever the input matches a certain pattern, it is classified as a certain kind of symbol. So we need to list every kind of symbol that can appear in a lambda expression. This is very simple; there are six:

LPAREN: '('
RPAREN: ')'
LAMBDA: '\' // "\" is used for convenience
DOT: '.'
LCID: /[a-z][a-zA-Z]*/ 
EOF: null
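
A tokenizer for the six symbols above might look something like this. It is a sketch under my own assumptions: the names LexerSketch, Type, Token and tokenize are hypothetical, whitespace is skipped, and EOF is represented by a token without text rather than by null.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal tokenizer sketch for the six symbols listed above.
// Class and method names are hypothetical, not taken from the original code.
class LexerSketch {
    enum Type { LPAREN, RPAREN, LAMBDA, DOT, LCID, EOF }

    static final class Token {
        final Type type;
        final String text; // only set for LCID
        Token(Type type, String text) { this.type = type; this.text = text; }
        @Override public String toString() {
            return text == null ? type.name() : type.name() + "(" + text + ")";
        }
    }

    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static List<Token> tokenize(String src) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // assumption: whitespace only separates tokens
            } else if (c == '(') {
                out.add(new Token(Type.LPAREN, null)); i++;
            } else if (c == ')') {
                out.add(new Token(Type.RPAREN, null)); i++;
            } else if (c == '\\') {
                out.add(new Token(Type.LAMBDA, null)); i++;
            } else if (c == '.') {
                out.add(new Token(Type.DOT, null)); i++;
            } else if (c >= 'a' && c <= 'z') {
                // LCID: /[a-z][a-zA-Z]*/
                int start = i++;
                while (i < src.length() && isLetter(src.charAt(i))) i++;
                out.add(new Token(Type.LCID, src.substring(start, i)));
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at " + i);
            }
        }
        out.add(new Token(Type.EOF, null));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("\\x.(x y)"));
        // prints [LAMBDA, LCID(x), DOT, LPAREN, LCID(x), LCID(y), RPAREN, EOF]
    }
}
```

Note that with the LCID pattern, adjacent letters such as xy are read as one identifier, so applications need a space or parentheses between their parts.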

We call these symbols Tokens. By recognizing the different Tokens, a lambda expression can be parsed into an abstract syntax tree. At this point the computer can read a lambda expression. Next we let it build the whole abstract syntax tree as it scans. For convenience, we define three grammar rules:

Term ::= Application| LAMBDA LCID DOT Term

Application ::= Application Atom| Atom

Atom ::= LPAREN Term RPAREN| LCID

The overall idea is to keep reading the next token, decide which Token it is, and then jump between the three methods Term, Application and Atom to build the whole tree.
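
The three grammar rules translate fairly directly into a recursive-descent parser with one method per rule. The following is a rough sketch, not the original code: all names are hypothetical, the parser peeks at characters directly instead of using a separate token list (to keep the example short), and the left-recursive Application rule is rewritten as a loop.

```java
// A recursive-descent parser sketch with one method per grammar rule.
// Names are hypothetical; the character '\0' stands in for the EOF token.
class ParserSketch {
    static abstract class Node { }
    static final class Abstraction extends Node {
        final String param; final Node body;
        Abstraction(String param, Node body) { this.param = param; this.body = body; }
        @Override public String toString() { return "(\\" + param + "." + body + ")"; }
    }
    static final class Application extends Node {
        final Node lhs; final Node rhs;
        Application(Node lhs, Node rhs) { this.lhs = lhs; this.rhs = rhs; }
        @Override public String toString() { return "(" + lhs + " " + rhs + ")"; }
    }
    static final class Identifier extends Node {
        final String name;
        Identifier(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    private final String src;
    private int pos = 0;
    private ParserSketch(String src) { this.src = src; }

    static Node parse(String src) {
        ParserSketch p = new ParserSketch(src);
        Node t = p.term();
        if (p.peek() != '\0') throw new IllegalArgumentException("Unexpected input at " + p.pos);
        return t;
    }

    /** Skips whitespace and returns the next character, or '\0' at end of input. */
    private char peek() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private void expect(char c) {
        if (peek() != c) throw new IllegalArgumentException("Expected '" + c + "' at " + pos);
        pos++;
    }

    /** LCID: /[a-z][a-zA-Z]*/ */
    private String lcid() {
        char c = peek();
        if (c < 'a' || c > 'z') throw new IllegalArgumentException("Expected identifier at " + pos);
        int start = pos++;
        while (pos < src.length() && isLetter(src.charAt(pos))) pos++;
        return src.substring(start, pos);
    }

    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    /** Term ::= Application | LAMBDA LCID DOT Term */
    private Node term() {
        if (peek() == '\\') {
            pos++;
            String param = lcid();
            expect('.');
            return new Abstraction(param, term());
        }
        return application();
    }

    /** Application ::= Application Atom | Atom (left recursion turned into a loop) */
    private Node application() {
        Node lhs = atom();
        while (true) {
            char c = peek();
            if (c == '(' || (c >= 'a' && c <= 'z')) lhs = new Application(lhs, atom());
            else return lhs;
        }
    }

    /** Atom ::= LPAREN Term RPAREN | LCID */
    private Node atom() {
        if (peek() == '(') {
            pos++;
            Node t = term();
            expect(')');
            return t;
        }
        return new Identifier(lcid());
    }

    public static void main(String[] args) {
        System.out.println(parse("a b c"));   // prints ((a b) c)
        System.out.println(parse("\\x.x y")); // prints (\x.(x y))
    }
}
```

The loop in application() makes application left-associative, so a b c is read as (a b) c, while the body of an abstraction extends as far to the right as possible.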

Now consider the second step, beta reduction. It is actually quite easy to think about: in our abstract syntax tree, if the left branch of an Application node is an Abstraction, we can substitute the right branch directly into the left branch, replacing every occurrence of the param in the body of the left branch. In pseudo-code it looks like this:

for (node in Abs.Body) {                      // visit every node of the body
  if (node is Ide && node.name == Abs.Param)
    node = Arg;                               // Arg is the right branch of the Application
}

I think this is a reasonable approach, but I have not worked through many of the details, so I do not know how well it works in the end. Logically, two Abs having the same param should not cause substitution problems. Next, I would like to introduce the approach our teacher uses for evaluation.
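
The name-based idea in the pseudo-code could be sketched in Java as below. This is my own illustration with hypothetical names (NaiveSubstSketch, substitute, betaStep). It stops at an inner Abs that rebinds the same param, but it does not rename bound variables, so it is only safe when the argument has no free variables that an inner Abs could capture.

```java
// A sketch of naive, name-based substitution and a single beta step.
// All names are hypothetical. No alpha-renaming is performed.
class NaiveSubstSketch {
    static abstract class Node { }
    static final class Abstraction extends Node {
        final String param; final Node body;
        Abstraction(String param, Node body) { this.param = param; this.body = body; }
        @Override public String toString() { return "(\\" + param + "." + body + ")"; }
    }
    static final class Application extends Node {
        final Node lhs; final Node rhs;
        Application(Node lhs, Node rhs) { this.lhs = lhs; this.rhs = rhs; }
        @Override public String toString() { return "(" + lhs + " " + rhs + ")"; }
    }
    static final class Identifier extends Node {
        final String name;
        Identifier(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /** Replaces every free occurrence of name in t with arg. */
    static Node substitute(Node t, String name, Node arg) {
        if (t instanceof Identifier) {
            return ((Identifier) t).name.equals(name) ? arg : t;
        }
        if (t instanceof Application) {
            Application app = (Application) t;
            return new Application(substitute(app.lhs, name, arg), substitute(app.rhs, name, arg));
        }
        Abstraction abs = (Abstraction) t;
        if (abs.param.equals(name)) return abs; // the inner Abs shadows name, so stop here
        return new Abstraction(abs.param, substitute(abs.body, name, arg));
    }

    /** Performs one beta step at the root: (\x.body) arg becomes body with x replaced by arg. */
    static Node betaStep(Node t) {
        if (t instanceof Application && ((Application) t).lhs instanceof Abstraction) {
            Application app = (Application) t;
            Abstraction abs = (Abstraction) app.lhs;
            return substitute(abs.body, abs.param, app.rhs);
        }
        return t; // not a redex at the root
    }

    public static void main(String[] args) {
        // (\x.(x y)) z
        Node body = new Application(new Identifier("x"), new Identifier("y"));
        Node redex = new Application(new Abstraction("x", body), new Identifier("z"));
        System.out.println(betaStep(redex)); // prints (z y)
    }
}
```

The shadowing check is what keeps two Abs with the same param from interfering with each other: in (\x.\x.x) z the inner \x.x is left untouched.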

De Bruijn index

For background on De Bruijn, see the De Bruijn Sequence page (or Baidu Encyclopedia if you cannot open it).
With De Bruijn indices, we label each Ide with a number instead of a name, counting the Abs between the variable and the Abs that binds it. For example, \x.\y.x y can be written as \.\.1 0. For a variable that is not bound, as in \x.a, the default is to write it as \.1 (if the blog I read is not wrong). The main reason for introducing De Bruijn indices is alpha conversion: the names in a lambda expression can be replaced freely, they carry no specific meaning and are dummy variables, so it does not matter even if they repeat. Of course, within the same Abs or in other formulas, variables that must be told apart during calculation should still be given different names.

Of course, once we use De Bruijn indices, a question arises for the earlier parsing stage. Consider this abstraction:

\x.\y.x (\x. x)

If we convert the abstraction above to De Bruijn form, what should the index of the inner x be? After conversion it should look like this:

\.\.1 (\. 0)

Even though the variable bound by the inner Abs has the same name as the outer one, names are irrelevant, so the index under the inner Abs is computed afresh.
Here we introduce a structure that remembers the context, called ctx. With it we can neatly handle the case where the inner and outer params share a name but have different De Bruijn indices.
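
One way ctx might work is as a list of the param names currently in scope, innermost last, so that the index of a variable is its distance from the end of the list. The sketch below is my own illustration, not the original code: the names are hypothetical, and the treatment of free variables (numbering them after all binders in scope, in order of first appearance) is an assumption chosen so that \x.a comes out as \.1.

```java
import java.util.ArrayList;
import java.util.List;

// A sketch of converting a named term to De Bruijn notation with a context list.
// All names are hypothetical; the free-variable numbering is an assumption.
class DeBruijnSketch {
    static abstract class Node { }
    static final class Abstraction extends Node {
        final String param; final Node body;
        Abstraction(String param, Node body) { this.param = param; this.body = body; }
    }
    static final class Application extends Node {
        final Node lhs; final Node rhs;
        Application(Node lhs, Node rhs) { this.lhs = lhs; this.rhs = rhs; }
    }
    static final class Identifier extends Node {
        final String name;
        Identifier(String name) { this.name = name; }
    }

    static String toDeBruijn(Node t) {
        return toDeBruijn(t, new ArrayList<String>(), new ArrayList<String>());
    }

    /** ctx holds the params in scope, innermost last; free collects unbound names. */
    static String toDeBruijn(Node t, List<String> ctx, List<String> free) {
        if (t instanceof Identifier) {
            String name = ((Identifier) t).name;
            int at = ctx.lastIndexOf(name); // the innermost binder with this name wins
            if (at >= 0) return String.valueOf(ctx.size() - 1 - at);
            int f = free.indexOf(name);
            if (f < 0) { free.add(name); f = free.size() - 1; }
            return String.valueOf(ctx.size() + f); // free variables come after all binders
        }
        if (t instanceof Application) {
            Application app = (Application) t;
            return "(" + toDeBruijn(app.lhs, ctx, free) + " " + toDeBruijn(app.rhs, ctx, free) + ")";
        }
        Abstraction abs = (Abstraction) t;
        ctx.add(abs.param);              // enter the binder
        String body = toDeBruijn(abs.body, ctx, free);
        ctx.remove(ctx.size() - 1);      // leave the binder
        return "(\\." + body + ")";
    }

    public static void main(String[] args) {
        // \x.\y.x (\x.x)
        Node inner = new Abstraction("x", new Identifier("x"));
        Node term = new Abstraction("x", new Abstraction("y",
                new Application(new Identifier("x"), inner)));
        System.out.println(toDeBruijn(term)); // prints (\.(\.(1 (\.0))))
    }
}
```

Using lastIndexOf is what makes the inner x resolve to the inner Abs: when the same name appears twice in ctx, the most recently added entry is found first.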

With De Bruijn indices, substitution can also place a term underneath binders that have not been removed yet, that is, the Abs being substituted into has not been fully reduced. This can cause ambiguity, for example:
Abs = \x.\y.x = \.\.1
Ide = \t.a = \.1
After substituting, and before the outer binder has been popped, the Abs becomes \x.\y.\t.1. Now it is ambiguous: does this 1 refer to x, or to something else? So we need a new technique. We first shift \.1 up to \.2 and then substitute it into the Abs. Next we consider the depth of the substitution. If only one binder is crossed, we only need to make sure the outermost layer does not conflict. If the place being replaced sits deeper, \.2 has to be shifted up again for each binder crossed until there is no conflict. How far should it be shifted? To be safe, by at least that depth, so that no ambiguity can arise. Finally, we shift down once to undo the initial shift up, and the substitution is complete without any problem.
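
The shift-up, substitute, shift-down procedure can be sketched with two small functions in the usual shift-and-substitute style for De Bruijn terms. This is an illustration under my own assumptions, not the original code: the names (ShiftSubstSketch, shift, subst, beta) are hypothetical, and the cutoff parameter is how the sketch tells variables bound inside a term apart from those that point outside it.

```java
// A sketch of shifting and substitution on De Bruijn terms.
// All names are hypothetical.
class ShiftSubstSketch {
    static abstract class Term { }
    static final class Var extends Term {
        final int index;
        Var(int index) { this.index = index; }
        @Override public String toString() { return String.valueOf(index); }
    }
    static final class Abs extends Term {
        final Term body;
        Abs(Term body) { this.body = body; }
        @Override public String toString() { return "(\\." + body + ")"; }
    }
    static final class App extends Term {
        final Term lhs; final Term rhs;
        App(Term lhs, Term rhs) { this.lhs = lhs; this.rhs = rhs; }
        @Override public String toString() { return "(" + lhs + " " + rhs + ")"; }
    }

    /** Adds d to every variable with index >= cutoff, i.e. every variable that points outside t. */
    static Term shift(int d, int cutoff, Term t) {
        if (t instanceof Var) {
            int k = ((Var) t).index;
            return new Var(k >= cutoff ? k + d : k);
        }
        if (t instanceof Abs) {
            return new Abs(shift(d, cutoff + 1, ((Abs) t).body)); // one more binder in scope
        }
        App a = (App) t;
        return new App(shift(d, cutoff, a.lhs), shift(d, cutoff, a.rhs));
    }

    /** Replaces variable j in t with s; crossing a binder bumps j and shifts s up by one. */
    static Term subst(int j, Term s, Term t) {
        if (t instanceof Var) {
            return ((Var) t).index == j ? s : t;
        }
        if (t instanceof Abs) {
            return new Abs(subst(j + 1, shift(1, 0, s), ((Abs) t).body));
        }
        App a = (App) t;
        return new App(subst(j, s, a.lhs), subst(j, s, a.rhs));
    }

    /** (\.body) arg: shift arg up, substitute for index 0, then shift the result down. */
    static Term beta(Abs abs, Term arg) {
        return shift(-1, 0, subst(0, shift(1, 0, arg), abs.body));
    }

    public static void main(String[] args) {
        Abs abs = new Abs(new Abs(new Var(1))); // \.\.1, i.e. \x.\y.x
        Term arg = new Abs(new Var(1));         // \.1,   i.e. \t.a with a free
        System.out.println(shift(1, 0, arg));   // prints (\.2)
        System.out.println(beta(abs, arg));     // prints (\.(\.2))
    }
}
```

In the example, the free variable a ends up as index 2 in the result because it now sits under two binders (\y and \t), which is exactly the bookkeeping that the extra shifts take care of.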

That covers the basic principles of the whole lambda interpreter. What remains is writing the code, which is actually not very difficult. You do not even need to understand how the code runs behind the scenes: just follow the steps above, type it out slowly, and eventually it passes the OJ. While writing, I noticed that my own idea above never runs into this ambiguity at all (if I follow the ideas above, that is), so I wonder whether the computer would really get confused. When I have free time I will rewrite the code using my own idea.

Origin blog.csdn.net/weixin_33777877/article/details/91026703