[Beginner] Building a Compiler, Part 8: Scope and Lifetime: Implementing Block Scope and Functions

Why introduce scope and lifetime? Consider the following goals:

  • To implement functions, we have to upgrade the variable management mechanism;
  • We introduce a scoping mechanism so that every reference to a variable resolves to the correct variable definition;
  • We improve variable storage: instead of simply dumping each variable and its value into a HashMap, we manage its lifetime and reduce memory usage.

Scope

In a programming language, scope is the region of a program in which a variable, function, or other name is valid. I first came across the term in C Primer Plus, when I was learning C.

Let's look at the following code:

/*
scope.c
Tests scope.
 */
#include <stdio.h>

int a = 1;

void fun()
{
    a = 2;
    //b = 3;   // error: b has not been declared at this point
    int a = 3; // is declaring a variable with the same name allowed?
    int b = a; // which a is this?
    printf("in fun: a=%d b=%d \n", a, b);
}

int b = 4; // the scope of b starts here

int main(int argc, char **argv){
    printf("main--1: a=%d b=%d \n", a, b);

    fun();
    printf("main--2: a=%d b=%d \n", a, b);

    // shadow the global variables with local variables
    int a = 5;
    int b = 5;
    printf("main--3: a=%d b=%d \n", a, b);

    // test block scope
    if (a > 0){
        int b = 3; // a block may shadow a variable from outside
        printf("main--4: a=%d b=%d \n", a, b);
    }
    else{
        int b = 4; // a different variable from the b in the if block
        printf("main--5: a=%d b=%d \n", a, b);
    }

    printf("main--6: a=%d b=%d \n", a, b);
}

Output:

main--1: a=1 b=4 
in fun: a=3 b=3 
main--2: a=2 b=4 
main--3: a=5 b=5 
main--4: a=5 b=3 
main--6: a=5 b=5 

From this we can derive the following rules:

  1. Variables have scopes of different sizes: external (global) variables can be accessed inside a function, while a function's local variables can only be accessed locally.
  2. A variable's scope starts after its declaration statement.
  3. Inside a function we can declare a variable with the same name as an external variable; the local variable then shadows the external one.

In addition, C has the concept of block scope. A block is a group of statements surrounded by curly braces, such as the blocks that follow if and else. Block scope behaves much like function scope: a block can access external variables, and it can shadow them with local variables.

Block scope differs between languages. For example, Java's block scope differs from C's: Java does not allow a variable declared in a block to shadow a local variable of the enclosing method. And JavaScript variables declared with var have no block scope at all (let and const, added in ES2015, are block-scoped).

These are examples of semantic differences between languages. Scope analysis is one of the tasks of semantic analysis.

Lifetime (Extent)

In the previous example program, each variable's lifetime coincides with its scope. Once execution leaves the scope, the lifetime ends and the memory occupied by the variable is released. This is the typical behavior of local variables, which are managed on the stack.

The following C code sample is different. The function fun returns a pointer to an integer. After the function returns, the local variable b disappears and the memory occupied by the pointer itself (at address &b) is reclaimed; &b points to a small area on the stack, because b was allocated on the stack. That small area stores an address that points to memory allocated on the heap. The heap memory, which actually holds the value 2, is not reclaimed when fun returns, so its lifetime is longer than the scope of b. We have to release it manually with free().

/*
extent.c
Tests lifetime (extent).
 */
#include <stdio.h>
#include <stdlib.h>

int * fun(){
    int * b = (int*)malloc(1*sizeof(int)); // allocate memory on the heap
    *b = 2;  // store the value 2 at that address
   
    return b;
}

int main(int argc, char **argv){
    int * p = fun();
    *p = 3;

    printf("after called fun: b=%lu *b=%d \n", (unsigned long)p, *p);
 
    free(p);
}

Implementing scopes and the stack

In the simple compiler we wrote earlier, we used a HashMap to record variable values and looked each value up by variable name. That no longer works once variables can live in multiple scopes. We now have to design a data structure that distinguishes variables in different scopes.
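To make the problem concrete, here is a minimal Java sketch of a single flat table. It is not the series' actual code: FlatVariableTable and its methods are hypothetical names invented for this illustration. It replays the end of scope.c, where main declares b = 5 and the if block then declares its own b = 3.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: one flat map for all variables, with no notion of scope.
class FlatVariableTable {
    private final Map<String, Integer> values = new HashMap<>();

    void declare(String name, int value) {
        values.put(name, value);
    }

    Integer lookup(String name) {
        return values.get(name);
    }

    // Replays the tail of scope.c and returns what main sees as b after the if block.
    static int outerBAfterBlock() {
        FlatVariableTable table = new FlatVariableTable();
        table.declare("b", 5);   // int b = 5; in main
        table.declare("b", 3);   // int b = 3; in the if block: overwrites main's b
        // The if block ends here, but a flat map cannot "leave" a scope.
        return table.lookup("b");
    }

    public static void main(String[] args) {
        System.out.println("b after the block: " + outerBAfterBlock());
    }
}
```

scope.c prints b=5 at the main--6 line, but the flat table returns 3: the inner declaration overwrote the outer one, and nothing restores it when the block ends.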

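If we look at the scopes of the variables in a program, we find that they actually form a tree: the global scope is the root, each function scope is its child, and each block scope is a child of the function or block that encloses it.

Object-oriented languages are different. Their scopes form not a single tree but a forest, with one tree per class, so there are no global variables. We designed the following object structure to represent scopes: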

// Variables, functions, classes, and blocks produced during compilation are all called symbols
public abstract class Symbol {
    // Name of the symbol
    protected String name = null;

    // Scope that the symbol belongs to
    protected Scope enclosingScope = null;

    // Visibility, e.g. public or private
    protected int visibility = 0;

    // AST node associated with the Symbol
    protected ParserRuleContext ctx = null;
}

// Scope
public abstract class Scope extends Symbol{
    // Members of this Scope, including variables, methods, classes, etc.
    protected List<Symbol> symbols = new LinkedList<Symbol>();
}

// Block scope
public class BlockScope extends Scope{
    ...
}

// Function scope
public class Function extends Scope implements FunctionType{
    ...  
}

// Class scope
public class Class extends Scope implements Type{
    ...
}

We currently distinguish three kinds of scope: block scope (Block), function scope (Function), and class scope (Class).

When we execute a script by walking its AST, we first need to build the tree of its scopes. Scope analysis is part of semantic analysis. In other words, having an AST is not enough to start executing: before execution we must perform semantic analysis, such as scope analysis, so that every variable reference resolves to the correct definition.
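As a rough illustration of what scope analysis has to do, the following sketch builds part of the scope tree of scope.c and resolves a name by walking from the current scope toward the root. SimpleScope and ScopeTreeDemo are hypothetical, simplified stand-ins, not the Symbol and Scope classes shown above, and the sketch ignores the rule that a variable's scope only starts at its declaration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified scope: a name, a parent link, and the variables declared in it.
class SimpleScope {
    final String name;
    final SimpleScope enclosingScope;   // null for the global scope
    private final List<String> variables = new ArrayList<>();

    SimpleScope(String name, SimpleScope enclosingScope) {
        this.name = name;
        this.enclosingScope = enclosingScope;
    }

    void declare(String variable) {
        variables.add(variable);
    }

    // Returns the innermost scope that declares the variable, or null if none does.
    SimpleScope resolve(String variable) {
        for (SimpleScope s = this; s != null; s = s.enclosingScope) {
            if (s.variables.contains(variable)) {
                return s;
            }
        }
        return null;
    }
}

class ScopeTreeDemo {
    // Builds global -> main -> if-block and resolves a name from inside the if block.
    static String resolveFromIfBlock(String variable) {
        SimpleScope global = new SimpleScope("global", null);
        global.declare("a");
        global.declare("b");
        SimpleScope mainScope = new SimpleScope("main", global);
        mainScope.declare("a");
        mainScope.declare("b");
        SimpleScope ifBlock = new SimpleScope("if-block", mainScope);
        ifBlock.declare("b");

        SimpleScope found = ifBlock.resolve(variable);
        return found == null ? null : found.name;
    }

    public static void main(String[] args) {
        System.out.println("b -> " + resolveFromIfBlock("b")); // declared in the block itself
        System.out.println("a -> " + resolveFromIfBlock("a")); // found one level up, in main
    }
}
```

Inside the if block, b resolves to the block's own declaration and a resolves to the one in main, which matches the main--4 line of the output.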

Going back to the scope.c code, the lifetime of each variable plays out as follows during execution:

  • Enter the program: the global variables come into effect one by one;
  • Enter the main function: the variables of main come into effect in order;
  • Enter the function fun: the variables of fun come into effect in order;
  • Exit the function fun: the variables of fun cease to exist;
  • Enter the if block: the variables in the if block come into effect in order;
  • Exit the if block: the variables in the if block cease to exist;
  • Exit the main function: the variables of main cease to exist;
  • Exit the program: the global variables cease to exist.

Let's look at how the stack changes at runtime.

Entering and leaving scopes during execution can be implemented with a stack. Each time execution enters a scope, a data structure called a stack frame is pushed onto the stack. The stack frame holds the values of all local variables of the current scope. When execution leaves the scope, the stack frame is popped and the variables in it cease to exist.
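The sequence of entering and leaving scopes in scope.c can be sketched with a plain stack of maps. FrameStackDemo and its methods are assumptions made for this illustration: each frame is simply a HashMap from variable name to value, and the recorded depths show the stack growing and shrinking as the program runs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical runtime stack: one frame (a map of local values) per entered scope.
class FrameStackDemo {
    private final Deque<Map<String, Integer>> stack = new ArrayDeque<>();
    final List<Integer> depthLog = new ArrayList<>();   // stack depth after each push or pop

    void enterScope() {
        stack.push(new HashMap<>());
        depthLog.add(stack.size());
    }

    void exitScope() {
        stack.pop();                 // the frame and all variables in it are discarded
        depthLog.add(stack.size());
    }

    void declare(String name, int value) {
        stack.peek().put(name, value);   // variables always go into the top frame
    }

    // Replays the scope entries and exits of scope.c and returns the depth trace.
    static List<Integer> runScopeC() {
        FrameStackDemo vm = new FrameStackDemo();
        vm.enterScope();          // program start: global frame
        vm.declare("a", 1);
        vm.declare("b", 4);
        vm.enterScope();          // enter main
        vm.enterScope();          // enter fun
        vm.declare("a", 3);
        vm.declare("b", 3);
        vm.exitScope();           // exit fun: its a and b are gone
        vm.declare("a", 5);       // main's own locals
        vm.declare("b", 5);
        vm.enterScope();          // enter the if block
        vm.declare("b", 3);
        vm.exitScope();           // exit the if block
        vm.exitScope();           // exit main
        vm.exitScope();           // program end: global frame popped
        return vm.depthLog;
    }

    public static void main(String[] args) {
        System.out.println(runScopeC()); // [1, 2, 3, 2, 3, 2, 1, 0]
    }
}
```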

Implementing block scope

With scopes and the stack in place, we can implement many features, such as letting if statements and for loops use block scope and local variables.

When the code needs the value of a variable, we first look in the current stack frame. If the variable is not found there, we look in the frame that corresponds to the enclosing scope, and so on outward.
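Here is a minimal sketch of this lookup, assuming each frame keeps a link to the frame of its enclosing scope. LinkedFrame, parentFrame, and FrameLookupDemo are hypothetical names, not taken from the series' code. The demo reconstructs the state of scope.c at the main--4 line.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stack frame: local values plus a link to the frame of the enclosing scope.
class LinkedFrame {
    final LinkedFrame parentFrame;   // null for the global frame
    private final Map<String, Integer> locals = new HashMap<>();

    LinkedFrame(LinkedFrame parentFrame) {
        this.parentFrame = parentFrame;
    }

    void declare(String name, int value) {
        locals.put(name, value);
    }

    // Looks in this frame first, then walks outward through the enclosing frames.
    int lookup(String name) {
        for (LinkedFrame f = this; f != null; f = f.parentFrame) {
            Integer value = f.locals.get(name);
            if (value != null) {
                return value;
            }
        }
        throw new IllegalStateException("undefined variable: " + name);
    }
}

class FrameLookupDemo {
    // Returns {a, b} as seen from inside the if block of main.
    static int[] lookupInsideIfBlock() {
        LinkedFrame global = new LinkedFrame(null);
        global.declare("a", 2);   // fun() has already set the global a to 2
        global.declare("b", 4);
        LinkedFrame mainFrame = new LinkedFrame(global);
        mainFrame.declare("a", 5);
        mainFrame.declare("b", 5);
        LinkedFrame ifFrame = new LinkedFrame(mainFrame);
        ifFrame.declare("b", 3);
        return new int[] { ifFrame.lookup("a"), ifFrame.lookup("b") };
    }

    public static void main(String[] args) {
        int[] ab = lookupInsideIfBlock();
        System.out.println("main--4: a=" + ab[0] + " b=" + ab[1]); // main--4: a=5 b=3
    }
}
```

Note that the link follows the enclosing scope, not the caller: a frame for fun would have the global frame as its parent even though fun is called from main.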

Implementing functions

For functions we have to consider one additional factor: parameters. Inside a function, parameters are used no differently from ordinary local variables, and at runtime they are stored in the stack frame just like local variables.

Calling a function actually involves three steps:

  • Create a stack frame;
  • Evaluate all argument values and place them in the stack frame;
  • Execute the function body from the function declaration.
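The three steps above can be sketched as follows. This is only an illustration under stated assumptions: CallFrame, SimpleFunction, and FunctionCallDemo are hypothetical names, arguments are modeled as Supplier objects standing in for unevaluated argument expressions, and the function body is modeled as a lambda that reads its values from the frame.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical frame holding the parameters and locals of one function call.
class CallFrame {
    final Map<String, Integer> values = new HashMap<>();
}

// Hypothetical function declaration: parameter names plus a body that reads the frame.
class SimpleFunction {
    final List<String> parameterNames;
    final Function<CallFrame, Integer> body;

    SimpleFunction(List<String> parameterNames, Function<CallFrame, Integer> body) {
        this.parameterNames = parameterNames;
        this.body = body;
    }

    int call(List<Supplier<Integer>> argumentExpressions) {
        if (argumentExpressions.size() != parameterNames.size()) {
            throw new IllegalArgumentException("wrong number of arguments");
        }
        // Step 1: create a stack frame for this call.
        CallFrame frame = new CallFrame();
        // Step 2: evaluate each argument and store it under the parameter's name.
        for (int i = 0; i < parameterNames.size(); i++) {
            frame.values.put(parameterNames.get(i), argumentExpressions.get(i).get());
        }
        // Step 3: execute the function body against the new frame.
        return body.apply(frame);
    }
}

class FunctionCallDemo {
    // Models the call add(x, y * 2) for a function add(a, b) that returns a + b.
    static int callAdd(int x, int y) {
        SimpleFunction add = new SimpleFunction(
                Arrays.asList("a", "b"),
                frame -> frame.values.get("a") + frame.values.get("b"));
        List<Supplier<Integer>> arguments = new ArrayList<>();
        arguments.add(() -> x);
        arguments.add(() -> y * 2);
        return add.call(arguments);
    }

    public static void main(String[] args) {
        System.out.println(callAdd(1, 2)); // 1 + 2 * 2 = 5
    }
}
```

Inside the body, the parameters a and b are read from the frame exactly like local variables. A fuller implementation would also push the frame onto the runtime stack before step 3 and pop it when the call returns.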

Summary

  • Scope analysis is part of semantic analysis.
  • A stack frame holds the values of all local variables of the current scope. When execution leaves the scope, the stack frame is popped and the variables in it cease to exist.

Reference course: Geek Time, "The Beauty of Compiler Principles"

Origin blog.csdn.net/weixin_41960890/article/details/105264833