ucc compiler analysis and summary (3): declaration checking

With the basics of the type system and symbol management covered, the next step is to analyze the declaration-checking code.

The source code is first preprocessed into a .i file. At this point the code consists of two kinds of top-level constructs: function definitions and declarations. A function definition is really just a special declaration that additionally carries a {} body. Functions and non-function declarations are checked separately:

if (p->kind == NK_Function)
{
    CheckFunction((AstFunction)p);
}
else
{
    assert(p->kind == NK_Declaration);
    CheckGlobalDeclaration((AstDeclaration)p);
}

Let's first analyze how global declarations are checked, which is what the CheckGlobalDeclaration() function does.

1 Basic type check

It starts by calling CheckDeclarationSpecifiers(decl->specs) to check the declaration specifiers. Besides the type specifiers, this also handles storage classes and qualifiers such as static and const. The resulting type is stored in specs->ty; if qualifiers are present, they are combined with the base type into a new qualified type. The core of the code is as follows:

static void CheckDeclarationSpecifiers(AstSpecifiers specs)
{
    ... ...
    // storage-class-specifier: extern, auto, static, register, ...
    tok = (AstToken) specs->stgClasses;
    if (tok)
    {
        ...
        specs->sclass = tok->token;
    }
    // type-qualifier: const, volatile
    tok = (AstToken) specs->tyQuals;
    while (tok)
    {
        qual |= (tok->token == TK_CONST ? CONST : VOLATILE);
        tok = (AstToken) tok->next;
    }
    // type-specifier: int, double, struct ..., union ..., ...
    p = specs->tySpecs;
    while (p)
    {
        // struct or union type
        if (p->kind == NK_StructSpecifier || p->kind == NK_UnionSpecifier)
        {
            ty = CheckStructOrUnionSpecifier((AstStructSpecifier) p);
            tyCnt++;
        }
        else if (p->kind == NK_EnumSpecifier) // enum type
        {
            ty = CheckEnumSpecifier((AstEnumSpecifier) p);
            tyCnt++;
        }
        else if (p->kind == NK_TypedefName) // typedef name
        {
            ...
        }
        else
        {
            // basic type
            tok = (AstToken) p;

            switch (tok->token)
            {
            case TK_SIGNED:
            case TK_UNSIGNED:
                sign = tok->token;
                signCnt++;
                break;

            case TK_SHORT:
            case TK_LONG:
                // a second "long" turns "long" into "long long"
                if (size == TK_LONG && sizeCnt == 1)
                {
                    size = TK_LONG + TK_LONG;
                }
                else
                {
                    size = tok->token;
                    sizeCnt++;
                }
                break;

            case TK_CHAR:
                ty = T(CHAR);
                tyCnt++;
                break;

            case TK_INT:
                ty = T(INT);
                tyCnt++;
                break;

            ...
            }
        }
        p = p->next;
    }
    ...

    // combine the qualifiers with the type and return
    specs->ty = Qualify(qual, ty);
    return;
}
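The counting logic above (signCnt, sizeCnt, tyCnt) can be illustrated with a small standalone sketch. This is not ucc's code: the TOK_* codes, the resolve_specifiers() name, and the use of strings as results are assumptions made for illustration, and the parts of the original that are elided with "..." are not reproduced. It only shows how counting each specifier class lets the checker reject combinations such as "short long" while still accepting "long long".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes for this sketch; they are not ucc's TK_* values. */
enum Tok { TOK_SIGNED, TOK_UNSIGNED, TOK_SHORT, TOK_LONG, TOK_CHAR, TOK_INT };

/* Resolve a specifier sequence to a type name, or NULL if the combination is illegal. */
const char *resolve_specifiers(const enum Tok *toks, int n)
{
    int sign_cnt = 0, size_cnt = 0, ty_cnt = 0;
    int sign = -1;                 /* -1: no sign given, else TOK_SIGNED or TOK_UNSIGNED */
    int longs = 0, is_short = 0, is_char = 0;

    assert(toks != NULL || n == 0);
    for (int i = 0; i < n; i++) {
        switch (toks[i]) {
        case TOK_SIGNED:
        case TOK_UNSIGNED:
            sign = toks[i];
            sign_cnt++;
            break;
        case TOK_SHORT:
            is_short = 1;
            size_cnt++;
            break;
        case TOK_LONG:
            /* A second "long" upgrades "long" to "long long" without counting twice. */
            if (longs == 1 && size_cnt == 1) {
                longs = 2;
            } else {
                longs = 1;
                size_cnt++;
            }
            break;
        case TOK_CHAR:
            is_char = 1;
            ty_cnt++;
            break;
        case TOK_INT:
            ty_cnt++;
            break;
        }
    }

    /* At most one specifier of each class may appear. */
    if (sign_cnt > 1 || size_cnt > 1 || ty_cnt > 1)
        return NULL;

    if (is_char) {
        if (size_cnt > 0)
            return NULL;           /* "short char" and "long char" are illegal */
        if (sign == TOK_UNSIGNED) return "unsigned char";
        if (sign == TOK_SIGNED)   return "signed char";
        return "char";
    }
    if (is_short)   return sign == TOK_UNSIGNED ? "unsigned short" : "short";
    if (longs == 2) return sign == TOK_UNSIGNED ? "unsigned long long" : "long long";
    if (longs == 1) return sign == TOK_UNSIGNED ? "unsigned long" : "long";
    return sign == TOK_UNSIGNED ? "unsigned int" : "int";
}
```

With this sketch, the sequence unsigned long long int resolves to "unsigned long long", while short long is rejected because two size specifiers are counted.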

2 Structure type check

When a struct or union specifier is encountered, it is handled in the CheckStructOrUnionSpecifier() function. A struct specifier can take four forms:

  1. struct Data1 // has a tag but no braces
  2. struct {int a; int b;} // no tag but has braces
  3. struct Data2{int a; int b;} // has both a tag and braces
  4. struct // neither a tag nor braces; already reported as an error during syntax analysis
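For reference, the three legal forms look like this in ordinary C. This is a minimal sketch; the names pending, anon_pair, and forms_demo() are invented for the example. It also shows why form 1 matters to the checker: a tag declared without braces introduces an incomplete type that a later declaration with braces can complete.

```c
#include <assert.h>
#include <stddef.h>

struct Data1;                         /* form 1: tag only, an incomplete type for now */
struct Data1 *pending = NULL;         /* pointers to an incomplete type are allowed */

struct { int a; int b; } anon_pair;   /* form 2: no tag, members given */

struct Data2 { int a; int b; };       /* form 3: tag and members */

/* A later declaration with braces completes the type introduced by form 1. */
struct Data1 { struct Data2 inner; };

/* Uses all three declarations and returns the sum of the stored members. */
int forms_demo(void)
{
    struct Data1 d = { { 1, 2 } };

    assert(sizeof(struct Data1) >= sizeof(struct Data2));
    anon_pair.a = 3;
    anon_pair.b = 4;
    return d.inner.a + d.inner.b + anon_pair.a + anon_pair.b;
}
```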

If the struct has a tag, the checker first looks the tag up in the symbol table. If it is not found, a new struct type is created and saved in the symbol table. If it is found but was declared with a different category (struct versus union), an error is reported. The code is as follows:

tag = LookupTag(stSpec->id);
if (tag == NULL)
{
    ty = StartRecord(stSpec->id, categ);
    tag = AddTag(stSpec->id, ty, &stSpec->coord);
}
else if (tag->ty->categ != categ)
{
    Error(&stSpec->coord, "Inconsistent tag declaration.");
}
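The lookup-then-add pattern can be sketched in isolation as follows. This is a simplified stand-in, not ucc's symbol table: the fixed-size array, the struct Tag layout, and the names lookup_tag() and declare_tag() are all assumptions made for illustration, and scoping is ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum Categ { CATEG_STRUCT, CATEG_UNION };

/* Hypothetical tag entry: only the name and the category are recorded. */
struct Tag { const char *name; enum Categ categ; };

#define MAX_TAGS 16
static struct Tag tags[MAX_TAGS];
static int tag_count;

/* Linear search by name; returns NULL if the tag was never declared. */
static struct Tag *lookup_tag(const char *name)
{
    for (int i = 0; i < tag_count; i++) {
        if (strcmp(tags[i].name, name) == 0)
            return &tags[i];
    }
    return NULL;
}

/* Returns the tag entry, creating it on first use.
   Returns NULL when the tag exists with a different category,
   which corresponds to the "Inconsistent tag declaration." error. */
struct Tag *declare_tag(const char *name, enum Categ categ)
{
    struct Tag *tag = lookup_tag(name);

    if (tag == NULL) {
        assert(tag_count < MAX_TAGS);
        tag = &tags[tag_count++];
        tag->name = name;
        tag->categ = categ;
        return tag;
    }
    return tag->categ == categ ? tag : NULL;
}
```

Declaring struct Data1 twice returns the same entry both times, whereas declaring union Data1 afterwards returns NULL.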

Next, the member declarations of the struct are checked one by one:

while (stDecl)
{
    CheckStructDeclaration(stDecl, ty);
    stDecl = (AstStructDeclaration)stDecl->next;
}

As with ordinary declarations, the declaration specifiers are checked first, and then each declarator:

CheckDeclarationSpecifiers(stDecl->specs);
...
/**
 struct Data{
 int c, d;
 }
 */
while (stDec)
{
    // struct type, member declarator, member type
    CheckStructDeclarator(rty, stDec, stDecl->specs->ty);
    stDec = (AstStructDeclarator)stDec->next;
}

Checking a member declarator is similar to checking a global declarator: both call CheckDeclarator() and DeriveType() to build the composite type, which will be analyzed later. After each member type has been constructed, AddField() is called to insert the member into the field linked list of the struct type (see section 1.2 of the previous article). The relevant code is as follows:

static void CheckStructDeclarator(Type rty, AstStructDeclarator stDec, Type fty)
{
    char *id = NULL;
    int bits = 0;

    // dec may be NULL for an unnamed bit-field, e.g. int :4;
    if (stDec->dec != NULL)
    {
        CheckDeclarator(stDec->dec);
        id = stDec->dec->id;
        fty = DeriveType(stDec->dec->tyDrvList, fty, &stDec->coord);
    }

    //...

    AddField(rty, id, fty, bits);
}
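What an AddField()-style insertion might look like is sketched below. The struct layouts and the names record_init(), add_field(), field_count(), and record_free() are assumptions for illustration rather than ucc's definitions, and the member type is reduced to a string. The point is the tail pointer: keeping a pointer to the last next slot lets each member be appended in declaration order in constant time, and an unnamed bit-field is simply a field whose id is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical field node: id is NULL for an unnamed bit-field such as "int :4;". */
struct Field {
    const char *id;
    const char *type_name;
    int bits;                 /* 0 when the member is not a bit-field */
    struct Field *next;
};

/* Hypothetical record type: tail always points at the slot for the next field. */
struct Record {
    struct Field *head;
    struct Field **tail;
};

void record_init(struct Record *r)
{
    r->head = NULL;
    r->tail = &r->head;
}

/* Append a field so that members keep their declaration order. */
struct Field *add_field(struct Record *r, const char *id,
                        const char *type_name, int bits)
{
    struct Field *f = malloc(sizeof *f);

    assert(f != NULL);
    f->id = id;
    f->type_name = type_name;
    f->bits = bits;
    f->next = NULL;
    *r->tail = f;
    r->tail = &f->next;
    return f;
}

int field_count(const struct Record *r)
{
    int n = 0;
    for (const struct Field *f = r->head; f != NULL; f = f->next)
        n++;
    return n;
}

void record_free(struct Record *r)
{
    struct Field *f = r->head;
    while (f != NULL) {
        struct Field *next = f->next;
        free(f);
        f = next;
    }
    record_init(r);
}
```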

3 Declarator check

This is done by calling the CheckDeclarator() function recursively. In the syntax tree, the identifier may be wrapped in array, pointer, and function declarators, so the recursion stops when an NK_NameDeclarator node is reached. The code is as follows:

static void CheckDeclarator(AstDeclarator dec)
{
    switch (dec->kind)
    {
    case NK_NameDeclarator:
        break;

    case NK_ArrayDeclarator:
        CheckArrayDeclarator((AstArrayDeclarator) dec);
        break;

    case NK_FunctionDeclarator:
        CheckFunctionDeclarator((AstFunctionDeclarator) dec);
        break;

    case NK_PointerDeclarator:
        CheckPointerDeclarator((AstPointerDeclarator) dec);
        break;

    default:
        assert(0);
    }
}

Take arrays as an example. CheckArrayDeclarator() first calls CheckDeclarator() recursively to check the child node, then checks the array length, and finally inserts a new node at the head of the tyDrvList linked list. The code is as follows:

//  int arr[4];
static void CheckArrayDeclarator(AstArrayDeclarator arrDec)
{
    CheckDeclarator(arrDec->dec);
    /**
     struct Data{
     ....
     int a[];    ----> legal.    when  arrDec->expr is NULL.
     }
     */
    if (arrDec->expr)
    {
        if ((arrDec->expr = CheckConstantExpression(arrDec->expr)) == NULL)
        {
            Error(&arrDec->coord,
                    "The size of the array must be integer constant.");
        }
    }

    ALLOC(arrDec->tyDrvList);
    arrDec->tyDrvList->ctor = ARRAY_OF;
    arrDec->tyDrvList->len = arrDec->expr ? arrDec->expr->val.i[0] : 0;
    arrDec->tyDrvList->next = arrDec->dec->tyDrvList;
    arrDec->id = arrDec->dec->id;
}

Pointer and function declarators are handled similarly. The following figure illustrates the parsing process of int *arr1[5] and int (*arr2)[5].

[Figure: parsing process of int *arr1[5] and int (*arr2)[5]]

After the tyDrvList linked list has been constructed, the DeriveType() function traverses it and combines it with the declared type into a composite type, whose base type points to the declared type, as shown in the figure below.

[Figure: composite type built by DeriveType()]

After the type has been constructed, the checker looks the identifier up in the symbol table. If it is not there, the identifier name and its type are added to the symbol table together.

/**
    Check for global variables
*/
if ((sym = LookupID(initDec->dec->id)) == NULL)
{
    sym = AddVariable(initDec->dec->id, ty, sclass, &initDec->coord);
}
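To make the tyDrvList and DeriveType() steps described above concrete, here is a small standalone sketch for int *arr1[5] and int (*arr2)[5]. It is a simplified model, not ucc's implementation: the struct Decl and struct Drv layouts, the function names, and the textual type description are assumptions for illustration, and function declarators are left out. Each declarator node prepends its own step to the list built by its child, and the derivation pass then applies the steps from the head of the list to the declared type.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum Ctor { POINTER_TO, ARRAY_OF };
enum Kind { NAME_DECL, POINTER_DECL, ARRAY_DECL };

/* One derivation step, playing the role of a tyDrvList node. */
struct Drv { enum Ctor ctor; int len; struct Drv *next; };

/* Simplified declarator node: dec is the inner declarator (NULL for a name). */
struct Decl {
    enum Kind kind;
    const char *id;
    int len;                  /* array length, used only by ARRAY_DECL */
    struct Decl *dec;
    struct Drv *drv;
};

/* Check the inner declarator first, then prepend this node's step to its list. */
void check_declarator(struct Decl *d)
{
    if (d->kind == NAME_DECL) {
        d->drv = NULL;        /* recursion ends at the identifier */
        return;
    }
    assert(d->dec != NULL);
    check_declarator(d->dec);

    struct Drv *step = malloc(sizeof *step);
    assert(step != NULL);
    step->ctor = (d->kind == ARRAY_DECL) ? ARRAY_OF : POINTER_TO;
    step->len = (d->kind == ARRAY_DECL) ? d->len : 0;
    step->next = d->dec->drv;
    d->drv = step;
    d->id = d->dec->id;       /* propagate the identifier upwards */
}

/* Walk the list from its head, wrapping the declared type one step at a time. */
void derive_type(const struct Drv *drv, const char *base, char *out, size_t cap)
{
    char tmp[128];

    snprintf(out, cap, "%s", base);
    for (; drv != NULL; drv = drv->next) {
        if (drv->ctor == POINTER_TO)
            snprintf(tmp, sizeof tmp, "pointer to %s", out);
        else
            snprintf(tmp, sizeof tmp, "array of %d %s", drv->len, out);
        snprintf(out, cap, "%s", tmp);
    }
}

/* Release a derivation list; inner nodes share its tail, so free it only once. */
void free_steps(struct Drv *drv)
{
    while (drv != NULL) {
        struct Drv *next = drv->next;
        free(drv);
        drv = next;
    }
}
```

For int *arr1[5] the outermost node is the pointer declarator, so the list is POINTER_TO followed by ARRAY_OF 5, and the result reads "array of 5 pointer to int". For int (*arr2)[5] the outermost node is the array declarator, so the list is ARRAY_OF 5 followed by POINTER_TO, and the result reads "pointer to array of 5 int".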

Now let's look at the semantic check of functions, which is implemented in CheckFunction(). As with declarations, the declaration specifiers (the return type) and the declarator (the function name) must be checked. DeriveType() then builds the composite function type, and the function is added to the symbol table. The function parameters were already placed in the ((FunctionType)ty)->sig->params vector by CheckFunctionDeclarator(), and the parameter list is then added to the symbol table through AddVariable(). After the declaration has been checked, the next step is to check whether the function body is legal by calling CheckCompoundStatement(func->stmt), which will be analyzed in detail in the next article.

Origin blog.csdn.net/pfysw/article/details/96564631