Compiler Principles: Runtime Environments

Runtime storage organization overview

The compiler translates the algorithm description and the data description of the source program into machine object code and data storage units, respectively, and finally produces the target program.

When the target program runs in the target machine environment, it always runs in its own runtime storage space. Under an operating system, the target program runs and stores its data within its own logical address space. When generating code, the compiler is responsible for deciding how each kind of object is laid out in the logical address space and how the target code uses that space at run time.

During compilation, the address assigned to an object of the source program is usually an offset relative to the runtime storage space. Objects are accessed with "base address + offset" addressing, so any available region of memory can be chosen as the storage area when the target program runs. Object code generated this way is called floating address (relocatable) code.
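As a rough illustration of "base address + offset" addressing, here is a minimal Python sketch. The object names, their sizes, and the helper functions `assign_offsets` and `address_of` are made up for this example, and alignment is ignored.

```python
# Hypothetical compile-time layout: every object gets an offset, never an
# absolute address. Names and sizes below are invented for illustration.
def assign_offsets(objects):
    """Assign consecutive offsets to (name, size) pairs, starting at 0."""
    offsets, next_free = {}, 0
    for name, size in objects:
        offsets[name] = next_free
        next_free += size
    return offsets, next_free  # the layout and the total size of the data area


def address_of(base, offsets, name):
    """Run-time address = base address + compile-time offset."""
    return base + offsets[name]


offsets, total = assign_offsets([("count", 4), ("ratio", 8), ("buf", 16)])

# The same offsets work wherever the storage area happens to be placed.
print(hex(address_of(0x1000, offsets, "ratio")))
print(hex(address_of(0x8000, offsets, "ratio")))
```

Only the base changes between the two calls; the offsets fixed at "compile time" stay the same, which is what makes the code position independent.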

Note:
"Base address" refers to the starting address of the runtime storage space.

Runtime storage allocation strategies

While it works, the compiler must allocate runtime storage space for data objects.

  • For data objects whose size can be determined at compile time, storage can be allocated at compile time. This strategy is called static storage allocation
  • If the size of a data object cannot be fully determined at compile time, a dynamic storage allocation strategy is used. The compiler only generates the necessary information, and the storage for the data object is allocated at run time. There are two forms:
    • Stack storage allocation
    • Heap storage allocation
  • Here, "static" and "dynamic" correspond to compile time and run time, respectively

Tasks and role of runtime storage organization

The code generated by the compiler usually has a fixed size and is generally stored in a dedicated area, the code area.
While the target program runs, the data objects it creates and accesses are stored in the data area.

[Figure: layout of storage space while the program is running]

Activation records

In a language where procedures (or functions, or methods) are the unit of user-defined action, the compiler usually allocates storage in units of procedures.
Each execution of a procedure body is called an activation (activity) of that procedure.
Each time a procedure is executed, a contiguous storage area is allocated to hold the information needed for that one execution. This contiguous area is called an activation record (also translated as activity record).

General form of an activation record

An activation record generally contains the following fields:

  • Actual parameters
  • Return value
  • Control link: points to the caller's activation record
  • Access link: used to reach non-local data stored in other activation records
  • Saved machine state
  • Local data
  • Temporaries
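To make the field list concrete, the record could be modeled roughly as follows. This is only a sketch: the class name, the field names, and the extra `procedure` label are assumptions for illustration, and a real compiler lays these fields out as raw stack memory rather than as an object.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ActivationRecord:
    """One record per procedure execution; all names here are illustrative."""
    procedure: str                                      # label for readability only
    actual_params: list = field(default_factory=list)
    return_value: Any = None
    control_link: Optional["ActivationRecord"] = None   # the caller's record
    access_link: Optional["ActivationRecord"] = None    # path to non-local data
    saved_state: dict = field(default_factory=dict)     # e.g. the return address
    local_data: dict = field(default_factory=dict)
    temporaries: list = field(default_factory=list)


main_record = ActivationRecord("main")
callee_record = ActivationRecord(
    "f",
    actual_params=[3],
    control_link=main_record,
    saved_state={"return_address": 42},
)
print(callee_record.control_link.procedure)
```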

Stack storage allocation

For languages that use procedures, functions, or methods as the unit of user-defined action, almost all compilers manage (at least part of) the runtime storage as a stack. This is called stack storage allocation.

  • When a procedure is called, its activation record is pushed onto the stack; when the procedure ends, the record is popped
  • This arrangement not only lets procedure calls whose lifetimes do not overlap share space, it also lets the code for a procedure be compiled so that:
    • The relative addresses of its non-local variables are always the same, regardless of the sequence of procedure calls
Activation tree

The tree that describes how control enters and leaves activations while the program runs is called an activation tree.

  • Each node in the tree corresponds to one activation. The root is the activation of the main procedure that starts the program
  • At a node representing an activation of procedure p, the children correspond to the activations of the procedures called by this activation of p. They are shown from left to right in the order in which they are called. The activation of a child node must end before the activation of its right sibling begins

Control stack

  • Each live activation has an activation record on the control stack
  • The activation at the root of the activation tree is at the bottom of the stack
  • The record of the activation where program control currently resides (the current activation) is at the top of the stack
  • The sequence of activation records on the stack corresponds to the path in the activation tree from the root to the node of the current activation
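The relation between the tree and the stack can be observed by tracing a small recursive program. The sketch below is an assumption-laden toy: `enter`, `leave`, `Node`, and the choice of `fib(3)` as the traced program are all invented for the example.

```python
class Node:
    def __init__(self, label):
        self.label = label
        self.children = []


control_stack = []   # nodes of live activations, root at index 0
snapshots = []       # control stack contents at each activation entry
root = None


def enter(label):
    """Record a new activation as a child of the current one and push it."""
    global root
    node = Node(label)
    if control_stack:
        control_stack[-1].children.append(node)  # children kept in call order
    else:
        root = node
    control_stack.append(node)
    snapshots.append([n.label for n in control_stack])


def leave():
    """The current activation ends: pop its record."""
    control_stack.pop()


def fib(n):
    enter(f"fib({n})")
    value = n if n < 2 else fib(n - 1) + fib(n - 2)
    leave()
    return value


def main():
    enter("main")
    value = fib(3)
    leave()
    return value


def show(node, depth=0):
    """Print the activation tree with indentation."""
    print("  " * depth + node.label)
    for child in node.children:
        show(child, depth + 1)


result = main()
show(root)
print(snapshots[3])  # the stack when the fourth activation was entered
```

Each snapshot lists exactly the nodes on the path from `main` down to the activation being entered, and a sibling such as `fib(0)` only appears after `fib(1)` has been popped.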

Some principles for designing activation records

Values passed between the caller and the callee are generally placed at the beginning of the callee's activation record, so that they are as close as possible to the caller's activation record.
Fixed-length items are placed in the middle: the control link, the access link, and the machine status fields.
Items whose size is not known early enough are placed at the end of the activation record, at the top of the stack.
The pointer register top_sp points to the position where the local data of the activation record begins, and this position is used as the base address.

Calling sequences and return sequences

Both procedure calls and procedure returns require executing some code that manages the activation record stack, saves or restores the machine state, and so on.

  • Calling sequence: the code that implements a procedure call. It allocates space on the stack for an activation record and fills in the fields of that record
  • Return sequence: restores the machine state so that the calling procedure can continue executing after the call ends
  • The code in a calling sequence is usually split between the calling procedure (caller) and the called procedure (callee). The same goes for the return sequence

Calling sequence

  • The caller evaluates the actual parameters
  • The caller stores the return address (the value of the program counter) in the callee's machine status field and stores the old value of top_sp in the callee's control link. It then increments top_sp so that it points to the position where the callee's local data begins
  • The callee saves the register values and other status information
  • The callee initializes its local data and begins execution

Return sequence

  • The callee places the return value next to the parameters
  • Using the information in the machine status field, the callee restores top_sp and the other registers, then jumps to the return address that the caller placed in the machine status field
  • Although top_sp has been decremented (restored), the caller still knows where the return value is relative to the current value of top_sp (it lies in the next activation record; although that record has been popped by this point, its data is still valid). The caller can therefore use the return value
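The two sequences can be walked through on a toy stack. The sketch below assumes a flat array of cells, a made-up caller frame size `CALLER_FRAME`, and a record layout of parameters, return value, control link, then machine status; register saving is left out, and none of these choices come from a real calling convention.

```python
stack = [None] * 64   # runtime stack as a flat array of cells
top_sp = 8            # pretend the caller's local data starts at cell 8
CALLER_FRAME = 4      # hypothetical size of the caller's locals and temporaries


def call(args, return_address, body):
    """Run the calling sequence, the callee body, and the return sequence."""
    global top_sp
    # Caller's part of the calling sequence.
    base = top_sp + CALLER_FRAME       # first cell of the callee's record
    for i, arg in enumerate(args):     # evaluate and store actual parameters
        stack[base + i] = arg
    ret_slot = base + len(args)        # return value sits next to the parameters
    control_link = ret_slot + 1
    status = ret_slot + 2
    stack[status] = return_address     # return address -> machine status field
    stack[control_link] = top_sp       # old top_sp -> control link
    top_sp = status + 1                # top_sp -> start of callee's local data

    # Callee's part: (register saving omitted) execute the body.
    value = body(stack[base:base + len(args)])

    # Callee's part of the return sequence.
    stack[ret_slot] = value            # return value next to the parameters
    resume_at = stack[top_sp - 1]      # return address from the status field
    top_sp = stack[top_sp - 2]         # restore top_sp from the control link

    # Caller's part: the popped record still holds the return value.
    return stack[top_sp + CALLER_FRAME + len(args)], resume_at


value, resume_at = call([2, 5], 100, lambda params: params[0] * params[1])
print(value, resume_at, top_sp)
```

After the call, `top_sp` is back at its old value, yet the caller can still read the result at a known offset above it, as the last bullet describes.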

Storage allocation for variable-length data

In modern programming languages, objects whose size cannot be determined at compile time are allocated in the heap. However, if they are local objects of a procedure, they can also be allocated on the runtime stack. The reason to place objects on the stack whenever possible is that it avoids garbage-collecting their space, which reduces the associated overhead.

The stack can be used to allocate space for a data object only if the object is local to a procedure and becomes inaccessible when that procedure ends.

Non-local data access

Besides its own local data, a procedure can also use non-local data defined outside the procedure.

Languages can be divided into two kinds:

  • Languages that support nested procedure declarations
    • A procedure can be declared inside another procedure
    • Besides its own local data and globally defined data, a procedure can also use objects declared in enclosing procedures
      Example: Pascal
  • Languages that do not support nested procedure declarations
    • A procedure cannot be declared inside another procedure
    • The data used in a procedure is either its own local data or global data defined outside all procedures
      Example: C

Data access without nested procedure declarations

Storage allocation and access of variables

  • Global variables are allocated in the static area and are accessed through statically determined addresses
  • Every other variable must be a local variable of the activation at the top of the stack, and can be accessed through the top_sp pointer of the runtime stack

Data access with nested procedure declarations

Nesting depth

  • Nesting depth of procedures
    • A procedure that is not nested inside any other procedure has nesting depth 1
    • If procedure p is defined inside a procedure of nesting depth i, p has nesting depth i + 1
  • Nesting depth of variables
    • The nesting depth of a variable is the nesting depth of the procedure in which it is declared

Access links

Static scope rule: procedure b can access objects declared in procedure a as long as the declaration of b is nested within the declaration of a.

A pointer called the access link can be added to each activation record so that a nested procedure can access objects declared in its enclosing procedures.

  • If procedure b is directly nested within procedure a in the source code (b's nesting depth is one more than a's), then the access link in any activation of b points to the most recent activation of a.

Establishing access links

The code that establishes the access link is part of the calling sequence.

Assume that procedure x with nesting depth nx calls procedure y with nesting depth ny (x->y).

  • Case nx < ny (an outer procedure calls an inner one)
    • y must be defined directly within x (for example: s->q, q->p), so ny = nx + 1
    • Add a step to the calling sequence: store a pointer to x's activation record in y's access link
  • Case nx = ny (a call at the same level)
    • Recursive call (for example: q->q)
    • The access link of the callee's activation record is the same as that of the caller's activation record, so it can be copied directly
  • Case nx > ny (an inner procedure calls an outer one, for example: p->e)
    • x must be nested within some procedure z, and y is defined directly within z
    • Starting from x's activation record, following the access links for nx - ny + 1 steps reaches the activation record of z closest to the top of the stack. y's access link must point to this activation record of z
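The three cases can be sketched in a few lines. The classes `Proc` and `Record` and the function `call` are hypothetical, and the nesting used in the example (q and e declared in s, p declared in q) is an assumption about the missing figure, chosen to match the calls named in the list.

```python
class Proc:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # enclosing procedure, or None at the outermost level
        self.depth = 1 if parent is None else parent.depth + 1


class Record:
    def __init__(self, proc, access_link):
        self.proc = proc
        self.access_link = access_link


def call(caller, callee_proc):
    """Create the callee's record and set its access link."""
    nx, ny = caller.proc.depth, callee_proc.depth
    if nx < ny:
        # Outer calls inner: y must be declared directly in x.
        assert callee_proc.parent is caller.proc
        link = caller
    else:
        # nx >= ny: follow nx - ny + 1 access links from the caller's record.
        # For nx == ny this is one step, i.e. a copy of the caller's link.
        link = caller
        for _ in range(nx - ny + 1):
            link = link.access_link
    return Record(callee_proc, link)


# Assumed nesting: s contains q and e; q contains p.
s = Proc("s")
q = Proc("q", s)
p = Proc("p", q)
e = Proc("e", s)

rs = Record(s, None)   # activation of the outermost procedure
rq1 = call(rs, q)      # s->q, nx < ny
rq2 = call(rq1, q)     # q->q, nx = ny
rp = call(rq2, p)      # q->p, nx < ny
re = call(rp, e)       # p->e, nx > ny
print(rq2.access_link.proc.name, rp.access_link.proc.name, re.access_link.proc.name)
```

Note that the nx = ny case does not need separate code here: one step along the caller's access link yields the same record that a direct copy would.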

Heap storage allocation

Heap storage allocation divides a contiguous storage area into blocks, which are handed out when activation records or other objects need them. Blocks can be released in any order, so after a while the area may contain interleaved blocks that are in use and blocks that have been freed.

Allocation

Assume that the total length of the current free blocks is M and the requested length is n.

  • If there are storage blocks of length greater than or equal to n, allocation can follow one of these strategies
    • Take the first free block of length m that satisfies the request, and put the remaining part of length m - n back on the free list (first fit)
    • Take the smallest free block of length m that satisfies the request (best fit)
    • Take the largest free block of length m that satisfies the request (worst fit)
  • If no storage block has length greater than or equal to n
    • If M >= n, move and compact the free blocks in the heap (every affected reference must be updated accordingly, which is a complicated and difficult task)
    • If M < n, a more complex strategy is needed to solve the heap management problem
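The three selection strategies differ only in which fitting block they pick. A minimal sketch, assuming the free list is a Python list of `(start, length)` pairs and using hypothetical helper names `choose_block` and `allocate`; compaction is not implemented.

```python
def choose_block(free_list, n, strategy="first"):
    """Return the index of the free block to use, or None if none fits.

    free_list holds (start, length) pairs; n is the requested length.
    """
    fitting = [i for i, (_, m) in enumerate(free_list) if m >= n]
    if not fitting:
        return None
    if strategy == "first":
        return fitting[0]
    if strategy == "best":
        return min(fitting, key=lambda i: free_list[i][1])
    if strategy == "worst":
        return max(fitting, key=lambda i: free_list[i][1])
    raise ValueError(f"unknown strategy: {strategy}")


def allocate(free_list, n, strategy="first"):
    """Carve n cells out of a free block; return its start address or None."""
    i = choose_block(free_list, n, strategy)
    if i is None:
        return None  # would need compaction (M >= n) or another policy (M < n)
    start, m = free_list[i]
    if m == n:
        del free_list[i]
    else:
        free_list[i] = (start + n, m - n)  # the remainder of length m - n stays free
    return start


blocks = [(0, 30), (50, 10), (100, 60)]
for strategy in ("first", "best", "worst"):
    print(strategy, allocate(list(blocks), 10, strategy))
```

With this free list, a request of length 10 is served from address 0 by first fit, from address 50 by best fit, and from address 100 by worst fit.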

Release

Simply insert the released storage block into the free list as a new free block and delete the corresponding entry from the table of occupied blocks.
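As a sketch of that step, assume the occupied-block table is a dict from start address to length and the free list is kept sorted by address; both representations, and the function name `release`, are choices made for this example. Adjacent free blocks are not merged here.

```python
import bisect


def release(free_list, occupied, start):
    """Move the block beginning at start from the occupied table to the free list."""
    length = occupied.pop(start)                # delete the occupied-block entry
    bisect.insort(free_list, (start, length))   # keep the free list ordered by address


free_list = [(0, 10), (80, 20)]
occupied = {10: 30, 40: 40}   # start address -> length
release(free_list, occupied, 40)
print(free_list)
print(occupied)
```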

Summary

Implementing heap storage management requires a large number of auxiliary operations, such as sorting, table lookup, table filling, insertion, and deletion. Its space and time overhead is therefore relatively large.

References

"Principles of Compilation" Second Edition

Origin blog.csdn.net/u010523811/article/details/124859256