Review Notes on Compiler Principles, University of Electronic Science and Technology of China (4): Programming Language Design

Table of contents

foreword

Highlights

definition of language

Comparing: the generative view versus the recognition view

How should semantics be described?

symbol string

set of strings

⭐Grammar (Super Emphasis)

definition

composition

express

⭐Classification (key) 

language produced by grammar

⭐phrases, direct phrases and handles (seeking them for grammatical analysis)

⭐Syntax tree (derivation tree)

language design

Exercises for this chapter

chapter summary


foreword

These review notes are based on Mr. Zhang's lecture slides. They are meant for my own final-exam review and as a reference for my classmates.


Highlights

This chapter formally begins the core of compiler principles: the grammar part.


definition of language

  • Language = syntax (rules) + semantics (rules)
  • Syntax: the set of rules used to form programs and their components (syntactic units)
  • Semantics: the set of rules that specify the meaning of a syntactically correct syntactic unit
  • A programming language is a formal notation for describing the algorithms executed by a computer
  • Lexical rules: specify which sequences of characters form valid symbols of the language
  • Syntax rules: determine whether a sequence of symbols is a sentence, and give the sentence its structure (that is, judge whether it is legal)
  • Generative point of view: each line is called a syntax rule; the collection of rules forms a grammar, and the language is the set of all sentences the grammar can generate
  • Recognition point of view: a given language is defined with syntax diagrams; each non-terminal symbol, together with its productions, corresponds to one syntax diagram

Under the recognition point of view, the language is the set of all terminal sequences that the syntax diagrams can recognize.

Comparing: the generative view versus the recognition view

How should semantics be described?

So far there is no standard formal tool for describing semantics, and many languages still describe their semantics in natural language.

symbol string

The concept of a symbol string needs to be introduced first. It is the foundation for everything that follows; without it you may not even understand the exam questions.

basic concept

  • Suppose x and y are symbol strings, x is abc and y is def; then xy is abcdef. This operation is called the concatenation of symbol strings;
  • When the symbol string z is kk...kk, it is written z=k^n, that is, k repeated n times
  • In particular, when n=0, z=ε (the empty string, which contains no symbols); when k=pq, z=pqpq...pqpq
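
The string operations above can be tried out with a small Python sketch. Python strings stand in for symbol strings, the empty string "" plays the role of ε, and the helper name `power` is made up for these notes.

```python
def power(k: str, n: int) -> str:
    """Return k^n: the string k concatenated with itself n times."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return k * n


EPSILON = ""           # stands in for the empty string ε

x, y = "abc", "def"
xy = x + y             # concatenation of x and y

print(xy)                         # concatenation result
print(power("pq", 3))             # (pq)^3
print(power("k", 0) == EPSILON)   # k^0 is the empty string
```

Note that `power("pq", 3)` repeats the whole string pq, not just its last symbol.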

set of strings

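The concatenation (product) of two string sets A and B is AB = {xy | x ∈ A, y ∈ B}, and A^n is A concatenated with itself n times. To illustrate briefly with an example: if A = {a, b} and B = {c, d}, then AB = {ac, ad, bc, bd}.

In particular, A^0={ε}, and the concatenation of string sets is order-sensitive: AB is generally not the same set as BA.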
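
A minimal Python sketch of these set operations follows. The function names `concat` and `set_power` are hypothetical helpers chosen for illustration, and Python sets of strings stand in for string sets.

```python
def concat(A, B):
    """Concatenation of string sets: every x in A followed by every y in B."""
    return {x + y for x in A for y in B}


def set_power(A, n):
    """A^n: A concatenated with itself n times, with A^0 = {""} (that is, {ε})."""
    result = {""}
    for _ in range(n):
        result = concat(result, A)
    return result


A = {"ab", "c"}
B = {"d"}

print(concat(A, B))      # strings of A followed by strings of B
print(concat(B, A))      # a different set: order matters
print(set_power(A, 0))   # the set containing only the empty string
print(set_power(A, 2))
```

Starting the loop from `{""}` rather than from the empty set is what makes A^0 come out as {ε}.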


⭐Grammar (Super Emphasis)

definition

A grammar is a set of formal rules that describes the syntactic structure of a language.

An important property of grammars: a finite set of rules can describe an infinite language.

composition

A grammar consists of four elements: terminals, non-terminals, a start symbol, and productions. By convention we use uppercase letters for non-terminal symbols, lowercase letters for terminal symbols, and lowercase Greek letters for strings.

Productions can be abbreviated. When several productions have the same left side, their right sides can be separated by the " | " symbol, which means "or".

express

To describe the grammar, just give the set of productions directly.
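
As a rough sketch of how the four elements can be recovered from the productions alone, the Python below parses productions written with the " | " abbreviation. The grammar text is the classic textbook expression grammar, used here as an assumption (it is not necessarily the G0 on the slides); `parse_grammar` is a hypothetical helper, and every symbol is assumed to be a single character.

```python
def parse_grammar(text):
    """Parse lines like 'E -> E+T | T' into the four elements of a grammar."""
    productions = {}
    for line in text.strip().splitlines():
        lhs, rhs = line.split("->")
        alternatives = [alt.strip() for alt in rhs.split("|")]
        productions.setdefault(lhs.strip(), []).extend(alternatives)

    # Convention from the notes: uppercase letters are non-terminals.
    symbols = set(productions)
    for alternatives in productions.values():
        for alt in alternatives:
            symbols.update(alt)
    nonterminals = {s for s in symbols if s.isupper()}
    terminals = symbols - nonterminals

    # Assumption: the left side of the first production is the start symbol.
    start = next(iter(productions))
    return nonterminals, terminals, start, productions


GRAMMAR_TEXT = """
E -> E+T | T
T -> T*F | F
F -> (E) | i
"""

nonterminals, terminals, start, productions = parse_grammar(GRAMMAR_TEXT)
print(sorted(nonterminals), sorted(terminals), start)
print(productions)
```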

Example: arithmetic expression grammar G0 (the second item of the first production should be ET; the teacher's slide has a mistake):

⭐Classification (key) 

① Type 0 grammar (phrase grammar)

Productions are of the form α→β, where α is a string containing at least one nonterminal, and β is any string.

② Type 1 grammar (context-sensitive grammar)

Type 1 grammar adds a restriction to type 0: the length of the left side must not exceed the length of the right side. A non-terminal is replaced by a non-empty string inside a fixed context, so the production form is αAβ→αγβ, where γ is not empty.

It is essentially A→γ, but with the restriction that A may only be replaced when it appears between the strings α and β (its context).

③ Type 2 grammar (context-free grammar)

Productions have the form A→α. Type 2 grammar removes the context restriction: wherever the non-terminal on the left appears, it can be replaced by the string on the right. Since this kind of grammar is sufficient to describe the syntax of most programs, context-free grammars are often simply called grammars.

④ Type 3 grammar (regular grammar/right linear grammar)

Productions have the form A→α or A→αB (A and B are non-terminal symbols, and α is a string of terminals). That is, after each replacement the terminals on the left are fixed, and only the single non-terminal at the right end can be expanded further.
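
The four definitions can be turned into a simple checker. The sketch below is only an illustration under stated assumptions: symbols are single characters, uppercase letters are non-terminals, type 1 is tested with the length condition, and ε-productions are not treated specially. The names `production_type` and `grammar_type` are made up for these notes.

```python
def is_nonterminal(symbol):
    return symbol.isupper()


def production_type(lhs, rhs):
    """Return the most restrictive type (3, 2, 1 or 0) that lhs -> rhs satisfies."""
    if not any(is_nonterminal(s) for s in lhs):
        raise ValueError("the left side must contain at least one non-terminal")
    if len(lhs) == 1:
        # Single non-terminal on the left: at least type 2.
        body = rhs[:-1] if rhs and is_nonterminal(rhs[-1]) else rhs
        if not any(is_nonterminal(s) for s in body):
            return 3  # terminals, optionally followed by one non-terminal
        return 2
    if len(lhs) <= len(rhs):
        return 1      # left side is not longer than the right side
    return 0


def grammar_type(productions):
    """A grammar is only as restricted as its least restricted production."""
    return min(production_type(lhs, rhs) for lhs, rhs in productions)


print(grammar_type([("S", "aS"), ("S", "b")]))
print(grammar_type([("S", "aSb"), ("S", "ab")]))
print(grammar_type([("S", "aSBc"), ("S", "abc"), ("cB", "Bc"), ("bB", "bb")]))
print(grammar_type([("S", "aSb"), ("aS", "a")]))
```

Taking the minimum over all productions reflects that the types are nested: a single production outside type 3 is enough to keep the whole grammar out of type 3.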

language produced by grammar

Derivation and Reduction

  • Direct derivation: replace one occurrence of a production's left side with its right side; if A→γ is a production, then αAβ ⇒ αγβ
  • Derivation: ⇒* (double arrow with an asterisk) means the string on the right is obtained after zero or more direct derivations, and ⇒+ (double arrow with a plus sign) means it is obtained after at least one direct derivation
  • Reduction: the inverse of derivation
  • Leftmost/rightmost derivation: replace the leftmost/rightmost non-terminal at every step; the rightmost derivation is also called the canonical derivation
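
To make leftmost and rightmost derivations concrete, here is a small Python sketch that derives i+i*i both ways. The expression grammar is the usual textbook one and is an assumption of this example; `derive` and `run` are hypothetical helper names, and symbols are single characters.

```python
# Hypothetical expression grammar, assumed for illustration only.
PRODUCTIONS = {
    "E": ["E+T", "T"],
    "T": ["T*F", "F"],
    "F": ["(E)", "i"],
}


def derive(form, rhs, leftmost=True):
    """Apply one direct derivation to the leftmost (or rightmost) non-terminal."""
    positions = [i for i, s in enumerate(form) if s in PRODUCTIONS]
    if not positions:
        raise ValueError("no non-terminal left to replace")
    pos = positions[0] if leftmost else positions[-1]
    if rhs not in PRODUCTIONS[form[pos]]:
        raise ValueError(f"{form[pos]} -> {rhs} is not a production")
    return form[:pos] + rhs + form[pos + 1:]


def run(choices, leftmost):
    """Start from E and apply the chosen right-hand sides one step at a time."""
    forms = ["E"]
    for rhs in choices:
        forms.append(derive(forms[-1], rhs, leftmost))
    return forms


LEFTMOST = run(["E+T", "T", "F", "i", "T*F", "F", "i", "i"], leftmost=True)
RIGHTMOST = run(["E+T", "T*F", "i", "F", "i", "T", "F", "i"], leftmost=False)

print(" => ".join(LEFTMOST))
print(" => ".join(RIGHTMOST))
```

Both runs end in the same sentence; only the order in which non-terminals are replaced differs. Reading the rightmost run backwards gives the corresponding sequence of reductions.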

Sentence patterns and sentences

  • A string derived from the grammar's start symbol, which may still contain non-terminals, is a sentence pattern (sentential form)
  • A sentence pattern containing only terminal symbols is a sentence
  • Every sentence is a sentence pattern, but a sentence pattern is not necessarily a sentence
  • The set of all sentences produced by the grammar G is called the language L(G) produced by the grammar
  • Two grammars are said to be equivalent if they produce the same language
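
One way to get a feel for L(G) is to enumerate its short sentences by exhaustively deriving from the start symbol. The sketch below does this for a hypothetical grammar S → aSb | ab, chosen only for illustration. It assumes single-character symbols and no ε-productions, so that sentence patterns never shrink and long ones can be discarded; `sentences` is a made-up helper name.

```python
from collections import deque


def sentences(productions, start, max_len):
    """Return every sentence of length <= max_len derivable from start."""
    seen = {start}
    queue = deque([start])
    result = set()
    while queue:
        form = queue.popleft()
        # Find the leftmost non-terminal, if there is one.
        pos = next((i for i, s in enumerate(form) if s in productions), None)
        if pos is None:
            result.add(form)  # only terminals left: this is a sentence
            continue
        for rhs in productions[form[pos]]:
            new_form = form[:pos] + rhs + form[pos + 1:]
            # Without ε-productions a pattern never gets shorter, so prune.
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return result


GRAMMAR = {"S": ["aSb", "ab"]}
print(sorted(sentences(GRAMMAR, "S", 6), key=len))
```

The enumerated sentences suggest the pattern a^n b^n with n ≥ 1, which is the kind of answer a "find the language" question expects; the enumeration itself is evidence, not a proof.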

Example question: finding the language of a grammar (short-answer question)

 

 

⭐phrases, direct phrases and handles (seeking them for grammatical analysis)

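phrase

If S ⇒* αAδ and A ⇒+ β, then β is a phrase of the sentence pattern αβδ relative to the non-terminal A.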

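direct phrase

If S ⇒* αAδ and A ⇒ β in a single step, then β is a direct phrase of the sentence pattern αβδ relative to the non-terminal A.

To judge whether a phrase is a direct phrase, you only need to compare it against the right-hand sides of all productions, because a direct phrase is always the right-hand side of some production.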

Understanding Phrases vs. Direct Phrases

 

the handle 

The leftmost direct phrase of a sentence pattern is called the handle of the sentence pattern. Examples are as follows:

prime phrase 

A prime phrase is a phrase that contains at least one terminal and has no proper substring with the same property (that is, no proper substring of a prime phrase is itself a phrase containing a terminal).

⭐Syntax tree (derivation tree)

  • A syntax tree represents a derivation as a graph. In essence the syntax tree is the derivation process, so every sentence pattern (sentence) of the grammar has a corresponding syntax tree
  • The frontier of the derivation tree (all the leaf nodes of the syntax tree read from left to right) is the sentence pattern (sentence)
  • When some sentence has two different syntax trees, we say that the grammar is ambiguous
  • We can also determine phrases, direct phrases and handles from the derivation tree: subtrees correspond to phrases, simple subtrees (a parent and its children only) correspond to direct phrases, and the leftmost simple subtree gives the handle
  • If the syntax tree has n internal nodes, it has n subtrees and therefore n phrases (some of them may be the same string); if it has m simple subtrees, it has m direct phrases
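
The tree-based method can be sketched in Python. The tree below is a hypothetical example for the sentence pattern i+T*F under the usual expression grammar (an assumption, not taken from the slides). Internal nodes are written as (symbol, children) tuples and leaves as plain strings; the function names are made up for these notes.

```python
def frontier(node):
    """Leaves of the subtree read from left to right."""
    if isinstance(node, str):
        return node
    _, children = node
    return "".join(frontier(child) for child in children)


def phrases(node):
    """Frontier of every subtree rooted at an internal node."""
    if isinstance(node, str):
        return []
    _, children = node
    found = [frontier(node)]
    for child in children:
        found.extend(phrases(child))
    return found


def direct_phrases(node):
    """Frontiers of simple subtrees (all children are leaves), left to right."""
    if isinstance(node, str):
        return []
    _, children = node
    if all(isinstance(child, str) for child in children):
        return [frontier(node)]
    found = []
    for child in children:
        found.extend(direct_phrases(child))
    return found


# E => E+T, left E => T => F => i, right T => T*F
TREE = ("E", [("E", [("T", [("F", ["i"])])]), "+", ("T", ["T", "*", "F"])])

print(frontier(TREE))
print(phrases(TREE))
print(direct_phrases(TREE))
print("handle:", direct_phrases(TREE)[0])
```

The tree has five internal nodes, so `phrases` returns five entries, but three of them are the same string i. This is why counting internal nodes gives an upper bound on the number of distinct phrases.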

Example :

 


language design

This part is covered in the lab assignments and will not be examined.


Exercises for this chapter

 

 


chapter summary

Phrases, direct phrases and handles, prime phrases

Syntax tree concept, connection to phrases, drawing a syntax tree 

Origin blog.csdn.net/m0_59180666/article/details/130876999