A basic understanding of type systems in high-level programming languages

Type System

In my earlier article "A basic understanding of high-level programming languages" I explained how I understand high-level programming languages and stressed the importance of data types: many, perhaps most, of a language's characteristics are determined by its type system. This article discusses my understanding of the type systems of high-level programming languages. Through its type system we can learn some of the basic characteristics of a language.

It is therefore essential to build up a kind of "type thinking" when learning a high-level programming language.

When we describe the type system of a language, we usually use four terms: strongly typed, weakly typed, statically typed and dynamically typed. The explanations of these four terms found on the Internet are messy, and many of them are plainly wrong. For instance, some people explain why Python is classified as a strongly typed language with the following example:

a = 1
b = "2"
c = a + b  # error

However, Java, which is also considered a strongly typed language, does not report an error when a string and a number are added, so the explanation above is not accurate.

In this article I will explain how I understand these concepts and introduce a few notions of my own to round out my "type thinking".

Static type and dynamic type

In "A basic understanding of high-level programming languages" I described the significance of data types in a high-level programming language. A high-level language typically defines a variety of basic data types, and each of these types actually stands for a fixed length of memory together with a way of handling the data stored in that memory.

Here we also need to know a little about how a program runs. At run time a program has a base address, and the CPU addresses data and instructions through a base address + offset mechanism; that is how the program runs (this is, of course, a simplified summary, and the actual situation is far more complicated).
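The base address + offset idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bytearray stands in for memory, and the names BASE, OFFSET_COUNT and OFFSET_RATIO are hypothetical values chosen for the example, not anything a real program exposes.

```python
import struct

# A bytearray stands in for the program's memory (illustration only).
memory = bytearray(32)

BASE = 8          # hypothetical base address of a data region
OFFSET_COUNT = 0  # a 4-byte integer is stored at BASE + 0
OFFSET_RATIO = 4  # an 8-byte float is stored at BASE + 4

# "<" selects little-endian standard sizes without padding:
# "i" is a 4-byte integer, "d" is an 8-byte floating-point number.
struct.pack_into("<i", memory, BASE + OFFSET_COUNT, 42)
struct.pack_into("<d", memory, BASE + OFFSET_RATIO, 0.5)

# Reading a value back needs the base, the offset and the type,
# because the type fixes both the length and the decoding rule.
count = struct.unpack_from("<i", memory, BASE + OFFSET_COUNT)[0]
ratio = struct.unpack_from("<d", memory, BASE + OFFSET_RATIO)[0]

print(count, ratio)                                   # 42 0.5
print(struct.calcsize("<i"), struct.calcsize("<d"))   # 4 8
```

The point of the sketch is that a type is what tells the reader of memory how many bytes to take and how to interpret them.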

We also know that code written in a high-level programming language must go through a conversion before it can run on a computer. Broadly speaking, high-level languages fall into two categories: interpreted and compiled (again, the actual situation is more complicated).

Following this analysis, we need to be clear about two concepts: static and dynamic. We generally call the interpretation/compilation stage the static phase and the run stage the dynamic phase.

Static type

With this understanding, static typing means that the type of a variable (and therefore its offset) is finalized in the static phase, so the offset of every variable can be determined before the program runs. A statically typed language therefore needs the following features:

  • Variables must be declared before they are used.
  • A variable declaration either specifies the type explicitly or relies on type inference from the initializer, and once the type of the variable is determined it never changes.
  • Data types are defined at a fairly fine granularity: instead of a single numeric type "number", the language defines integer types, floating-point types and so on, so that different offsets are supported.

Static type checking

Here I need to introduce a concept of my own: static type checking. What is it? It means that the compiler or interpreter checks the type of every variable in the static phase and does not allow one variable to hold values of different types. To store a value of another type, either the types must be compatible or the value must first be converted to the type of the variable. The operands of operators and the arguments of functions are type-checked in the same way [1]

Why introduce this term? Because static type checking is mandatory for a statically typed language. Such a language must determine the offset of each variable in the static phase, so the type of a variable must not change; otherwise the offset could not be determined in the static phase.

Some languages are dynamically typed and yet can be given static type checking through tooling; the TypeScript language is a tool of this kind.
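Python offers a rough analogue of this arrangement, so here is a minimal sketch in Python rather than TypeScript. It assumes that an external checker such as mypy is run separately over the file; the interpreter itself does not enforce the annotations. The names total_length, count and label are made up for the example.

```python
from typing import List


def total_length(words: List[str]) -> int:
    """Return the combined length of all words."""
    return sum(len(word) for word in words)


count: int = total_length(["type", "system"])
print(count)  # 10

# The annotation below is wrong, yet the interpreter runs it without complaint.
# A static checker such as mypy would report the mismatch before the program runs.
label: int = "ten"
print(type(label).__name__)  # str
```

In other words, the static check lives in the tool, not in the language runtime, which is why it can be added to a dynamically typed language.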

Dynamic type

Dynamically typed languages are more flexible to write than statically typed ones, because the type of a variable does not have to be determined in the static phase; the type (offset) of a variable is determined only at run time. This means that the features listed above for static typing are not required for dynamic typing.
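A minimal Python sketch of this flexibility: the same name is bound to values of three different types in turn, and the type is only known by inspecting the value at run time. The variable names are arbitrary.

```python
# No declaration is needed, and the same name may refer to values of
# different types over its lifetime.
value = 1
first_type = type(value).__name__    # "int"

value = "one"
second_type = type(value).__name__   # "str"

value = [1, 2, 3]
third_type = type(value).__name__    # "list"

print(first_type, second_type, third_type)  # int str list
```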

Dynamic type checking (run-time type checking, RTTI)

Here I need to introduce another notion, dynamic type checking, which can also be called run-time type checking (RTTI). RTTI is necessary in dynamically typed and statically typed languages alike, because data type conversion is sometimes unavoidable.

Note, however, that RTTI mostly checks the type of a value rather than the type of a variable, and that in a dynamic, strongly typed language all type checking takes place in the dynamic phase.
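As a small illustration of checking the type of a value at run time before converting it, here is a sketch using Python's built-in isinstance. The helper to_int is hypothetical and its rules (rejecting bool, truncating floats) are choices made for the example.

```python
def to_int(value):
    """Convert a value to int after inspecting its run-time type."""
    # bool is a subclass of int in Python, so it is tested first.
    if isinstance(value, bool):
        raise TypeError("bool is not accepted here")
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        return int(value)          # truncates toward zero
    if isinstance(value, str):
        return int(value.strip())  # raises ValueError for non-numeric text
    raise TypeError(f"cannot convert {type(value).__name__} to int")


print(to_int(7), to_int(3.9), to_int(" 42 "))  # 7 3 42
```

The checks look at the value that was passed in; the parameter name itself carries no type.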

Static type and dynamic type compared

[Image: a JavaScript snippet and a C++ snippet that implement the same functionality]

The JavaScript above is dynamically typed, whereas C++ is statically typed. The two pieces of code implement the same functionality; let us analyze the difference between dynamic and static typing in terms of memory allocation.

[Image: memory layout of the objects in the JavaScript and C++ versions]

As you can see, JavaScript has to compute the offset of each property dynamically at run time, and every object must maintain its own property-offset information. C++, by contrast, has already determined the offset of each property at compile time, so only one copy of the offset information needs to be kept.
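The contrast can be loosely imitated inside Python. This is only an analogy and not a description of how JavaScript engines or C++ compilers work: an ordinary instance keeps its attributes in a per-object dictionary, while a class with __slots__ fixes the set of attributes once for the whole class. The class names DynamicPoint and FixedPoint are made up for the example.

```python
class DynamicPoint:
    """Ordinary class: every instance carries its own attribute dictionary."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class FixedPoint:
    """Class with __slots__: the attribute layout is fixed once per class."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


d = DynamicPoint(1, 2)
d.z = 3                      # new attributes can appear at run time
print(d.__dict__)            # {'x': 1, 'y': 2, 'z': 3}

f = FixedPoint(1, 2)
try:
    f.z = 3                  # there is no slot named z
    added = True
except AttributeError:
    added = False
print(added)                 # False
print(hasattr(f, "__dict__"))  # False: no per-object attribute table
```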

Static types also make code self-documenting: because a statically typed language necessarily has static type checking, writing code with the help of an IDE becomes more convenient.

Strongly typed and weakly typed

The concepts of strongly typed and weakly typed have no precise definition, and the explanations of them found on the Internet are rather messy. Here I give a plain explanation based on my own understanding.

The most basic parts of a programming language are its type system, operators, keywords and statements; with these basic parts we can implement programs of any complexity. All the data we manipulate is ultimately handled through these basic parts, so my discussion of strong and weak typing focuses mainly on operators.

As we know, when a programming language defines an operator, it defines the number of operands the operator takes and the data types it expects those operands to have. The discussion below builds on this.

Strongly typed

The hallmark of strong typing is that when the data types an operator receives do not match the types specified in the operator's definition, an error is reported directly. A statically typed language, or a dynamically typed language with static type checking, reports the error in the static phase; a dynamically typed language without static type checking reports it at run time. A language that insists on data types in this way, or is this sensitive to them, is what we call strongly typed. Typical strongly typed languages include Java and Python. Several operator definitions from Python are given below:

The binary arithmetic operations have the conventional priority levels. Note that some of these operations also apply to certain non-numeric types (the description of each operator states which non-numeric types it supports). Apart from the power operator, there are only two levels, one for multiplicative operators and one for additive operators:

The * (multiplication) operator yields the product of its arguments. The arguments must either both be numbers, or one argument must be an integer and the other must be a sequence. In the former case, the numbers are converted to a common type and then multiplied together. In the latter case, sequence repetition is performed; a negative repetition factor yields an empty sequence.

...

The + (addition) operator yields the sum of its arguments. The arguments must either both be numbers or both be sequences of the same type. In the former case, the numbers are converted to a common type and then added together. In the latter case, the sequences are concatenated.

The - (subtraction) operator yields the difference of its arguments. The numeric arguments are first converted to a common type (numeric values form one category, so an integer and a floating-point number may meet in the same operation at run time).

This is how the official Python documentation defines binary operators such as +, which explains why the expression 1 + "2" does not work in Python. Also consider the following code:

a = "123"
b = 1
print(a/b)
Traceback (most recent call last):
  File "H:/Python/Hello/src/Hello.py", line 3, in <module>
    print(a/b)
TypeError: unsupported operand type(s) for /: 'str' and 'int'

As you can see, an exception is reported when a value of type str is divided by a number.
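The operator rules quoted above can be tried directly. The following is a small sketch; the helper try_binary is hypothetical and exists only to turn a TypeError into a printable string.

```python
import operator


def try_binary(op, left, right):
    """Apply a binary operator and report a TypeError instead of raising it."""
    try:
        return op(left, right)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Both operands are numbers: they are converted to a common type first.
print(1 + 2.0)       # 3.0
# An integer and a sequence: repetition; a negative factor gives an empty sequence.
print("ab" * 3)      # ababab
print([0] * -1)      # []
# Two sequences of the same type: concatenation.
print([1] + [2])     # [1, 2]

# Operand types the operator does not accept are rejected, not converted.
number_plus_text = try_binary(operator.add, 1, "2")
tuple_plus_list = try_binary(operator.add, (1,), [2])
print(number_plus_text)
print(tuple_plus_list)

# The conversion has to be written explicitly.
print(1 + int("2"))  # 3
```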

Weakly typed

Weak typing shows a higher tolerance for the data types of the operands. If the types an operator receives do not match the types in its definition, no error is reported directly. Instead, the language first tries to convert the received value to the type the operator expects and then performs the operation; only if the conversion fails is an exception finally thrown. We say that such a language has weak type requirements, or is insensitive to types.

Such conversions generally happen at run time; of course, we can also cast a value explicitly in the code.
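Python does not behave this way itself, but the mechanism described above (try to convert, operate, and fail only when the conversion fails) can be imitated with a small function. This is a sketch under that assumption: weak_subtract is a hypothetical helper, and coercing everything to float is a simplification, not the rule of any particular language.

```python
def weak_subtract(left, right):
    """Subtract after trying to coerce both operands to float.

    Imitates a weakly typed operator: conversion is attempted first,
    and an error is raised only when the conversion itself fails.
    """
    try:
        return float(left) - float(right)
    except (TypeError, ValueError) as exc:
        raise TypeError(f"cannot coerce operands: {left!r}, {right!r}") from exc


print(weak_subtract("3", 1))        # 2.0
print(weak_subtract("2.5", "0.5"))  # 2.0

try:
    weak_subtract("abc", 1)
except TypeError as exc:
    print("conversion failed:", exc)
```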

JavaScript, for example, is full of implicit type conversions, and the ES specification defines a large number of abstract operations to carry out these conversions. I will say more about this in the JavaScript part of this series.

Strongly typed and weakly typed compared

  • Programs written in strongly typed languages are more robust and more readable, while operators in weakly typed languages are more flexible to use but demand more control from the programmer.

Doubts about the C language

C is classified as a static, weakly typed language. By the reasoning above this classification holds up; consider the following code:

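#include <stdio.h>

int main(void) {
    int a = 0;
    double b = 0;
    // "123" decays to a char *, so this is pointer arithmetic whose result is
    // squeezed into an int. Compilers that accept it report no error (at most
    // a warning), but the behavior is odd. I have not studied it closely, yet
    // it is enough to show that C is a weakly typed language.
    a = "123" - 3;
    printf("sizeof a: %zu", sizeof(a)); // 4: the data type (offset) has not changed, so the typing is static
    printf("\n");
    printf("valueof a: %d", a);
    printf("\n");
    printf("sizeof b: %zu", sizeof(b)); // 8
    return 0;
}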

Truthy and falsy values

Boolean is one of the most frequently used data types in any high-level programming language, because the result of every logical test is represented by a value of Boolean type. For coding convenience, many languages have introduced the concepts of truthy values and falsy values.

If a value that is not of type bool becomes true when converted to bool, we call it a truthy value; conversely, if a value that is not of type bool becomes false after the conversion, we call it a falsy value. In effect, every data type has values equivalent to true and false, and our conditional logic can rely on this property of truthy and falsy values.

In a dynamically typed language, moreover, the introduction of truthy and falsy values makes the logical operators more powerful.
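Python is one language with this behavior, so it serves for a short sketch. The lists of sample values and the variable names are chosen only for illustration.

```python
# Values of several types that convert to False.
falsy_values = [0, 0.0, "", [], {}, None]
print([bool(v) for v in falsy_values])   # [False, False, False, False, False, False]

# Non-empty or non-zero values convert to True, including the string "0".
truthy_values = [1, -1, "0", [0], {"k": None}]
print([bool(v) for v in truthy_values])  # [True, True, True, True, True]

# Logical operators return one of their operands, not necessarily a bool,
# which is what makes them more powerful in a dynamically typed language.
name = "" or "anonymous"       # falls back to a default value
first = [] and "unreachable"   # short-circuits on the falsy left operand
print(name, first)             # anonymous []

items = []
if not items:                  # a condition may test the value directly
    print("no items")
```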

Summary

The above is my basic understanding of the type systems of high-level programming languages. With this understanding, you know what to look for when learning a new language, which can save a lot of time.

This is purely my personal view; if you see things differently, you are welcome to discuss.


  1. Many people may think this is a feature of strong typing, but from my analysis of several languages that view is biased: Python, for example, is a dynamic, strongly typed language, yet a variable can hold values of different types. ↩︎


Origin juejin.im/post/5d5050e6e51d4562165534c1