"C Traps and Pitfalls" - Chapter 4 The Linker

4.1 What is a linker

A typical linker combines several object modules generated by a compiler or assembler into a single entity, called a load module or executable, that the operating system can execute directly.

The linker usually sees an object module as a set of external objects. Each external object represents some part of the machine's memory and is identified by an external name. Therefore, every function and every external variable in the program, if not declared static, is an external object. Some C compilers also turn static functions and static variables into external objects but change their names. Because of this name mangling, they do not collide with functions or variables of the same name in other source files.

4.2 Declarations and Definitions

extern int a;

The code above is a declaration of a, not a definition: it says that a is an external integer variable whose storage is allocated elsewhere.

Note: The scope of an extern declaration depends on where it appears. If it appears outside any function, the name is visible from that point to the end of the file, just like a global variable defined there, and a local variable of the same name still takes priority inside its own block. If it appears inside a function, the name has the same scope as a local variable declared at that point. Only visibility is affected; the lifetime of the object does not change, so there is nothing to discuss about its lifetime here.

int a;
extern int a;

The definition and the declaration above can be either in the same source file or in different source files of the program.
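As a minimal sketch of the difference, the single file below declares a before it is used and defines it only at the end. It also shows a block-scope extern declaration and a local variable hiding the external one. The function names read_a, read_a_in_block, and read_local are hypothetical, chosen only for illustration.

```c
extern int a;          /* declaration: a exists somewhere, no storage allocated here */

int read_a(void)
{
    return a;          /* legal: a has been declared, though not yet defined */
}

int read_a_in_block(void)
{
    extern int a;      /* block-scope declaration, visible only inside this function */
    return a;
}

int read_local(void)
{
    int a = 1;         /* a local a hides the external a within this block */
    return a;
}

int a = 7;             /* the one definition of a: storage is allocated here */
```

The definition could just as well live in a different source file; the linker would then connect the declaration to it.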

Note: Each external variable can be defined only once. Suppose several definitions of an external variable each specify an initial value, for example:

int a = 7;

appears in one source file, while

int a = 9;

appears in another. Most systems will reject such a program. However, if an external variable is defined in multiple source files without an initial value, some systems will accept the program and others will not. So the only portable approach is to define each external variable exactly once.

4.3 Naming conflicts and the static modifier

4.3.1 Naming conflicts

If two different source files each contain the definition

int a;

then either the program is in error (if the linker rejects multiple definitions of an external name), or the two source files share a single instance of a (whether or not they were meant to share it). The same thing happens even if one of the definitions of a is in a system-provided library.

4.3.2 static modifier

static int a;

Declaring a as static limits its scope to the source file that contains it. a is invisible to other source files, which can no longer refer to it through extern. static applies to functions in the same way. Because a static name is not exported, other source files can define a variable or function with the same name without conflict.
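A small sketch of how static keeps names private to one source file. The names bump and next_id are hypothetical and used only for illustration; the variable a follows the example above.

```c
static int a = 0;              /* internal linkage: private to this source file */

static int bump(void)          /* static function: also invisible to other files */
{
    return ++a;
}

int next_id(void)              /* external linkage: callable from other source files */
{
    return bump();
}
```

Another source file in the same program could define its own a or bump without any conflict, because only next_id is exported as an external name.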

4.4 Formal parameters, actual parameters, return value

If a function is called before it has been declared, even if it is defined later, the compiler assumes that it returns int. If the function actually returns another type, the consequences can be extremely serious.

If a function is called before its definition, or is defined in another file, it must be declared before use. The purpose of the function declaration is to tell the compiler the type of the function's return value.
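A minimal sketch of declaring a function before use so that its return type is known at the call site. The functions half and quarter are hypothetical examples, not taken from the original text.

```c
double half(double x);         /* declaration: tells the compiler half returns double */

double quarter(double x)
{
    return half(half(x));      /* half is used here before its definition appears */
}

double half(double x)          /* definition, later in the file (or in another file) */
{
    return x / 2.0;
}
```

Without the first line, a compiler following the old rule would treat half as returning int at the call inside quarter, and the double result would be misinterpreted.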

Note: If a function has no parameters of type float, short, or char, the parameter types can be omitted entirely from the function declaration (they cannot be omitted from the function definition). This approach relies on the caller supplying the correct number of arguments of appropriate types. Here, "appropriate" does not mean "identical": arguments of type float are automatically converted to double, and arguments of type short or char are automatically converted to int.

Before the ANSI C standard was released, functions were often declared and defined as follows:
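The same promotions (float to double, char and short to int) are also applied to the trailing arguments of a variadic function, which gives a way to observe them in standard C without relying on old-style declarations. The sketch below is an illustration under that assumption; sum_values and first_code are hypothetical names.

```c
#include <stdarg.h>

/* Sums n trailing arguments. They arrive promoted: a float is passed as a
   double, so va_arg must ask for double, never float. */
double sum_values(int n, ...)
{
    va_list ap;
    double total = 0.0;
    int i;

    va_start(ap, n);
    for (i = 0; i < n; i++)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}

/* Returns the first trailing argument. A char argument is promoted to int
   on the way in, so va_arg must ask for int. */
int first_code(int n, ...)
{
    va_list ap;
    int code;

    va_start(ap, n);
    code = va_arg(ap, int);
    va_end(ap);
    return code;
}
```

A call such as sum_values(2, 1.5f, 2.5f) passes two float values, yet the function reads them as double, which is exactly the "appropriate but not identical" matching described above.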

int isvowel();	// how the function is declared
int isvowel(c)
	char c;
{
	return c == 'a';
}

In fact, the definition above is equivalent to the following:

int isvowel(int i)
{
	char c = i;
	return c == 'a';
}

Both of the above methods are supported in VS2019.
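For comparison, here is a sketch of the same function in the prototype style introduced by ANSI C. Testing all five lowercase vowels is an assumption made here for illustration; the listings above check only 'a'.

```c
/* Prototype-style definition: the parameter type is part of the declarator,
   so the compiler can check and convert the argument at every call. */
int isvowel(char c)
{
    return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
}
```

With a prototype in scope, the compiler converts the argument to char itself, so no hidden int parameter is involved.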

See the example below:

#include<stdio.h>
int main()
{
	int i;
	char c;
	for (i = 0; i < 5; i++)
	{
		scanf("%d", &c);
		printf("%d ", i);
	}
	printf("\n");
	return 0;
}

On the surface, this program reads five numbers from the standard input device and writes five numbers to the standard output device:

0 1 2 3 4

In fact, this program does not necessarily produce that result. For example, with one compiler its output is as follows (in VS2019 the program crashes instead, because memory is modified illegally):

0 0 0 0 0 1 2 3 4

Why? The crux of the problem is that c is declared as char, not int. When the format string asks scanf to read an integer, scanf must be passed a pointer to an integer, but here it receives a pointer to a character. scanf cannot detect the mismatch: it treats the pointer to a character as a pointer to an integer and stores an integer at the location it points to. Because an integer occupies more storage than a character, the memory next to c is overwritten.

What is stored in the memory next to c is up to the compiler; in this case it is the low-order part of the integer i. Therefore, every time a value is read into c, the low-order part of i is overwritten with 0. Since the high-order part of i is already 0, this resets i to 0 each time, and the loop keeps running. When the end of the file is reached, scanf stops trying to read new values into c. From then on i is incremented normally, and the loop eventually terminates.
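One way to avoid the overwrite is to read into an int, which is what %d expects, and only then convert the value to char. The sketch below uses sscanf on a string rather than scanf on standard input so that it can be tried without interactive input; parse_char_value is a hypothetical helper name.

```c
#include <stdio.h>

/* Parses one decimal integer from text and stores it in *out as a char.
   Returns 1 on success and 0 if no integer could be read. */
int parse_char_value(const char *text, char *out)
{
    int tmp;                            /* %d needs a pointer to int */

    if (sscanf(text, "%d", &tmp) != 1)
        return 0;
    *out = (char)tmp;                   /* narrow only after the read succeeded */
    return 1;
}
```

The same pattern applies to the loop above: pass the address of an int temporary to scanf and assign the result to c afterwards, so that nothing outside the temporary is written.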

4.5 Checking External Types

Note: It is the programmer's responsibility to ensure that all external declarations and definitions of a particular name have the same type in every object module, and "same type" means strictly the same.

For example, suppose one file contains the definition:

char filename[] = "/etc/passwd";

And suppose another file contains the declaration:

extern char *filename;

In the definition, filename is the name of a character array. Although using the value of filename in a statement yields a pointer to the first element of the array, filename has type "array of characters", not "pointer to character". In the second declaration, filename is declared as a pointer. The two use storage in different ways and cannot sensibly coexist. The memory layout of the character array filename in the first example is shown below:

[Figure: memory layout of the character array filename]

The memory layout of the character pointer filename in the second declaration is shown below:

[Figure: memory layout of the character pointer filename]

The fix is to make the definition and the declaration agree:

char filename[] = "/etc/passwd";	/* file 1 */
extern char filename[];	/* file 2 */

Alternatively, both can use a pointer:

char *filename = "/etc/passwd";	/* file 1 */
extern char *filename;	/* file 2 */
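The difference between the two types in this section can be observed with sizeof. In this sketch the two objects are given distinct, hypothetical names (filename_array and filename_pointer) so that both can live in one file, and the pointer is declared const char * as a cautious choice for pointing at a string literal.

```c
#include <stddef.h>

char filename_array[] = "/etc/passwd";          /* the array itself holds the characters */
const char *filename_pointer = "/etc/passwd";   /* the pointer holds only an address */

size_t array_size(void)
{
    return sizeof filename_array;       /* 11 characters plus the terminating '\0' */
}

size_t pointer_size(void)
{
    return sizeof filename_pointer;     /* size of a pointer, whatever the string length */
}
```

Code compiled against extern char *filename would read the first bytes of the array as if they were an address, which is why the two declarations cannot be mixed.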

4.6 Header files

Note: Declare each external object in only one place, usually a header file, and have every module that uses the external object include that header file. In particular, the module that defines the external object should include the header file as well.
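A sketch of why the defining module should include the header too. To keep it in one block, the header contents are written inline where the #include would be; the header name filename.h and the function filename_length are hypothetical.

```c
#include <string.h>

/* Contents of the shared header, e.g. filename.h (hypothetical name). */
extern char filename[];                /* the single shared declaration */

/* The defining module includes that header and then defines the object,
   so the compiler sees declaration and definition together. */
char filename[] = "/etc/passwd";

/* Any other module includes the header and simply uses the object. */
size_t filename_length(void)
{
    return strlen(filename);
}
```

If the definition were changed to char *filename while the header still said extern char filename[], the compiler would report conflicting types here, instead of the mismatch slipping through to link time unnoticed.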

Origin blog.csdn.net/m0_57304511/article/details/123596112