Advanced C | An in-depth look at how data is stored in memory

1. Data type introduction

char //Character data type
Value range: -128~127 (-2^7 to 2^7-1), when char is signed
short //Short integer
Value range: -32768~32767 (-2^15 to 2^15-1)
int //Integer
Value range: -2147483648~2147483647 (-2^31 to 2^31-1)
long //Long integer
Value range: -9223372036854775808~9223372036854775807 (-2^63 to 2^63-1) where long is 64 bits, as on 64-bit Linux; with the VS compilers long is 32 bits and has the same range as int
float //Single-precision floating-point
Magnitude range: 1.401298e-45~3.402823e+38 (e-45 means multiplied by 10 to the power of -45, e+38 means multiplied by 10 to the power of 38), that is, from 2^-149 to just under 2^128
double //Double-precision floating-point
Magnitude range: 4.9e-324~1.797693e+308, that is, from 2^-1074 to just under 2^1024
These are the ranges on typical platforms; the C standard itself only guarantees minimum ranges.

What a type means:
1. The type determines how much memory is allocated for a variable (and the size determines the range of values it can hold).
2. The type determines how the contents of that memory are interpreted.

1.1 Basic classification of types

Integer family:

char
unsigned char
signed char
short
unsigned short [int]
signed short [int]
int
unsigned int
signed int
long
unsigned long [int]
signed long [int]

The C standard does not specify whether plain char is signed or unsigned; the choice is left to the implementation. With compilers such as VS2019 and VS2022, char is signed.

Floating point family:

float
double

Constructed types (user-defined types):

Array types
Structure types: struct
Enumeration types: enum
Union types: union

Pointer type:

int *pi;
char *pc;
float *pf;
void *pv;

Void type:
void represents the empty type (no type).
It is usually used for a function's return type, a function's parameter list, and pointer types.

2. Storage of integers in memory

2.1 Sign-magnitude, ones' complement, and two's complement

Computers have three binary representations for signed integers: sign-magnitude (literally "original code"), ones' complement ("inverse code"), and two's complement ("complement code"). Each consists of a sign bit and value bits. The sign bit is 0 for a positive number and 1 for a negative number. For positive numbers, all three representations are identical.
For negative integers, the three representations differ.

Sign-magnitude (original code)
Write the value directly in binary, with the sign bit set according to whether the number is positive or negative.
Ones' complement (inverse code)
Keep the sign bit of the sign-magnitude form unchanged and invert every other bit.
Two's complement (complement code)
Add 1 to the ones' complement.

Integers are stored in memory in two's complement.
Computer systems represent and store integer values in two's complement because it lets the sign bit and the value bits be processed uniformly.
Addition and subtraction can also be handled uniformly (the CPU only has an adder). In addition, converting from two's complement back to sign-magnitude uses the same operation as converting the other way (invert, then add 1), so no additional hardware circuitry is required.
Looking at the data in memory, displayed in hexadecimal (the debugger screenshot is not reproduced here), we can see that a and b are both stored in two's complement. The byte order, however, looks reversed. We would normally write a in hexadecimal as 00 00 00 14 and b as ff ff ff f6, but the bytes appear in memory in the opposite order. Why? See the analysis below.

2.2 Introduction to big-endian and little-endian

Big-endian storage means that the low-order bytes of a value are stored at high memory addresses and the high-order bytes at low memory addresses. Little-endian storage means that the low-order bytes of a value are stored at low memory addresses and the high-order bytes at high memory addresses.

"Low-order" and "high-order" have the same meaning as in mathematics (ones, tens, and so on), except that in memory the unit is the byte. An int occupies 4 bytes, and each memory unit holds 1 byte. One hexadecimal digit corresponds to 4 binary digits, so the 14 above is two hexadecimal digits, that is, 8 binary digits, or 1 byte. Addresses increase from low to high, so with little-endian storage the byte 14 is placed at the lowest address.
Why big-endian and little-endian exist:

Computer memory is addressed in bytes: each address corresponds to one byte, and a byte is 8 bits. But besides the 8-bit char, C also has the 16-bit short and the 32-bit long (depending on the compiler). In addition, on processors wider than 8 bits, such as 16-bit or 32-bit processors, a register is wider than one byte, so there has to be a rule for arranging the bytes of a multi-byte value. This is what leads to the big-endian and little-endian storage modes.
Little-endian: the commonly used x86 architecture, as well as many ARM and DSP processors.
Big-endian: KEIL C51, among others.

A written-test question for system engineers at Baidu (2015):
Write a program that determines whether the machine is big-endian or little-endian.
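#include <stdio.h>

// Returns 1 on a little-endian machine and 0 on a big-endian one.
int check_sys(void)
{
	int i = 1;
	// Take the address of i and cast it to char*. Dereferencing then reads only
	// the byte at the lowest address, which is 01 (little-endian) or 00 (big-endian).
	return *(char*)&i;
}

int main(void)
{
	int ret = check_sys();
	if (ret == 1)
	{
		printf("little-endian\n");
	}
	else
	{
		printf("big-endian\n");
	}
	return 0;
}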

If the low-order byte is stored at the high address, the machine is big-endian; otherwise it is little-endian.
The program reports little-endian, which shows that this x64 environment uses little-endian storage.

2.3 Exercises

#include <stdio.h>
int main()
{
	char a = -1;
	signed char b = -1;
	unsigned char c = -1;
	printf("a=%d,b=%d,c=%d", a, b, c);
	return 0;
}

-1 is an int. Storing it in a char variable truncates it to one byte, 11111111. When a char is printed with %d, integer promotion occurs. For signed char (and for plain char where it is signed, as with the VS compilers), promotion fills the high-order bits with the sign bit, so the value is still -1. For unsigned char, promotion fills the high-order bits with 0, so the result is the positive integer 255. The output is therefore a=-1,b=-1,c=255.
After an unsigned promotion the sign bit is 0, so the sign-magnitude, ones' complement, and two's complement forms are identical, just as for any positive integer.

#include <stdio.h>
#include <windows.h>
int main()
{
	unsigned int i;
	for (i = 9; i >= 0; i--) // i is unsigned, so i >= 0 is always true
	{
		Sleep(1000); // pause 1000 ms before each print
		printf("%u\n", i);
	}
	return 0;
}

The code first prints 9 down to 0 as expected. But when i is 0 and is decremented again, the result -1 cannot be represented in an unsigned int: the highest bit of the two's complement of -1 is treated as a value bit rather than a sign bit, so i becomes a very large number (4294967295 for a 32-bit unsigned int). It then counts down to 0 again and wraps around once more, so the loop never ends.

#include <stdio.h>
#include <string.h>
int main()
{
	char a[1000];
	int i;
	for (i = 0; i < 1000; i++)
	{
		a[i] = -1 - i;
	}
	printf("%zu", strlen(a)); // strlen returns size_t, printed with %zu
	return 0;
}

The array holds char values, which here are signed, with range -128~127. When i is 127, a[127] == -128. When i is 128, -129 does not fit and wraps around, so a[128] == 127. The values then keep decreasing until a[255] == 0.
A signed char holds 8 bits. Stepping from -1 to 0 produces a carry out of the eighth bit, which is discarded, so the value wraps back to 0 and the cycle continues. The range -128~127 can therefore be drawn as a circle. In the code above, strlen stops at the first 0 it meets, which is a[255], so it counts the 255 elements before it (a[0] through a[254]): the printed length is 128+127=255.

#include <stdio.h>
unsigned char i = 0;
int main()
{
	for (i = 0; i <= 255; i++)
	{
		printf("hello world\n");
	}
	return 0;
}

The variable i is an unsigned char, with range 0~255. Unlike signed char, every bit of an unsigned char is a value bit. As with signed char, the range can be drawn as a circle: adding 1 to 255 overflows and wraps back to 0. The condition i <= 255 is therefore always true, so the program prints hello world for i from 0 to 255, then starts again from 0, and the loop never ends.

3. Storage of floating-point types in memory

Common floating-point literals:
3.14159
1E10 (that is, 1.0*10^10)
The floating-point family includes the types float, double, and long double.
The ranges of the floating-point types are defined in the header file float.h.

3.1 Floating-point number storage rules

According to the international standard IEEE (Institute of Electrical and Electronics Engineers) 754, any binary floating-point number V can be expressed in the following form:

  • (-1)^S * M * 2^E
  • (-1)^S represents the sign: when S=0, V is positive; when S=1, V is negative.
  • M represents the significand, which is greater than or equal to 1 and less than 2.
  • 2^E represents the exponent.

Example:

5.0 in decimal is 101.0 in binary, which is 1.01*2^2 in binary scientific notation.
Then, according to the format of V above, it can be concluded that S=0, M=1.01, and E=2.

IEEE 754 stipulates:
For 32-bit floating-point numbers, the highest 1 bit is the sign bit S, the next 8 bits are the exponent E, and the remaining 23 bits are valid figures M. For 64-bit floating-point numbers, the highest 1 bit
insert image description here
is The sign bit S, the next 11 bits are the exponent E, and the remaining 52 bits are the significant figure M.
insert image description here
IEEE 754 has a special provision for the effective number M and the exponent E.

As mentioned earlier, 1<=M<2, that is, M can be written in the form 1.xxxxxx, where xxxxxx is the fractional part.

IEEE 754 stipulates that when M is stored, the leading digit, which is always 1, is left out and only the xxxxxx part is saved. For example, when saving 1.01, only 01 is stored, and the leading 1 is added back when the value is read. This saves one bit. Taking a 32-bit floating-point number as an example, only 23 bits are available for M, but because the leading 1 is not stored, 24 significant bits can be represented.

The exponent E is more complicated.
First, E is stored as an unsigned integer (unsigned int), which means that if E is 8 bits, its stored value ranges from 0 to 255, and if E is 11 bits, it ranges from 0 to 2047. However, the exponent in scientific notation can be negative, so IEEE 754 stipulates that a bias (a middle value) is added to the real value of E before it is stored. For an 8-bit E the bias is 127; for an 11-bit E it is 1023. For example, the E of 2^10 is 10, so when it is saved in a 32-bit floating-point number, it is stored as 10+127=137, that is, 10001001.

When the exponent E is read back from memory, there are three cases:

E is neither all 0s nor all 1s.
The floating-point number is then decoded by the following rule: subtract 127 (or 1023) from the stored value of E to get the real exponent, and put the leading 1 back in front of the significand M.
For example:
The binary form of 0.5 (1/2) is 0.1. Since the integer part must be 1, the binary point is moved one place to the right, giving 1.0*2^(-1). The stored exponent is -1+127=126, which is 01111110. Removing the integer part of the significand 1.0 leaves 0, which is padded to 23 bits: 00000000000000000000000. The binary representation is therefore: 0 01111110 00000000000000000000000

E is all 0s.
The real exponent is then 1-127 (or 1-1023), and the leading 1 is no longer added to the significand M, which is read as a fraction of the form 0.xxxxxx. This makes it possible to represent ±0 and very small numbers close to 0.

E is all 1s.
If the significand M is all 0s, the value is ± infinity (the sign is given by the sign bit S). If M is not all 0s, the value is NaN (not a number).

3.2 Examples

#include <stdio.h>
int main()
{
	int n = 9;
	float* pFloat = (float*)&n;
	printf("value of n: %d\n", n);
	printf("value of *pFloat: %f\n", *pFloat);
	//00000000000000000000000000001001  --9
	//S=0
	//E is all 0, so the real exponent is 1-127=-126
	//M=0.00000000000000000001001
	//value = M * 2^(-126), far too small for %f to show

	*pFloat = 9.0;
	// 1001.0==(-1)^0*1.001*2^3
	//S=0
	//M=1.001
	//E=3
	// 3+127==130  10000010
	//0 10000010 00100000000000000000000
	printf("value of n: %d\n", n);
	printf("value of *pFloat: %f\n", *pFloat);
	return 0;
}

In the code above, the bit pattern of the integer 9 is 00000000000000000000000000001001. Interpreted as a floating-point number, it is a very small number, and since %f prints only 6 decimal places, the output is 0.000000. After 9.0 is stored through pFloat, memory holds the floating-point bit pattern 0 10000010 00100000000000000000000. Printed with %d, that pattern is read as an integer and comes out as a very large number, 1091567616.
end~


Origin blog.csdn.net/weixin_68201503/article/details/130989437