Advanced C language: in-depth analysis of data storage in memory

Contents

1. Introduction to data types

1.1 Basic classification of types

2. Integer storage in memory

2.1 Original code, inverse code, complement code

2.2 Introduction to Big and Little Endian

2.3 Exercises

3. Floating point storage in memory

3.1 An example of floating point storage: 

3.2 Floating point storage rules


1. Introduction to data types

We have already met the basic built-in types and how much storage each one typically takes up:

char //Character data type, occupying 1 byte

short //Short integer, occupying 2 bytes

int //Integer, typically occupying 4 bytes

long //Long integer, occupying 4 bytes with MSVC and on 32-bit platforms, 8 bytes on most 64-bit Linux and macOS systems

long long //Longer integer, occupying 8 bytes

float //Single-precision floating-point number, occupying 4 bytes

double //Double-precision floating-point number, occupying 8 bytes

What a type means:

1. The type determines how much memory is allocated for a value (and the size in turn determines the range of values it can hold).

2. The type determines how the bytes in that memory are interpreted. (For example, int and float are typically both 4 bytes, but the same 4 bytes are read as an integer in one case and as a floating-point number in the other.)

1.1 Basic classification of types

Integer family:

char

        unsigned char

        signed char

short

        unsigned short [int], [int] can be omitted

        signed short [int], [int] can be omitted

int

        unsigned int

        signed int

long

        unsigned long [int], [int] can be omitted

        signed long [int], [int] can be omitted

int main()
{
	char c1 = 'w'; //Whether plain char is signed char or unsigned char is not fixed; it depends on the compiler implementation
	signed char c2 = 't';

	short int a = 10; //short integer; the int can be omitted
	short b = 20; //short is the same as signed short

	signed short c = 30;
	unsigned short d = 40;

	return 0;
}

Floating point family:

float

double

long double

Constructed type (custom type): 

> Array type

> Structure type struct

> enumeration type enum

> union type union

Pointer type:

int *pi;

char *pc;

float* pf;

void* pv; 

Empty type:

void denotes the empty type (no type).

It is usually used as a function's return type (the function returns nothing), as a function's parameter list (the function takes no parameters), and in pointer types (void*, a pointer with no associated type).

2. Integer storage in memory

Creating a variable allocates space in memory, and the size of that space is determined by the variable's type.

For example:

int a = 3;
int b = -1;
int c = 0x11223344;

How a, b, and c look in memory:

Data is stored in memory in binary.

When Visual Studio (VS) displays memory, it shows the data in hexadecimal for convenience. One hexadecimal digit represents 4 binary digits, so two hexadecimal digits are 8 binary digits, that is, one byte (for a, 03 00 00 00 is four bytes).

2.1 Original code, inverse code, complement code

Integers can be represented in a computer in three ways: original code (sign-magnitude), inverse code (ones' complement), and complement code (two's complement).

All three representations have two parts: the sign bit and the value bits. The sign bit is 0 for "positive" and 1 for "negative". The three representations differ only in how the value bits of negative integers are written.

Original code

Write the number directly in binary according to its sign and magnitude.

Inverse code (ones' complement)

Keep the sign bit of the original code unchanged and invert every other bit.

Complement code (two's complement)

Add 1 to the inverse code to get the complement code. (To get back to the original code from the complement code, the same steps work: keep the sign bit, invert the value bits, then add 1.)

Integers are divided into signed and unsigned numbers.

Signed number: sign bit + value bits

Positive number: 0 + value bits

Negative number: 1 + value bits

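//Original code - for a signed number, the binary sequence written directly from its sign and magnitude
//Inverse code - keep the sign bit of the original code and invert all the other bits
//Complement code - add 1 to the lowest bit of the inverse code

//For a positive number, the original, inverse and complement codes are the same
int main()
{
	int a = 3;    //signed int a = 3;
	//00000000000000000000000000000011         - original code
	//00000000000000000000000000000011         - inverse code
	//0000 0000 0000 0000 0000 0000 0000 0011  - complement code
	//0    0    0    0    0    0    0    3     - hexadecimal (in memory)
	
	int b = -1;    //signed int b = -1;
	//10000000000000000000000000000001         - original code
	//11111111111111111111111111111110         - inverse code
	//1111 1111 1111 1111 1111 1111 1111 1111  - complement code
	//f    f    f    f    f    f    f    f     - hexadecimal (in memory)
   
	return 0;
}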

The hexadecimal digits are 0 1 2 3 4 5 6 7 8 9 a b c d e f. Binary 1111 is f, so -1 is stored in memory as ff ff ff ff.

In conclusion:

1. The original, inverse and complement codes of a positive number are the same.

2. For integers, what is actually stored in memory is the complement code.

Why?

In computer systems, integer values are represented and stored in two's complement (the complement code). The reason is that with the complement code the sign bit and the value bits can be processed uniformly, and addition and subtraction can also be processed uniformly (the CPU only has an adder). In addition, converting between the complement code and the original code uses the same operation in both directions, so no additional hardware circuitry is required.

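For example:

int c = 1 - 1;
//The CPU only has an adder
//1 - 1 => 1 + (-1); computed with original codes, the result is wrong (-2)
//Computed with complement codes, the result is correct (the sign bit takes part in the addition, and the carry out of the top bit is discarded)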

For unsigned numbers: there is no sign bit, so all bits are value bits, just as for positive integers.

For char (8 bits):

For int c = 0x11223344;

the bytes appear "stored in reverse" in memory. Why?

2.2 Introduction to Big and Little Endian

What are big-endian and little-endian?

Big-endian (storage) mode means that the low-order bytes of the data are stored at the high addresses of memory, and the high-order bytes are stored at the low addresses;

Little-endian (storage) mode means that the low-order bytes of the data are stored at the low addresses of memory, and the high-order bytes are stored at the high addresses.

For int c = 0x11223344; in memory:

Conclusion: the current machine uses little-endian byte order (the bytes appear in memory as 44 33 22 11).

Classic interview questions:

Please briefly describe the concepts of big-endian byte order and little-endian byte order, and design a small program to determine the current machine's byte order. (10 points)

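//Write a program that determines whether the current machine is big-endian or little-endian

//Method 1
#include <stdio.h>

int check_sys()
{
	int a = 1;
	return (*(char*)&a); //read only the byte at the lowest address
}
int main()
{
	int ret = check_sys();
	if (ret == 1)
	{
		printf("little-endian\n");
	}
	else
	{
		printf("big-endian\n");
	}
	return 0;
}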

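//Method 2
int main()
{
	int a = 1;
	char* p = (char*)&a;//dereferencing p accesses a single byte
	if (*p == 1)
	{
		printf("little-endian");
	}
	else
	{
		printf("big-endian");
	}
	//
	//0x 00 00 00 01
	//
	//low address -> high address
	//little-endian
	//01 00 00 00
	//big-endian
	//00 00 00 01

	//Checking whether the first byte is 00 or 01 is enough
	return 0;
}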

2.3 Practice questions

What is the output of the following code?

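//1.
//What is the output?
#include <stdio.h>
int main()
{
	char a = -1;
	//On most compilers plain char is signed char
	//11111111 - a
	//11111111111111111111111111111111 - after integer promotion (still complement code)
	//10000000000000000000000000000001 - original code: -1
	signed char b = -1;
	//11111111 - b 
	//Same calculation as for a: -1
	unsigned char c = -1;
	//11111111 - c
	//00000000000000000000000011111111 - after integer promotion (zero-extended); for a positive number all three codes are the same
	//decimal: 255

	printf("a=%d,b=%d,c=%d", a, b, c);//a = -1, b = -1, c = 255
	//Passing a char to printf triggers integer promotion (signed types are extended with the sign bit, unsigned types with 0)
	return 0;
}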
//2.
#include <stdio.h>
int main()
{
	char a = -128;
	//10000000000000000000000010000000 - original code of -128
	//11111111111111111111111110000000 - complement code of -128
	//10000000                         - stored in a
	//After integer promotion:
	//11111111111111111111111110000000 - 4294967168 (2^32 - 128) when read as unsigned
	
	//%u : prints an unsigned integer
	printf("a = %u\n", a); //a = 4294967168
	return 0;
}
//3.
#include <stdio.h>
int main()
{
	char a = 128;
	//00000000000000000000000010000000 - original code of 128
	//10000000                         - stored in a (read as a signed char, this is -128)
	//After integer promotion:
	//11111111111111111111111110000000 - 4294967168
	printf("a = %u\n", a); // a = 4294967168
	return 0;
}
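//4.
#include <stdio.h>
int main()
{
	int i = -20;
	//10000000000000000000000000010100 - original code of -20
	//11111111111111111111111111101100 - complement code of -20
	unsigned  int  j = 10;
	//00000000000000000000000000001010 - original (and complement) code of 10
	
	//11111111111111111111111111110110 - complement code of i + j
	//10000000000000000000000000001010 - original code of i + j: -10
	printf("i+j = %d\n", i + j); // i+j = -10
	//The addition is done on complement codes (i + j itself has type unsigned int); %d then formats the bits as a signed integer

    return 0;
}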
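//5.
#include <stdio.h>
int main()
{
	unsigned int i;//i is unsigned, so i >= 0 is always true

	for(i = 9; i >= 0; i--)
	{
		printf("%u\n", i); //infinite loop
		//When i is 0 and i-- runs, the bits of -1 are stored (and so on from there):
		//11111111111111111111111111111111 - treated as unsigned, this is a huge positive number
	}
	return 0;
}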
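//6.
#include <stdio.h>
#include <string.h>
int main()
{
	char a[1000];
	int i;
	for (i = 0; i < 1000; i++)
	{
		a[i] = -1 - i;
	}
	//a[i] runs through -1 -2 ... -128: 128 values in total
	//For a (signed) char, the most negative value is -128
	//-129 does not fit in a char
	//10000000000000000000000010000001 - original code of -129
	//11111111111111111111111101111111 - complement code of -129
	//01111111                         - what is stored when -129 is put into a char: 127
	//Likewise, storing -130 actually stores 126
	//127 126 ... 3 2 1 0: there are 127 values before the 0 (strlen stops at the first 0)
	//127 + 128 = 255
	printf("%d", (int)strlen(a));//255 (strlen returns size_t, so it is cast for %d)
	return 0;
}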
//7.
#include <stdio.h>
unsigned char i = 0;
//The range of unsigned char is [0 , 255]
//In the code below, i <= 255 is always true, so the loop never ends
int main()
{
	for (i = 0; i <= 255; i++)
	{
		printf("hello world\n");//infinite loop
	}
	return 0;
}

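Note: signed char covers [-128, 127] and unsigned char covers [0, 255]. Picture each range as a circle: going one past the largest value wraps around to the smallest (for signed char on typical two's complement implementations, 127 + 1 is stored as -128; for unsigned char, 255 + 1 becomes 0).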

3. Floating point storage in memory

Common floating point numbers:

3.14159

1E10 (scientific notation, 1.0 * 10^10)

The floating-point family includes the float, double, and long double types.

The ranges of the floating-point types are defined in float.h (for the integer family, the ranges are defined in limits.h).

3.1 An example of floating point storage: 

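#include <stdio.h>
int main()
{
	int n = 9;
	float* pFloat = (float*)&n; //cast the int* pointer to float* (this breaks C's aliasing rules; it is done here only to inspect the bytes)
	printf("Value of n: %d\n", n); //9
	printf("Value of *pFloat: %f\n", *pFloat);//0.000000
	*pFloat = 9.0;
	printf("Value of n: %d\n", n); //1091567616
	printf("Value of *pFloat: %f\n", *pFloat); //9.000000
	
	return 0;
}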

Explanation: n and *pFloat are the same bytes in memory, yet they read back as very different values. Floating-point numbers must therefore be stored differently from integers. So how are floating-point numbers stored?

3.2 Floating point storage rules

According to the international standard IEEE (Institute of Electrical and Electronics Engineers) 754, any binary floating-point number V can be represented in the following form:

(-1)^S * M * 2^E

(-1)^S represents the sign. When S=0, V is positive; when S=1, V is negative.

M is the significand, greater than or equal to 1 and less than 2.

2^E is the exponent part.

For 5.5: in binary it is 101.1, which is 1.011 * 2^2, so S=0, M=1.011, E=2.

IEEE 754 stipulates: for a 32-bit floating-point number, the highest bit is the sign bit S, the next 8 bits are the exponent E, and the remaining 23 bits are the significand M:

For example:

int main()
{
	float f = 5.5f;//without the f suffix, the constant would be of type double
	//101.1
	//1.011 * 2^2
	//(-1)^0 * 1.011 * 2^2
	//S = 0, E = 10000001 (2 + 127), M = 011 (followed by 20 zeros)
	//0100 0000 1011 0000 0000 0000 0000 0000 - binary stored in memory
	//4    0    B    0    0    0    0    0    
	//40 B0 00 00 - hexadecimal

	return 0;
}

f = 5.5f in memory:

For a 64-bit floating-point number, the highest bit is the sign bit S, the next 11 bits are the exponent E, and the remaining 52 bits are the significand M.

 

IEEE 754 also has some special provisions for the significand M and the exponent E:

As mentioned earlier, 1≤M<2, that is, M can be written in the form 1.xxxxxx, where xxxxxx is the fractional part. IEEE 754 stipulates that when M is stored, its first digit is always assumed to be 1, so that digit can be dropped and only the xxxxxx part is saved. For example, when saving 1.01, only 01 is stored, and the leading 1 is added back when the value is read. The purpose is to gain 1 bit of precision. Taking a 32-bit floating-point number as an example, only 23 bits are available for M, but because the leading 1 is not stored, 24 significant bits can be represented.

The exponent E is more complicated.

First, E is stored as an unsigned integer.

This means that if E is 8 bits, its range is 0~255; if E is 11 bits, its range is 0~2047. However, the exponent in scientific notation can be negative, so IEEE 754 stipulates that a middle value (the bias) is added to the real value of E before it is stored. For an 8-bit E, the bias is 127; for an 11-bit E, it is 1023. For example, for 2^10 the real E is 10, so when it is stored as a 32-bit floating-point number, E must be stored as 10+127=137, which is 10001001.

Second, when the exponent E is read back from memory, there are three cases:

E is neither all 0s nor all 1s

In this case, the floating-point number is decoded by the following rule: subtract 127 (or 1023) from the stored value of E to get the real exponent, then put the leading 1 back in front of the significand M.

For example: the binary form of 0.5 (1/2) is 0.1. Since the integer part must be 1, the binary point is shifted one place to the right, giving 1.0*2^(-1). The stored exponent is -1+127=126, represented as 01111110. The significand 1.0 with its integer part removed is 0, padded with zeros to 23 bits: 00000000000000000000000. The binary representation is therefore:

0 01111110 00000000000000000000000

E is all 0s

In this case, the real exponent of the floating-point number is 1-127 (or 1-1023).

The leading 1 is no longer added to the significand M; instead, M is read as a fraction of the form 0.xxxxxx. This makes it possible to represent ±0 and very small numbers close to 0.

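For example:

	//0 00000000 01000100101000000000000
	//The stored exponent (real E + 127) is 00000000
	//If the normal rule applied, the real E would be -127:
	//(-1)^0 * 1.01000100101 * 2^(-127) - a number extremely close to 0
	
	//Instead, for these numbers close to 0:
	//M is read without adding the leading 1, and E = -126 (32-bit)
	//(-1)^0 * 0.01000100101 * 2^(-126) - the value actually read back, also extremely close to 0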

E is all 1s

In this case, if the bits of the significand M are all 0, the value is ± infinity (the sign is given by the sign bit S); if they are not all 0, the value is NaN (not a number).

	//E is all 1s
	//real E + 127 = 255
	//real E = 128
	//(+ -) 1.xxxxxx * 2 ^ 128 - tends to positive or negative infinity

That is all we will say about the representation rules for floating-point numbers.

With this foundation, let's return to the question from the beginning: why does 0x00000009 become 0.000000 when it is read back as a floating-point number?

int main()
{
	int n = 9;
	//00000000000000000000000000001001 - binary

	float* pFloat = (float*)&n; //cast the int* pointer to float*
	printf("Value of n: %d\n", n); //9
	
	printf("Value of *pFloat: %f\n", *pFloat);//0.000000
	//*pFloat - accesses the four bytes of n as a floating-point number, so they are treated as if they held a float
	//0 00000000 00000000000000000001001 (the case where E is all 0s)
	//(-1)^0 * 2 ^ (-126) * 0.00000000000000000001001
	//0.000000 (far too small to show with 6 decimal places)
	
	*pFloat = 9.0;
	//*pFloat - views the 4 bytes of n as a floating-point number
	//so 9.0 is stored in floating-point form
	//1001.0 - binary
	//1.001 * 2^3 - scientific notation
	//(-1)^0 * 1.001 * 2^3
	//S = 0, E = 130 (3 + 127), M = 00100000000000000000000 
	//0 10000010 00100000000000000000000 - form stored in memory
	//1091567616 - the same bits read as an integer
	printf("Value of n: %d\n", n); //1091567616
	printf("Value of *pFloat: %f\n", *pFloat); //9.000000

	return 0;
}

Origin blog.csdn.net/m0_62934529/article/details/123702995