[C Advanced] In-depth analysis of data storage in memory

Table of contents

1. Introduction of data types

1. The meaning of the type:

2. Basic classification of types

2. Storage of integers in memory

1. Sign-magnitude, ones' complement, and two's complement

2. Introduction to big-endian and little-endian

3. Practice

3. Storage of floating-point types in memory

1. An example

2. Floating-point number storage rules

1. Introduction of data types

Earlier we learned the basic built-in types and how much storage space each one occupies:

char //Character data type

short //short integer

int //integer

long //long integer

long long //longer integer

float //single-precision floating-point type

double // double precision floating point type

1. The meaning of the type:

1. The type determines how much memory is allocated for a variable (and the size determines the range of values it can hold)

2. The type determines how that memory is interpreted:

On common platforms int and float both occupy 4 bytes, but one is read as an integer and the other as a floating-point number, so the same bytes are interpreted differently

2. Basic classification of types

(1) Integer family:

char：

unsigned char

signed char

short :

unsigned short [int] //short integer; the int keyword can be omitted

signed short [int]

int :

unsigned int

signed int

long :

unsigned long [int]

signed long [int]

[Reminder]: Why char also belongs to the integer family:

A character is stored in memory as its ASCII code value (0-127), and an ASCII code value is an integer, so the character type is classified into the integer family

signed: the highest bit is the sign bit

unsigned: every bit is a value bit

[Note]:

When we write neither signed nor unsigned, the int, short and long types are signed by default

eg: When we write int a, the type is actually signed int

However, the C standard does not specify whether plain char is signed or unsigned (this depends on the compiler and target; many common compilers, such as those targeting x86, treat it as signed char)

(2) Floating-point family: all of them can represent fractional values

float //single precision, lower precision

double //double precision, higher precision

(3) Constructed types (user-defined types)

> array type

> structure type struct

> enumeration type enum

> union type union

(4) pointer type

int *pi

char *pc

float *pf

void * pv (pointer with no concrete type)

(5) Void type

void means the empty type (no type)

It is usually used for a function's return type, a function's parameter list, and pointer types

eg: int main (void) means that the main function takes no parameters

In fact main can also take parameters: the standard form is int main(int argc, char *argv[]), and many compilers additionally accept a third parameter char *envp[]. These parameters only need to be written when they are used; otherwise you can simply write void inside the parentheses

2. Storage of integers in memory

Computers process binary data, and both integers and floating-point numbers are stored in memory in binary form.

1. Sign-magnitude, ones' complement, and two's complement

An integer has three binary representations: sign-magnitude (original code), ones' complement (inverse code), and two's complement (complement code)

Positive integers: all three representations are identical

Negative integers: the three differ and must be calculated. The ones' complement is the sign-magnitude form with every bit except the sign bit inverted, and the two's complement is the ones' complement plus 1

Integers are stored in memory as two's complement binary sequences

eg：

int a = -10; //int occupies 4 bytes = 32 bits
   10000000 00000000 00000000 00001010 sign-magnitude
   11111111 11111111 11111111 11110101 ones' complement
   11111111 11111111 11111111 11110110 two's complement (the highest bit is the sign bit, and the other 31 bits are value bits)

   unsigned int b = -10;
   11111111 11111111 11111111 11110110 two's complement (all 32 bits are value bits)

For integers, what is actually stored in memory is the two's complement

Why?

With two's complement, the sign bit and the value bits can be processed uniformly, and addition and subtraction can also be processed uniformly (the CPU only needs an adder). In addition, converting between two's complement and sign-magnitude uses the same operation in both directions (invert the value bits and add 1), so no extra hardware circuit is needed

eg:

1-1
the computer converts this to 1+(-1)
   00000000 00000000 00000000 00000001 sign-magnitude, ones' complement, and two's complement of 1
   10000000 00000000 00000000 00000001 sign-magnitude of -1
   11111111 11111111 11111111 11111110 ones' complement of -1
   11111111 11111111 11111111 11111111 two's complement of -1
   Adding the sign-magnitude forms directly gives -2 (and it is unclear whether the sign bit should take part in the addition at all)
   Adding the two's complement forms gives the correct result: every bit produces a carry into the next one, the final carry out of the highest bit is discarded, and the remaining 32 bits are all 0

2. Introduction to big-endian and little-endian

int a=0x11223344 (44 is the low-order byte of the value, and 11 is the high-order byte)

Big-endian storage:

The low-order byte of a value is stored at the high memory address, and the high-order byte is stored at the low memory address

Little-endian storage:

The low-order byte of a value is stored at the low memory address, and the high-order byte is stored at the high memory address

[Note]: What is being discussed is the order in which the bytes of a value are stored, which is why the two modes are called big-endian and little-endian byte order.

The char type does not need to consider endianness: a char occupies one byte, so there is no order to speak of

Why are there big-endian and little-endian storage modes?

This is because computer memory is addressed in bytes: each address corresponds to one byte, and a byte is 8 bits. But C has types wider than the 8-bit char, such as the 16-bit short and the 32-bit long (the exact widths depend on the compiler). In addition, on processors wider than 8 bits, such as 16-bit or 32-bit processors, a register is wider than one byte, so there has to be some rule for arranging the multiple bytes of a value in memory. This is what leads to the big-endian and little-endian storage modes

A Baidu written-test question:

Please briefly describe the concepts of big-endian and little-endian, and design a program to determine the byte order of the current machine

Idea:

Take an int variable a and set it to 1 (in hexadecimal, 0x 00 00 00 01). Then read its first byte through a char* and check whether it is 00 or 01: 01 means the low-order byte is stored at the lowest address, so the machine is little-endian, and 00 means it is big-endian

Code:

#include <stdio.h>
int main(void)
{
	int a = 1;
	char* p = (char*)&a;   // cast &a (an int *) to char * to read the byte at the lowest address
	if (*p == 1)
		printf("little-endian\n");
	else
		printf("big-endian\n");
	return 0;
}

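[Using a helper function]:

#include <stdio.h>
// returns 1 on a little-endian machine and 0 on a big-endian machine
int check_sys(void)
{
	int a = 1;
	return *(char*)&a;   // value of the byte at the lowest address of a
}
int main(void)
{
	if(check_sys()==1)
		printf("little-endian\n");
	else
		printf("big-endian\n");
	return 0;
}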

3. Practice

<1> What is the output of the following program?

#include <stdio.h>
int main()
{
char a= -1;
signed char b=-1;
unsigned char c=-1;

printf("a=%d,b=%d,c=%d",a,b,c);

return 0;
}

Answer:

a=-1,b=-1,c=255

Explanation:

First, -1 is an int. Sign-magnitude: 10000000 00000000 00000000 00000001

Ones' complement: 11111111 11111111 11111111 11111110

Two's complement: 11111111 11111111 11111111 11111111

But a char has only 8 bits, so the value is truncated and 11111111 is stored. For a and b the first bit is the sign bit (assuming plain char is signed); for c all 8 bits are value bits

%d prints a signed integer in decimal form

So integer promotion happens first (an unsigned value has its high bits filled with 0, and a signed value has its high bits filled with the sign bit). Promotion works on the two's complement that is stored in memory

For a and b: the two's complement after integer promotion is 11111111 11111111 11111111 11111111 (that is, -1)

For c: the two's complement after integer promotion is 00000000 00000000 00000000 11111111 (the sign bit is 0, so the number is positive and its sign-magnitude form is the same) (that is, 255)

<2> What is the output of the following program?

#include <stdio.h>
int main()
{
	char a = -128;
	printf("%u\n", a);
	return 0;
}

Answer:

4294967168

explain:

Sign-magnitude of -128: 10000000 00000000 00000000 10000000

Ones' complement: 11111111 11111111 11111111 01111111

Two's complement: 11111111 11111111 11111111 10000000

Stored in a (truncated to 8 bits): 10000000 (the leading 1 is the sign bit)

Integer promotion of a: 11111111 11111111 11111111 10000000 (a is signed, so the high bits are filled with the sign bit 1)

%u prints an unsigned integer in decimal form

So the promoted bits are read as an unsigned number. For an unsigned number every bit is a value bit, so the pattern is converted directly: 2^32 - 128 = 4294967168

<3> What is the output of the following program?

#include <stdio.h>
int main()
{
	char a = 128;
	printf("%u\n", a);
	return 0;
}

Answer:

4294967168

explain:

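Although a signed char can only hold values up to 127, 128 can still be assigned to it; the value is simply truncated to 8 bits (this assumes plain char is signed)

Sign-magnitude, ones' complement, and two's complement of 128: 00000000 00000000 00000000 10000000

Stored in a (truncated to 8 bits): 10000000 (the leading 1 is now the sign bit, so a holds -128)

Integer promotion of a: 11111111 11111111 11111111 10000000 (a is signed, so the high bits are filled with the sign bit 1)

This is printed as an unsigned decimal number, which gives the same result as in the previous question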

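[Summary]:

signed char: -128 to 127

Assuming char is a signed char (1 byte = 8 bits, and the first bit is the sign bit). The first column is the two's complement stored in memory and the second column is its value. For negative values, the third column inverts the value bits and the fourth column adds 1, which recovers the sign-magnitude form

00000000 0

00000001 1

00000010 2

00000011 3

... ...

01111111 127

10000000 -128 (this pattern has no 8-bit sign-magnitude form, because the magnitude 128 does not fit in 7 value bits; it is simply defined as -128)

10000001 -127 11111110 11111111

...

11111110 -2 10000001 10000010

11111111 -1 10000000 10000001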

Assuming unsigned char: 0 to 255

00000000 0

00000001 1

00000010 2

00000011 3

...

01111111 127

10000000 128

...

11111110 254

11111111 255

<4> What is the output of the following program?

int i= -20;
unsigned int j = 10;
printf("%d\n", i+j);

Answer:

-10

explain:

-20: Sign-magnitude: 10000000 00000000 00000000 00010100

Ones' complement: 11111111 11111111 11111111 11101011

Two's complement: 11111111 11111111 11111111 11101100

10: Sign-magnitude, ones' complement, and two's complement: 00000000 00000000 00000000 00001010

Sum of the two's complement forms: 11111111 11111111 11111111 11110110. The sum i+j has type unsigned int (its value is 4294967286), but %d reads these bits as a signed two's complement number, so the highest bit becomes the sign bit

Value bits inverted (sign bit kept): 10000000 00000000 00000000 00001001

Plus 1, giving the sign-magnitude form: 10000000 00000000 00000000 00001010 (-10)

<5> What is the output of the following program?

unsigned int i;
for(i = 9; i >= 0; i--)
{
printf("%u\n",i);
}

Answer:

Prints 9 down to 0, then 4294967295, and keeps decreasing from there: an infinite loop

Explanation:

An unsigned int is always >= 0, so the loop condition is always true. Just as an unsigned char wraps from 0 to 255 when decremented, an unsigned int wraps from 0 to 4294967295

<6> What is the output of the following program?

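#include <stdio.h>
#include <string.h>
int main(void)
{
	char a[1000];
	int i;
	for(i=0; i<1000; i++)
	{
		a[i] = -1-i;   // the value wraps around within the range of char
	}
	printf("%zu",strlen(a));   // strlen returns a size_t, so print it with %zu
	return 0;
}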

Answer:

255

explain:

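strlen counts the number of characters before the first \0 (that is, the value 0)

Assuming char is signed, a[ i ] contains -1, -2, -3 ... -128, 127, 126 ... 3, 2, 1, 0, and then the sequence repeats

So there are 128+127=255 characters before the first 0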

<7> What is the output of the following program?

#include <stdio.h>
unsigned char i = 0;
int main()
{
for(i = 0;i<=255;i++)
{
printf("hello world\n");
}
return 0;
}

Answer:

Infinite loop

explain:

The range of unsigned char is 0-255, so i<=255 is always true: when i is 255, i++ wraps it back to 0, and the loop never ends

3. Storage of floating-point types in memory

Common floating point numbers:

3.14159

1E10 (that is, 1.0*10^10)

The floating-point family includes: float, double, long double types

The range of floating-point numbers: defined in float.h

1. An example

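#include <stdio.h>
int main(void)
{
	int n = 9;
	float *pFloat = (float *)&n;   // view the bytes of n as a float
	printf("value of n: %d\n",n);
	printf("value of *pFloat: %f\n",*pFloat);
	*pFloat = 9.0;   // store 9.0 in float format into the same bytes
	printf("value of n: %d\n",n);
	printf("value of *pFloat: %f\n",*pFloat);
	return 0;
}

Result:

value of n: 9

value of *pFloat: 0.000000

value of n: 1091567616

value of *pFloat: 9.000000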

2. Floating-point number storage rules

Any binary floating-point number V can be expressed in the following form:

V = (-1)^S * M * 2^E

(-1)^S is the sign: V is positive when S=0 and negative when S=1. M is the significand, with 1<=M<2. 2^E is the exponent part

eg: Convert decimal 5.5 to binary

101.1 (the first digit after the binary point has weight 2 to the power of -1, which is 0.5)

In the form above this is (-1)^0*1.011*2^2 (the binary point moves two places to the left, hence *2^2; in decimal, moving the point two places would be *10^2)

So: S=0, M=1.011, E=2

For a 32-bit floating-point number, the highest bit is the sign bit S, the next 8 bits are the exponent E, and the remaining 23 bits are the significand M

For a 64-bit floating-point number, the highest bit is the sign bit S, the next 11 bits are the exponent E, and the remaining 52 bits are the significand M

Storage of the significand M:

Since 1<=M<2, M always has the form 1.xxxxxx. When M is saved, the leading 1 is therefore not stored and only the xxxxxx part after the binary point is kept; the 1 is added back when the number is read. This saves one bit: for a 32-bit float, the 23 bits reserved for M effectively hold 24 significant bits

Storage of the exponent E:

E is stored as an unsigned integer. If E has 8 bits, its range is 0-255; if E has 11 bits, its range is 0-2047. The real exponent can be negative, though, so a bias (an intermediate value) is added to the real E before it is stored. For an 8-bit E the bias is 127, and for an 11-bit E it is 1023.

eg: For 2^10, E is 10, so in a 32-bit floating-point number it is stored as 10+127=137, that is, 10001001

When the exponent E is read back from memory, there are three cases:

(1) E is neither all 0s nor all 1s:

Subtract 127 (or 1023) from the stored value of E to get the real exponent, and then add the implicit 1 in front of the binary point of M

eg:

The binary form of 0.5 is 0.1, which is 1.0*2^(-1) in floating-point form. E is stored as -1+127=126, which is 01111110. After the leading 1 is dropped from the significand 1.0, the remaining fraction is 0, so the 23 fraction bits are all 0. The binary representation of 0.5 is therefore:

0 01111110 00000000000000000000000

(2) E is all 0s:

In this case the real exponent of the floating-point number is 1-127 (or 1-1023), that is, -126 (or -1022)

The significand M is no longer given the implicit leading 1 and is read as a fraction of the form 0.xxxxxx. This makes it possible to represent plus or minus 0 and very small numbers close to 0.

(3) E is all 1s:

In this case, if the fraction bits of M are all 0, the value is ± infinity (positive or negative depending on the sign bit S); if they are not all 0, the value is NaN (not a number)

Now let's go through the earlier example again:

Starting from the int value 9:

int n=9;

00000000 00000000 00000000 00001001 (binary of the int)

But when its address is cast to float*, the same bits are interpreted differently

0 00000000 00000000000000000001001

Here E is all 0s, so the real exponent is -126 and no implicit leading 1 is added to M, that is, M=0.00000000000000000001001, S=0

So *pFloat is (-1)^0*0.00000000000000000001001*2^(-126), which equals 1.001*2^(-146). This number is extremely small, so it is printed as 0.000000 (%f prints 6 digits after the decimal point)

Starting from the float value 9.0 (after *pFloat=9.0):

9.0 (1001.0 in binary)

Floating-point form: (-1)^0*1.001*2^3

Here S=0, E is stored as 3+127=130, which is 10000010, and the fraction is 001 followed by 20 zeros

Binary representation: 0 10000010 00100000000000000000000

When this is printed with %d, the bits are read as the two's complement of n. The sign bit is 0, so the number is positive and its sign-magnitude form is the same; converted to decimal it is 1091567616
