[C Language] The principle of storing integer data in memory (with examples and analysis)

Preface

Coding language: C language
Development environment: Visual Studio 2022
Through this article, you will learn:

  1. Integer family members;
  2. How integers are stored in memory;
  3. Big-endian and little-endian byte order.

1. Data type

1.1. Common data types


1.1.1. Classification of basic data types

The basic data types are classified as follows:
[Figure: classification of the basic data types]
Notes on the figure:

  1. signed means the type can hold negative values and unsigned means it cannot. A plain int is a signed int, and the same holds for short, long and long long, so signed can usually be omitted. The exception is char: whether plain char is signed or unsigned is implementation-defined (it is signed by default in Visual Studio).
  2. [int] means that the int can be omitted. For example, unsigned short int can be written as unsigned short.
  3. The char type is a character type. A character is stored as its ASCII code value, which is an integer, so char is classified into the integer family.
  4. The members of the integer family differ only in the range of values they can store; all of them store integers. The signed types can represent positive integers, 0 and negative integers. The unsigned types cannot represent negative integers.

1.1.2. The size of storage space occupied by basic data types

[Figure: storage space occupied by each basic data type]
Notes on the figure:

1. The size of long differs between compilers. The C language only stipulates that the long type occupies at least as much space as the int type ( sizeof(long) >= sizeof(int) ).
2. The C standard requires long long to be at least 64 bits wide, and it is 8 bytes on mainstream platforms. The standard does not fix the sizes of float and double, but on platforms that use IEEE 754 floating point (including Visual Studio) float is 4 bytes and double is 8 bytes.

1.2. Why are there different data types?

  1. Different types occupy different amounts of space, so space can be allocated to fit the data being stored.
  2. The data type determines the size of the memory allocated, and the size in turn determines the range of values that can be stored.
  3. The data type determines how the bytes in that memory are interpreted.

2. Storage of integer data in memory

To create a variable, space must be set aside for it, and the size of this space is determined by the data type. Once the variable exists, the next step is to store data in it. How is 10 stored? How is -10 stored? The following sections explain how data of the integer family is stored.

2.1. Computer data storage

In a computer, all data is stored in binary form.

2.1.1. Common base systems

Decimal numbers, the most common: composed of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. A carry occurs at every 10.

Binary numbers: composed of 0 and 1. A carry occurs at every 2.

Hexadecimal numbers: composed of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f. A carry occurs at every 16.
If the hexadecimal digits above 9 were written as 10, 11, 12, 13, 14, 15, they would be ambiguous with sequences of the digits 0 to 5, so the values 10 to 15 are represented by a, b, c, d, e, f respectively.

Octal numbers: composed of 0, 1, 2, 3, 4, 5, 6, 7. A carry occurs at every 8.

To avoid confusing hexadecimal, octal and decimal numbers that are written with the same digits, a hexadecimal number is prefixed with 0x and an octal number is prefixed with 0.
Before C23, standard C had no prefix for writing a binary number directly (C23 adds 0b, which some compilers already accepted as an extension).

Example:
The 0x in 0x1234 has no meaning other than marking 1234 as a hexadecimal number.
The leading 0 in 01234 marks it as an octal number.
1234 without any prefix is 1234 in decimal. The binary form of decimal 1234 is 010011010010. See the next section for how to convert.

2.1.2. Base conversion method

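The general method can be inferred from two examples:

  1. Convert any base to decimal
    Multiply each digit by the weight of its position (the base raised to the position, counting from 0 at the right) and add the products.
    For example, binary 1010 is 1×2^3 + 0×2^2 + 1×2^1 + 0×2^0 = 10, and hexadecimal 1a2 is 1×16^2 + 10×16^1 + 2×16^0 = 418.

  2. Convert decimal to an arbitrary base (short division)
    Divide the number by the target base repeatedly until the quotient is 0, then read the remainders from last to first.
    For example, dividing 10 by 2 repeatedly gives the quotients 5, 2, 1, 0 with the remainders 0, 1, 0, 1, so decimal 10 is binary 1010.

  3. Use a base conversion calculator. Here we use the calculator that comes with Windows.
    Open the Windows Calculator, click Open Navigation in the upper left corner, and select Programmer mode.
    HEX is hexadecimal
    DEC is decimal
    OCT is octal
    BIN is binary
    Click DEC and enter 418 to see its representation in each base.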

2.2. Original code, one's complement, two's complement

Original code (sign-magnitude): the binary representation of the number in which the highest (leftmost) bit is the sign bit (0 is positive, 1 is negative) and the remaining bits are value bits that hold the absolute value.

Take one byte (8 bits) as an example: 00010101 is the original code of 21 and 10010101 is the original code of -21.
If the original codes of 21 and -21 are added, is the result 0?
    00010101 + 10010101 = 10101010
The binary form of 0 should be 00000000, but the result above is 10101010, which is the original code of -42. Now try adding 21 and 1:
    00010101 + 00000001 = 00010110, which is 22
Adding positive numbers works. Now look at adding negative numbers, such as -21 and -1:

    10010101 + 10000001 = 1 00010110, which is 00010110 once the carry out of the 8 bits is dropped
-21 + (-1) should be -22, but the result is 22. These three examples show:

Arithmetic on original codes gives wrong results as soon as a negative number is involved.

One's complement was introduced to solve the problem of calculating with negative numbers.

One's complement: the one's complement of a positive number is the original code itself. The one's complement of a negative number keeps the sign bit of the original code and inverts the remaining bits.

For example (taking one byte as an example):

    21: original code 00010101, one's complement 00010101
    -21: original code 10010101, one's complement 11101010
Because the one's complement of a positive number is the same as its original code, positive numbers can still be added correctly. Now try adding a negative number in one's complement, for example 1 + (-21):
    00000001 + 11101010 = 11101011, which is the one's complement of -20
This looks correct, but there are special cases, for example 21 + (-1):
    00010101 + 11111110 = 1 00010011, which is 00010011 (19) once the carry is dropped
Why is the result 1 too small? Because 0 is represented twice, as +0 and -0. The one's complement of +0 is the original code itself: 00000000. The original code of -0 is 10000000, so the one's complement of -0 is 11111111. Because zero takes up two codes, a result that crosses zero comes out off by 1. This is the weakness of one's complement. Two's complement was introduced to solve this cross-zero problem.

Two's complement: for a positive number, the original code, one's complement and two's complement are all the same. The two's complement of a negative number is its one's complement plus 1.
For example, the three codes of -21 are:
    original code 10010101, one's complement 11101010, two's complement 11101011
With two's complement, +0 and -0 are both 00000000, so the duplicate zero no longer affects calculation.

2.2.1. Why two's complement is used

In fact, integer data is stored in memory in two's complement form. When we write an integer, it is first converted to binary, which gives the original code, and then converted to two's complement for storage and calculation.
Why convert to two's complement before calculating? Isn't that troublesome?
With two's complement, positive and negative integers can be added directly, sign bit included. The CPU then only needs an adder, because subtraction can be performed by adding a negative number. In addition, converting original code to two's complement and two's complement back to original code uses the same operation (invert the value bits and add 1), so no extra hardware is needed for the conversion. Overall this is more efficient, not less.
Finally, let's look back at how 10 and -10 are stored:
    10: original code, one's complement and two's complement are all 00000000 00000000 00000000 00001010 (0x0000000a)
    -10: original code 10000000 00000000 00000000 00001010, one's complement 11111111 11111111 11111111 11110101, two's complement 11111111 11111111 11111111 11110110 (0xfffffff6)
The memory window of the development environment shows how 10 and -10 are stored:
[Figure: memory window in Visual Studio 2022 showing the variables num and num2]
Memory holds binary data, but for convenience of display the environment (VS2022) shows it in hexadecimal. Note that the bytes shown in memory must be read in reverse order; the reason is explained in the section on endianness. The variables num and num2 each occupy 4 bytes (32 bits). One hexadecimal digit stands for 4 binary digits, so 8 hexadecimal digits stand for 32 binary digits. Therefore 00 00 00 0a in hexadecimal is 00000000 00000000 00000000 00001010 in binary, which is 10.

The same reasoning applies to -10: ff ff ff f6 in hexadecimal is 11111111 11111111 11111111 11110110 in binary, which is the two's complement of -10.

2.3. Value range stored in integer type

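Typical value ranges of some integer family types:
char: -128 to 127 (signed)
unsigned char: 0 to 255 (unsigned)

short: -32,768 to 32,767 (signed)
unsigned short: 0 to 65,535 (unsigned)

int: -2,147,483,648 to 2,147,483,647 (signed)
unsigned int: 0 to 4,294,967,295 (unsigned)

long, where it is 8 bytes (for example on 64-bit Linux; in Visual Studio long is 4 bytes and has the same range as int): -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (signed)
unsigned long, where it is 8 bytes: 0 to 18,446,744,073,709,551,615 (unsigned)

How are these ranges obtained? Take the char type as an example:
The size of the char type is 1 byte (8 bits).
For a signed char, the highest bit is the sign bit. 00000000 to 01111111 are 0 to 127, 11111111 down to 10000001 are -1 to -127, and the remaining pattern 10000000 is defined as -128, so the range is -128 to 127. For an unsigned char all 8 bits are value bits, so 00000000 to 11111111 are 0 to 255.
Adding 1 repeatedly cycles through the range:
For a signed char: 0, 1, ..., 127, then 01111111 + 1 = 10000000, which is -128, then -127, ..., -1, and 11111111 + 1 wraps back to 0.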

3. Big and small endian storage

Memory is addressed in units of 1 byte. An integer that occupies several bytes needs several memory units. In what order are its bytes stored, forward or backward? Different machines store them in different orders, which is why the concepts of big endian and little endian are needed.

3.1. What is endianness?

1. Endianness: the order in which the bytes of a value are stored. It is not relevant to the char type, because a char occupies only one byte.
2. Big-endian storage: the low-order byte of a value is stored at the high address, and the high-order byte is stored at the low address.
3. Little-endian storage: the low-order byte of a value is stored at the low address, and the high-order byte is stored at the high address.

For example, store the number 0x11223344 in a variable of type int:
    Big endian, from low address to high address: 11 22 33 44
    Little endian, from low address to high address: 44 33 22 11

3.2. Test whether the current machine is big endian or little endian

Create an integer variable a and assign 1 to it. The variable occupies 4 consecutive bytes in memory that hold the value 1. If the machine is little endian, the byte at the lowest address of a is 1; otherwise the machine is big endian.
Why does the address of a, of type (int*), need to be cast to (char*)?
Dereferencing an (int*) address accesses an object of 4 bytes;
dereferencing a (char*) address accesses an object of 1 byte.
We only need to check whether the first byte at the address of a is 1 and must not read the following 3 bytes, so the address of a is cast from (int*) to (char*).
The code is implemented as follows:

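#include <stdio.h>

/* Returns 1 on a little-endian machine and 0 on a big-endian machine. */
int check_system(void)
{
    int a = 1;
    /* Read only the byte at the lowest address of a. */
    return *(char*)&a;
}

int main(void)
{
    int ret = check_system();
    if (ret == 1)
        printf("little endian\n");
    else
        printf("big endian\n");
    return 0;
}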

4. Examples and analysis

Example 1:

#include <stdio.h>
int main()
{
    char a = -1;
    signed char b = -1;
    unsigned char c = -1;
    printf("a=%d b=%d c=%d", a, b, c);
    return 0;
}

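Example 1 analysis:
The two's complement of -1 in 32 bits is 11111111 11111111 11111111 11111111. Storing it in a 1-byte variable keeps only the low 8 bits, so a, b and c all hold 11111111.
With %d, each argument is first promoted to int. a and b are signed (plain char is signed in Visual Studio), so the sign bit is extended and the value is still -1. c is unsigned, so the high bits are filled with 0 and the value is 255.
Output: a=-1 b=-1 c=255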

Example 2:

#include <stdio.h>
int main()
{
    char a = -128;
    char b = 128;
    printf("%u,%u", a, b);
    return 0;
}

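Example 2 analysis:
The two's complement of -128 is 11111111 11111111 11111111 10000000, and a keeps the low 8 bits, 10000000. The binary form of 128 is 00000000 00000000 00000000 10000000, and b also keeps 10000000, which a signed char reads as -128.
When passed to printf, both values are promoted to int with sign extension, giving 11111111 11111111 11111111 10000000. %u reads these 32 bits as an unsigned number, which is 4294967168.
Output (in Visual Studio, where plain char is signed): 4294967168,4294967168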

Example 3:

#include <stdio.h>
int main()
{
    int i = -20;
    unsigned int j = 10;
    printf("%d\n", i + j);
    return 0;
}

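Example 3 analysis:
In i + j one operand is int and the other is unsigned int, so i is converted to unsigned int before the addition. The bits of -20 (11111111 11111111 11111111 11101100) are read as 4294967276, and adding 10 gives 4294967286, whose bits are 11111111 11111111 11111111 11110110.
%d reads these bits as a signed int, and they are the two's complement of -10.
Output: -10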
Example 4:

#include <stdio.h>
int main()
{
    unsigned int i;
    for (i = 9; i >= 0; i--)
    {
        printf("%u\n", i);
    }
    return 0;
}

Example 4 analysis:
The value range of unsigned int is 0 to 4294967295. It can never be negative, so i >= 0 is always true and the loop never ends.
Subtracting 1 from 0 gives a binary sequence of 32 ones, which as an unsigned value is 4294967295. The loop then counts down from 4294967295 to 0 and wraps to 4294967295 again, and so on.

Example 5:

#include <stdio.h>
#include <string.h>
int main()
{
    char a[1000];
    int i;
    for (i = 0; i < 1000; i++)
    {
        a[i] = -1 - i;
    }
    /* strlen returns a size_t, which is printed with %zu */
    printf("%zu", strlen(a));
    return 0;
}

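Example 5 analysis:
a[i] receives -1, -2, ..., -128 for i from 0 to 127. The next value, -129, does not fit in a char: only the low 8 bits are kept, which gives 127. The values then continue 126, 125, ..., 1, 0, after which the cycle repeats.
strlen counts the characters before the first '\0', whose value is 0. Before the first 0 there are 128 negative values and 127 positive values, so the result is 128 + 127 = 255.
Output: 255
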
Example 6:

#include <stdio.h>
int main()
{
    unsigned char i = 0;
    for (i = 0; i <= 255; i++)
    {
        printf("hello world\n");
    }
    return 0;
}

Example 6 analysis:
Infinite loop. An unsigned char occupies one byte, and 255 is the largest value one byte can represent, so i <= 255 is always true. Adding 1 to 255 would need a 9th bit, which cannot be stored; the remaining 8 bits are all 0, so i wraps back to 0 and the loop starts over.

Origin blog.csdn.net/weixin_73276255/article/details/131561806