Data storage in memory (1)

Introduction to Data Types

char //character data type
short //short integer
int //integer
long //long integer
long long //longer integer
float //single-precision floating-point number
double //double-precision floating-point number

These are the built-in types of the C language. Different types occupy different amounts of memory, and the exact sizes depend on the compiler and platform.
Choosing an appropriate type makes good use of memory. In other words, C provides this many types so that the different kinds of values we meet in practice can each be represented well.

Basic Classification of Types

Integer family:

char
 unsigned char
 signed char
short
 unsigned short [int]
 signed short [int]
int
 unsigned int
 signed int
long
 unsigned long [int]
 signed long [int]

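unsigned means the type has no sign bit and can only hold non-negative values; signed means it has a sign bit and can also hold negative values. For short, int, and long the default is signed. Whether plain char is signed or unsigned is implementation-defined, although it is signed on most common compilers.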

Floating point family:

float
double

Constructed types:

Array type
Structure type struct
Enumeration type enum
Union type union

Pointer type:

int *pi;
char *pc;
float *pf;
void *pv;

Void type:

void denotes the empty type (no type).
It is usually used as a function return type, as a function parameter list, or as the target type of a pointer.

In-memory storage of integers

As we said before, creating a variable reserves space in memory for it. So how is the value actually stored in that space?

Sign-magnitude, ones' complement, and two's complement

To store an integer, the computer must use some encoding. Sign-magnitude (original code), ones' complement (inverse code), and two's complement (complement code) are three ways a machine can encode a signed number. Each has two parts, a sign bit and value bits; a sign bit of 0 means "positive" and 1 means "negative".
For a positive number, the three encodings are identical.
For a negative number, the three encodings differ:

Sign-magnitude: write the value directly in binary, with the sign bit set according to whether the number is positive or negative.
Ones' complement: keep the sign bit of the sign-magnitude form and invert every other bit.
Two's complement: add 1 to the ones' complement.

Integers are stored in memory in two's complement.

On virtually all modern computer systems, integer values are represented and stored in two's complement. The reason is that with two's complement the sign bit and the value bits can be processed uniformly; addition and subtraction can also be handled by the same circuit (the CPU only needs an adder). In addition, converting from sign-magnitude to two's complement and back uses the same operation (invert the value bits and add 1), so no extra hardware is needed.

#include <stdio.h>
int main()
{
    int num1 = 10;
    // Creates an int variable, which requests 4 bytes (32 bits) of memory
    // 00000000000000000000000000001010 -- sign-magnitude
    // 00000000000000000000000000001010 -- ones' complement
    // 00000000000000000000000000001010 -- two's complement
    int num2 = -10;
    // 10000000000000000000000000001010 -- sign-magnitude
    // 11111111111111111111111111110101 -- ones' complement
    // 11111111111111111111111111110110 -- two's complement
    return 0;
}

Big endian vs. little endian

Big-endian (storage) mode means that the low-order bytes of a value are stored at the high memory addresses and the high-order bytes at the low addresses; little-endian (storage) mode means that the low-order bytes are stored at the low memory addresses and the high-order bytes at the high addresses.

Why do big-endian and little-endian modes exist? Memory in a computer system is addressed in bytes: each address corresponds to one byte, and a byte is 8 bits. But besides the 8-bit char, C also has the 16-bit short and the 32-bit int or long (the exact widths depend on the compiler). Likewise, on processors wider than 8 bits, such as 16-bit or 32-bit processors, a register is wider than one byte, so there has to be a rule for how the bytes of a multi-byte value are ordered in memory. That rule is the big-endian or little-endian storage mode. Byte order is easiest to see when a value is written in hexadecimal, because every two hexadecimal digits correspond to exactly one byte.
See the following example:

int main()
{
    int a = 0x11223344;
    // Look at the memory at the address of variable a
    return 0;
}

Viewing the memory at the address of a, the bytes appear in the order 44 33 22 11 from low address to high address, so the value is stored in little-endian order.

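Most of the platforms we commonly use (for example x86 and x86-64) are little-endian; byte order is a property of the hardware platform rather than of the compiler itself.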

Look at this example again:

#include <stdio.h>
int check_sys()
{
    int i = 1;
    return (*(char *)&i);
}
int main()
{
    int ret = check_sys();
    if (ret == 1)
    {
        printf("little endian\n");
    }
    else
    {
        printf("big endian\n");
    }
    return 0;
}

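This function determines whether the machine is big-endian or little-endian. The int i holds 1, so on a little-endian machine (such as a typical Visual Studio target) its four bytes in memory are 01 00 00 00 from low address to high address. Taking the address of i and casting it to char * gives a pointer to the lowest-addressed byte only, so dereferencing it yields 1. A result of 1 therefore means little endian; on a big-endian machine the bytes are 00 00 00 01, the first byte is 0, and the function returns 0.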

Some exercises

Let's look at the following code:

#include <stdio.h>
int main()
{
    char a = -1;
    signed char b = -1;
    unsigned char c = -1;
    printf("a=%d,b=%d,c=%d", a, b, c);
    return 0;
}

Result: a=-1, b=-1, c=255

A char variable occupies one byte, that is, 8 bits. For signed char, one bit is the sign bit, leaving 7 value bits, so the largest value is 127 (01111111). Adding 1 to that pattern gives 10000000, which two's complement defines as -128; adding 1 again gives -127, and so on up to -1 (11111111), after which the pattern wraps around to 0, then 1, forming a closed loop.
For unsigned char, all 8 bits are value bits, so the range is 0 to 255.
This exercise also involves an implicit type conversion: char and short operands in an expression are converted to int (or unsigned int) before use. This conversion is called integer promotion. Signed types are promoted by filling the high bits with the sign bit; unsigned types are promoted by filling the high bits with 0. The result for a assumes that plain char is signed, as it is on most common compilers.

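#include <stdio.h>
int main()
{
    char a = -1;
    // 10000000000000000000000000000001 -- sign-magnitude
    // 11111111111111111111111111111110 -- ones' complement
    // 11111111111111111111111111111111 -- two's complement, then truncated
    // 11111111 -- stored in a
    // Integer promotion: char is signed here, so the high bits are filled with the sign bit, 1
    // 11111111111111111111111111111111 -- two's complement
    // 11111111111111111111111111111110 -- ones' complement
    // 10000000000000000000000000000001 -- sign-magnitude --> -1
    signed char b = -1;
    // Same reasoning as a
    unsigned char c = -1;
    // 11111111 -- stored in c
    // c is unsigned, so the high bits are filled with 0 and the value is positive
    // 00000000000000000000000011111111
    printf("a=%d,b=%d,c=%d", a, b, c);
    return 0;
}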

Question 2:

#include <stdio.h>
int main()
{
    char a = -128;
    printf("%u\n", a);
    // %u prints an unsigned decimal number
    return 0;
}

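Result: 4294967168

-128 as a 32-bit int:
10000000000000000000000010000000 -- sign-magnitude
11111111111111111111111101111111 -- ones' complement
11111111111111111111111110000000 -- two's complement
10000000 -- truncated to 8 bits and stored in a
11111111111111111111111110000000 -- after integer promotion (sign bit extended)

%u interprets this bit pattern as an unsigned number, which gives 4294967168.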

Question 3:

#include <windows.h>
#include <stdio.h>
int main()
{
    unsigned int i;
    for (i = 9; i >= 0; i--)
    {
        printf("%u\n", i);
        Sleep(1000);
        // Sleep slows down the printing so the effect is easier to see
    }
    return 0;
}

This is an endless loop. Mathematically, 0 minus 1 is -1, but i is unsigned, so the result wraps around to the all-ones bit pattern (1111...1111), which is 4294967295 for a 32-bit unsigned int. In other words, i is unsigned and can never be less than 0, so the condition i >= 0 is always true.

Question 4:

#include <stdio.h>
#include <string.h>
int main()
{
    char a[1000];
    int i;
    for (i = 0; i < 1000; i++)
    {
        a[i] = -1 - i;
    }
    printf("%zu", strlen(a));
    return 0;
}

Result: 255

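strlen computes the length of a string: it counts characters until it reaches '\0', whose ASCII code value is 0. The array a has type char, so each element holds only 8 bits. Assuming plain char is signed, the stored values run -1, -2, ..., -128, then wrap around to 127, 126, ..., 1, 0. The first 0 is stored at a[255], with 255 non-zero elements before it.
So strlen(a) returns 255.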

Origin blog.csdn.net/m0_74068921/article/details/130918153