I can "C" - data storage

 Table of contents

  

1. Introduction to data types

1.1 Basic classification of types:

2. Storage of integers in memory

2.1 Original code, inverse code, complement code

2.2 Big and small endian introduction

2.3 Exercises  

3. Storage of floating-point types in memory 

3.1 An example  

3.2 Floating-point number storage rules  


1. Data type introduction

char        // character data type
short       // short integer
int         // integer
long        // long integer
long long   // longer integer
float       // single-precision floating-point number
double      // double-precision floating-point number
// Is there a string type in the C language? (No: strings are stored in char arrays.)
The meaning of a type:
1. The type determines how much memory is allocated for a variable (and the size determines the range of values it can hold).
2. The type determines the perspective from which that memory is interpreted.

1.1 Basic classification of types:

Integer family:
char
        unsigned char //holds only non-negative values
        signed char //holds both positive and negative values
//Characters are stored and represented by their ASCII values. ASCII values are integers, so the character types also belong to the integer family.
//Whether plain char is signed or unsigned is implementation-defined; the exercises below assume it is signed.
The int we usually use is equivalent to signed int, so signed can be omitted.
short
        unsigned short [ int ]
        signed short [ int ]
int
        unsigned int
        signed int
long
        unsigned long [ int ]
        signed long [ int ]

 Floating point family:

float
double

Constructed type (custom type):

> array type
> structure type struct
> enumeration type enum
> union type union

 pointer type

int * pi ;
char * pc ;
float* pf ;
void* pv ;

empty type: 

void represents an empty type (no type)
Usually applied to the return type of the function, the parameter of the function, the pointer type.
void test(...)//The function does not need to return a value
{}
void test(void)//The function does not require parameters
{}
void* p;//No specific type pointer

2. Storage of integers in memory 

As mentioned earlier, creating a variable allocates space in memory, and the size of that space is determined by the variable's type. So how is the data actually stored in the allocated memory?

For example:

int a = 20;
int b = -10;
We know that four bytes of space are allocated for a.
How is the value stored?
To answer that, we first need to understand the following concepts:

2.1 Original code, inverse code, complement code

There are three binary representations of integers in computers: the original code (sign-magnitude), the inverse code (ones' complement), and the complement code (two's complement).
All three consist of two parts, a sign bit and value bits. The sign bit uses 0 for "positive" and 1 for "negative". The value bits are handled as follows:
For positive numbers, the original, inverse, and complement codes are identical.
For negative integers, the three representations differ:
Original code
Obtained by writing the value directly in binary, with the sign bit set according to whether the number is positive or negative.
Inverse code
Keep the sign bit of the original code unchanged and invert every other bit.
Complement code
Add 1 to the inverse code.

For integers, the data stored in memory is actually the complement code.

Why?

In computer systems, integer values are always represented and stored in two's complement (the complement code). The reason is that with the complement code, the sign bit and the value bits can be processed uniformly; addition and subtraction can also be processed uniformly (the CPU only has an adder). In addition, converting the complement code back to the original code uses the same operation as converting the original code to the complement code, so no additional hardware circuitry is required.

Let's look at the storage in memory:

Inspecting the memory of a and b shows that what is stored is their complement codes. But the byte order looks a bit off.
Why? This is where big and little endian come in. Read on.

2.2 Big and small endian introduction

 What are big endian and little endian:

Big-endian (storage) mode means that the low-order bytes of the data are stored at the high addresses of memory, while the high-order bytes are stored at the low addresses.
Little-endian (storage) mode means that the low-order bytes of the data are stored at the low addresses of memory, while the high-order bytes are stored at the high addresses.

 

 

Why there is big endian and little endian:

Why do big-endian and little-endian modes exist? Memory in a computer system is addressed in bytes: each address corresponds to one byte, and a byte is 8 bits. But in C, besides the 8-bit char, there are also the 16-bit short and the 32-bit long (the exact sizes depend on the compiler). In addition, on processors wider than 8 bits, such as 16-bit or 32-bit processors, a register is wider than one byte, so there has to be a rule for how the multiple bytes are arranged. This leads to the big-endian and little-endian storage modes.
For example, take a 16-bit short x stored at address 0x0010 with the value 0x1122. Then 0x11 is the high byte and 0x22 is the low byte. In big-endian mode, 0x11 goes in the low address, 0x0010, and 0x22 goes in the high address, 0x0011. Little-endian mode is exactly the opposite. The commonly used x86 architecture is little-endian, while KEIL C51 is big-endian. Many ARM processors and DSPs are little-endian, and some ARM processors can select big-endian or little-endian mode in hardware.

 Baidu 2015 system engineer written test questions:

Please briefly describe the concepts of big-endian and little-endian, and design a small program to determine the byte order of the current machine. ( 10 points)
// Code 1
#include <stdio.h>
int check_sys()
{
    int i = 1;
    // Look at the byte at the lowest address of i: it is 1 on a little-endian machine
    return (*(char *)&i);
}
int main()
{
    int ret = check_sys();
    if (ret == 1)
    {
        printf("little endian\n");
    }
    else
    {
        printf("big endian\n");
    }
    return 0;
}
// Code 2
int check_sys()
{
    union
    {
        int i;
        char c;
    } un;
    un.i = 1; // c shares the lowest-addressed byte of i
    return un.c;
}

2.3 Exercises 

1.
// What does this print?
#include <stdio.h>
int main()
{
    char a = -1;
    //10000000 00000000 00000000 00000001 - original code of -1
    //11111111 11111111 11111111 11111110 - inverse code
    //11111111 11111111 11111111 11111111 - complement code
    //11111111 - truncated to 8 bits and stored in a
    //Integer promotion for %d: the high bits are filled with the sign bit
    //11111111 11111111 11111111 11111111 - complement code after promotion
    //11111111 11111111 11111111 11111110 - minus 1 gives the inverse code
    //10000000 00000000 00000000 00000001 - original code, i.e. -1
    signed char b = -1; // same as a (assuming plain char is signed)
    unsigned char c = -1;
    //10000000 00000000 00000000 00000001 - original code of -1
    //11111111 11111111 11111111 11111110 - inverse code
    //11111111 11111111 11111111 11111111 - complement code
    //11111111 - truncated to 8 bits and stored in c
    //For an unsigned type, integer promotion fills the high bits with 0
    //00000000 00000000 00000000 11111111 - 255 in decimal
    //After zero-filling the sign bit is 0, so the original, inverse, and complement codes are the same
    printf("a=%d,b=%d,c=%d", a, b, c);
    return 0;
}

The result is a=-1,b=-1,c=255

 

 

 

What does the following program output?

2.
#include <stdio.h>
int main ()
{
    char a = - 128 ;
    printf ( "%u\n" , a );
    return 0 ;
}

//10000000 00000000 00000000 10000000 - original code of -128
//11111111 11111111 11111111 01111111 - inverse code
//11111111 11111111 11111111 10000000 - complement code
//10000000 - truncated to 8 bits and stored in a
//Integer promotion: a is a signed char, so the high bits are filled with the sign bit (1)
//11111111 11111111 11111111 10000000
//%u interprets this bit pattern as an unsigned number, for which the original, inverse, and complement codes are the same

So the pattern is converted to decimal directly, and the program prints 4294967168 (more than 4.2 billion).

3.
#include <stdio.h>
int main ()
{
    char a = 128 ;
    printf ( "%u\n" , a );
    return 0 ;
}

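 The output is exactly the same as in the previous exercise: 128 is 00000000 00000000 00000000 10000000, which truncates to 10000000, the same 8 bits that -128 leaves in a, and integer promotion again fills the high bits with 1.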

4.
int i = - 20 ;
unsigned   int   j = 10 ;
printf ( "%d\n" , i + j );
// The addition is performed on the complement codes, and %d then formats the result as a signed integer
Detailed explanation:
//10000000 00000000 00000000 00010100 Original code of -20
//11111111 11111111 11111111 11101011 Inverse code of -20 (sign bit unchanged, other bits inverted)
//11111111 11111111 11111111 11101100 Complement code of -20 (inverse code plus 1)
//00000000 00000000 00000000 00001010 Original code of 10 (the original, inverse, and complement codes of positive numbers are the same)
//11111111 11111111 11111111 11110110 Sum of the complement codes of -20 and 10
// (what the computer stores in memory is the complement code)
//11111111 11111111 11111111 11110101 (minus 1)
//10000000 00000000 00000000 00001010 (invert every bit except the sign bit) gives -10
5.
unsigned int i ;
for ( i = 9 ; i >= 0 ; i -- )
{
    printf ( "%u\n" , i );
}

//Original, inverse, and complement codes of -1:
// 10000000 00000000 00000000 00000001 Original code
// 11111111 11111111 11111111 11111110 Inverse code
// 11111111 11111111 11111111 11111111 Complement code
// When i reaches 0 and is decremented again, the result is -1, whose complement code is all 1s. Because i is unsigned, the computer treats this as a very large number (4294967295), so i >= 0 is always true and the loop never ends.

 %u prints the value as an unsigned number.

If the format is changed to %d, -1 is printed at that point.

6.
#include <stdio.h>
#include <string.h>
int main()
{
    char a[1000];
    int i;
    for (i = 0; i < 1000; i++)
    {
        a[i] = -1 - i;
    }
    printf("%zu", strlen(a)); // strlen returns size_t, so use %zu
    return 0;
}
    //int values: -1 -2 -3 -4 ... -127 ... -998 -999 -1000
    //as char:    -1 -2 -3 ... -128 127 126 ... 3 2 1 0 -1 -2 ... -128 127 ...
//strlen computes the length of a string by looking for \0, whose ASCII value is 0, so counting stops at the first element equal to 0. Before it come 128 negative values (-1 to -128) and 127 positive values (127 to 1): 128+127=255
The program prints 255.
The value range of the (signed) char type is -128~127.

7.
#include <stdio.h>
unsigned char i = 0 ;
//0~255
int main ()
{
    for ( i = 0 ; i <= 255 ; i ++ )
  {
        printf ( "hello world\n" );
//i can only hold 0~255. When i is 255, i++ would give 256, which is 1 00000000 in binary; only the low eight bits are kept, so i becomes 0 again. The condition i<=255 is therefore always true and the loop never ends.
  }
    return 0 ;
}
So the result is an infinite loop that keeps printing hello world.

3. Floating-point storage in memory 

Common floating point numbers:

3.14159
1E10
Floating point family includes: float , double , long double types.
The range of floating point numbers: defined in float.h

3.1 An example 

Example of floating-point storage:
#include <stdio.h>
int main()
{
    int n = 9;
    float *pFloat = (float *)&n;
    printf("The value of n is: %d\n", n);
    printf("The value of *pFloat is: %f\n", *pFloat);
    *pFloat = 9.0;
    printf("The value of n is: %d\n", n);
    printf("The value of *pFloat is: %f\n", *pFloat);
    return 0;
}

What is the output?

The value of n is: 9 --> n is printed as an integer

The value of *pFloat is: 0.000000 --> the same bytes read as a floating-point number do not give 9.0, which shows that integers and floating-point numbers are stored differently

The value of n is: 1091567616 --> after 9.0 is stored through pFloat, the same bytes read as an integer are no longer 9

The value of *pFloat is: 9.000000

The last two lines confirm again that integers and floating-point numbers are stored in different forms.

3.2 Floating-point number storage rules 

n and *pFloat are clearly the same data in memory, so why do the floating-point and integer interpretations differ so much?
To understand this result, you must understand how floating-point numbers are represented inside the computer.
Detailed interpretation:
According to the international standard IEEE (Institute of Electrical and Electronics Engineers) 754, any binary floating-point number V can be expressed in the following form:
  • (-1)^S * M * 2^E
  • (-1)^S represents the sign: when S=0, V is positive; when S=1, V is negative.
  • M is the significand, greater than or equal to 1 and less than 2.
  • 2^E is the exponent part.
  • Example: 5.5 is 101.1 in binary, which in scientific notation is 1.011*2^2, so S=0, M=1.011, and E=2.

 

For example:
Decimal 5.0 is 101.0 in binary, which is equivalent to 1.01×2^2.
Then, according to the format of V above, S=0, M=1.01, and E=2.
Decimal -5.0 is -101.0 in binary, which is equivalent to -1.01×2^2. Then S=1, M=1.01, E=2.
IEEE 754 states:
For 32-bit floating-point numbers, the highest 1 bit is the sign bit S, the next 8 bits are the exponent E, and the remaining 23 bits are the significand M.

For 64-bit floating-point numbers, the highest 1 bit is the sign bit S, the next 11 bits are the exponent E, and the remaining 52 bits are the significand M.

IEEE 754 also has some special rules for the significand M and the exponent E.
As mentioned earlier, 1≤M<2, that is, M can be written in the form 1.xxxxxx, where xxxxxx is the fractional part.
IEEE 754 stipulates that when M is stored, its first digit is always assumed to be 1, so that digit can be dropped and only the xxxxxx part is saved.
For example, when saving 1.01, only 01 is stored, and the leading 1 is added back when the number is read. This saves one significant bit. Taking the 32-bit floating-point number as an example, only 23 bits are available for M, but dropping the leading 1 allows 24 significant bits to be represented.
The exponent E is more complicated.
First, E is an unsigned integer (unsigned int).
This means that if E is 8 bits, its range is 0~255; if E is 11 bits, its range is 0~2047. However, E in scientific notation can be negative, so IEEE 754 stipulates that a bias (an intermediate number) must be added to the real value of E when it is stored. For 8-bit E the bias is 127; for 11-bit E it is 1023. For example, the E of 2^10 is 10, so when it is saved as a 32-bit floating-point number it must be stored as 10+127=137, that is, 10001001.
When the exponent E is read back from memory, there are three cases:
E is neither all 0s nor all 1s
In this case, the floating-point number is decoded as follows: subtract 127 (or 1023) from the stored value of E to obtain the real exponent, and then add the leading 1 in front of the significand M.
For example:
The binary form of 0.5 (1/2) is 0.1. Since the integer part must be 1, the binary point is shifted right by one place, giving 1.0*2^(-1). The stored exponent is -1+127=126, which is 01111110. The significand 1.0 with its integer part removed is 0, padded to 23 bits as 00000000000000000000000. The binary representation is therefore:

 0 01111110 00000000000000000000000

E is all 0s

In this case, the real exponent is 1-127 (or 1-1023), and the significand M no longer gets the leading 1 but is read as the fraction 0.xxxxxx. This makes it possible to represent ±0 and very small numbers close to 0.

 E is all 1s

In this case, if the significand M is all 0s, the value is ±infinity (the sign depends on the sign bit S); if M is not all 0s, the value is NaN (not a number).

Well, that's all for the representation rules of floating point numbers.

To explain the previous question:

int main()
{
int n = 9;
float *pFloat = (float *)&n;
printf("The value of n is: %d\n",n);
printf("The value of *pFloat is: %f\n",*pFloat);
*pFloat = 9.0;
printf("The value of n is: %d\n",n);
printf("The value of *pFloat is: %f\n",*pFloat);
return 0;
}

Next, let's go back to the original question: Why does 0x00000009 become 0.000000 when it is restored to a floating point number ?
First, split 0x00000009 to get the first sign bit s=0 , and the exponent of the next 8 bits E=00000000 ,
The last 23 significant digits M=000 0000 0000 0000 0000 1001 .

9 -> 0000 0000 0000 0000 0000 0000 0000 1001  

Since the exponent E is all 0 , it meets the second case in the previous section. Therefore, the floating-point number V is written as:

  V=(-1)^0 × 0.00000000000000000001001×2^(-126)=1.001×2^(-146)  

Obviously, V is a very small positive number close to 0 , so it is 0.000000 in decimal notation . 

Look at the second part of the example.

How is the floating-point number 9.0 represented in binary, and what does that bit pattern give when read as a decimal integer?
First, the floating-point number 9.0 is 1001.0 in binary, which is 1.001×2^3.
9.0 -> 1001.0 -> (-1)^0 × 1.001 × 2^3 -> S=0, M=1.001, E=3+127=130
So the sign bit S=0, the stored significand is 001 followed by twenty 0s to make up 23 bits, and the stored exponent is 3+127=130, that is, 10000010.
Written in binary in the order S+E+M, this is:
0 10000010 001 0000 0000 0000 0000 0000

This 32-bit binary number, read as a decimal integer, is exactly 1091567616. 

THE END

        That's all I have to share about data storage today. I hope it helps everyone! If anything is lacking, please leave Xiaoye some suggestions and I will keep improving the article. Let's keep working at it together! Hahahahaha

 

 

 


Origin blog.csdn.net/Yzl17841857589/article/details/130050024