C language - storage of data in memory_study notes

Introduction

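As mentioned in the article "C Language - Binary/Shift Operator/Bit Operator_Study Notes", data is stored in memory in binary form, that is, as 0s and 1s.

Integers have three binary representations: sign-magnitude (original code), one's complement (inverse code) and two's complement (complement code), which that article also touches on.
Floating-point numbers are also stored in memory in binary, but the rules are very different from those for integers.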

The following will introduce in detail how integers and floating point numbers are stored in memory.

Storage of integers in memory

There are three binary representations of integers: sign-magnitude (original code), one's complement (inverse code) and two's complement (complement code).

All three representations have a sign bit and value bits. The highest bit is the sign bit, where 0 means "positive" and 1 means "negative"; the remaining bits are the value bits.

For positive integers, the sign-magnitude, one's complement and two's complement forms are identical.

For negative integers, the three representations differ.

Sign-magnitude : obtained by translating the value directly into binary, with the sign bit set according to whether the number is positive or negative.
One's complement : keep the sign bit of the sign-magnitude form unchanged and invert all the other bits.
Two's complement : the one's complement + 1.

For integers, what is actually stored in memory is the two's complement.

Reason :
In computer systems, integer values are always represented and stored in two's complement. The reason is that with two's complement the sign bit and the value bits can be processed uniformly, and addition and subtraction can also be handled uniformly (the CPU only has an adder). In addition, converting from two's complement to sign-magnitude uses the same operation as converting in the other direction (invert the value bits and add 1), so no additional hardware circuitry is required.

For example, in C under the VS2022 environment, an int value and its binary representations look as follows:
[Image: binary representations of an int value]
Is it really the two's complement that is stored? Let's verify it in the compiler (VS2022, x64 environment): run the code, open the debugger, and look at the Memory window.
[Image: code and the Memory window in the debugger]
In this environment the Memory window displays the stored data in hexadecimal. Converting the two's complement shown above to hexadecimal gives:
[Image: the two's complement value converted to hexadecimal]

The byte order shown in the Memory window appears to be exactly the reverse of the hexadecimal value we worked out. Why is this?
This comes down to endianness, which the next section covers.

Endianness

What is endianness

When data larger than one byte is stored in memory, the order of its bytes has to be decided. Depending on that order, storage is classified as big-endian or little-endian. The specific concepts are as follows:

  • Big-endian (storage) mode: means that the low-order byte content of the data is stored at the high address of the memory, and the high-order byte content of the data is stored at the low address of the memory.
  • Little endian (storage) mode: means that the low-order byte content of the data is stored at the low address of the memory, and the high-order byte content of the data is stored at the high address of the memory.

Why are there big and small ends?

This is because computer memory is addressed in bytes: each address corresponds to one byte, and a byte is 8 bits. In C, however, besides the 8-bit char there are also the 16-bit short type and the 32-bit long type (the exact widths depend on the compiler). In addition, on processors wider than 8 bits, such as 16-bit or 32-bit processors, the register width is greater than one byte, so there has to be a rule for how multiple bytes are arranged. This is what leads to the big-endian and little-endian storage modes.

For example:

  • A 16-bit short X is located at address 0x0010 in memory and has the value 0x1122, so 0x11 is the high-order byte and 0x22 is the low-order byte.
  • In big-endian mode, 0x11 is placed at the low address, 0x0010, and 0x22 at the high address, 0x0011.
  • Little-endian mode is just the opposite: 0x22 is at 0x0010 and 0x11 is at 0x0011.

The commonly used x86 architecture is little-endian, while KEIL C51 is big-endian. Many ARM processors and DSPs are little-endian, and some ARM processors allow big-endian or little-endian mode to be selected in hardware.

How to determine the endianness of the current machine

We can write a small program to check:

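#include <stdio.h>

/* Returns 1 on a little-endian machine, 0 on a big-endian one. */
int check_sys(void)
{
	int a = 1;
	/* Read only the byte stored at the lowest address of a. */
	return (*(char*)&a);
}

int main(void)
{
	if (check_sys() == 1)
	{
		printf("little-endian\n");
	}
	else
	{
		printf("big-endian\n");
	}
	return 0;
}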

We first define an integer variable a = 1, whose value in hexadecimal is 0x00000001. On a big-endian machine its bytes run from low address to high address as 00 00 00 01; on a little-endian machine they run 01 00 00 00. So we take the address of a, treat it as a char*, and read the single byte at the starting address. If the machine is little-endian the result is 1; if it is big-endian the result is 0.

Storage of floating point data in memory

The floating-point number family includes: float, double, long double types.

  • The float, double and long double types usually occupy 4 bytes, 8 bytes and 16 bytes of memory respectively. However, these sizes are not fixed and may vary with the operating system, compiler, or hardware architecture; with the Microsoft compiler, for example, long double occupies 8 bytes, the same as double.

Storage regulations for floating point data

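According to the international standard IEEE (Institute of Electrical and Electronics Engineers) 754, any binary floating-point number V can be expressed in the following form:

V = (-1)^S × M × 2^E

  • (-1)^S is the sign: V is positive when S = 0 and negative when S = 1.
  • M is the significand, with 1 ≤ M < 2.
  • 2^E is the exponent part.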

IEEE754 regulations:

For a 32-bit floating-point number, the highest bit stores the sign bit S, the next 8 bits store the exponent E, and the remaining 23 bits store the significand M.
For a 64-bit floating-point number, the highest bit stores the sign bit S, the next 11 bits store the exponent E, and the remaining 52 bits store the significand M.

[Image: bit layout of 32-bit and 64-bit floating-point numbers]
IEEE 754 also has some special provisions for the significand M and the exponent E.

  • As mentioned before, 1 ≤ M < 2, that is, M can be written in the form 1.xxxxxx, where xxxxxx is the fractional part.
  • IEEE 754 stipulates that when M is stored, its leading digit is always 1 by default, so it can be omitted and only the fractional part xxxxxx is saved. For example, when saving 1.01 only 01 is stored, and the leading 1 is added back when the value is read. The purpose is to gain one bit of precision: in a 32-bit floating-point number only 23 bits are left for M, but with the leading 1 implied, 24 significant bits can be represented.

As for the exponent E, the situation is more complicated.
First, E is stored as an unsigned integer (unsigned int).

  • This means that if E is 8 bits, its range is 0~255; if E is 11 bits, its range is 0~2047.
  • However, we know that the exponent in scientific notation can be negative.
  • Therefore, IEEE 754 stipulates that the true value of E has a bias (an intermediate number) added to it before it is stored. For an 8-bit E the bias is 127; for an 11-bit E it is 1023.
  • For example, the E of 2^10 is 10, so when it is saved in a 32-bit floating-point number it is stored as 10+127=137, which is 10001001.

For example

(Define a float variable f with the value 5.0, and another with the value -5.0, as follows.)

  • float f = 5.0
  • float f = -5.0

5.0 in decimal is 101.0 in binary, which is equivalent to 1.01×2^2. Then, according to the format of V above , we can get S=0, M=1.01, E=2.

-5.0 in decimal is -101.0 written in binary, which is equivalent to -1.01×2^2. Then, S=1, M=1.01, E=2.

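Stored as a 32-bit floating-point number, 5.0 therefore has S = 0, a stored exponent of 2 + 127 = 129 (10000001), and the 23 fraction bits 01000000000000000000000.
[Image: bit pattern of 5.0 stored as a 32-bit floating-point number]
For -5.0 only the first bit (the sign bit) changes to 1, so the details are not repeated.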

The process of fetching floating point numbers

Reading the exponent E back from memory can be divided into three cases:

  1. E is neither all 0s nor all 1s

In this case the floating-point number is decoded by the following rule: subtract 127 (or 1023) from the stored value of the exponent E to obtain the true exponent, then put the leading 1 back in front of the significand M.


For example, 0.5 in binary is 0.1. Because the integer part must be 1, the binary point is moved one place to the right, giving 1.0×2^(-1). The stored exponent is -1+127 (the bias) = 126, which is 01111110. The significand 1.0 drops its integer part, leaving 0, which is padded with zeros to 23 bits. The binary representation of 0.5 is therefore:
0 01111110 00000000000000000000000

  2. E is all 0s

In this case the true exponent is 1-127 (or 1-1023). The significand M no longer has the leading 1 added back; it is read as a fraction of the form 0.xxxxxx. This makes it possible to represent ±0 and very small numbers close to 0.

  3. E is all 1s

In this case, if the bits of the significand M are all 0, the value is ±infinity (positive or negative depending on the sign bit S); if they are not all 0, the value is NaN (not a number).

Practice example

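#include <stdio.h>

int main(void)
{
    int n = 9;
    /* View the bytes of n as a float. This cast breaks the strict-aliasing
       rule; it is used here only to expose the bit pattern. */
    float* pFloat = (float*)&n;

    printf("Value of n: %d\n", n);
    printf("Value of *pFloat: %f\n", *pFloat);

    /* Overwrite the same 4 bytes with the float 9.0. */
    *pFloat = 9.0f;

    printf("Value of n: %d\n", n);
    printf("Value of *pFloat: %f\n", *pFloat); /* 9.0 */
    return 0;
}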

Running results:
[Image: program output]
The explanation is as follows:
[Image: explanation of the output]

Origin blog.csdn.net/yjagks/article/details/132907884