Stack vs heap: is it faster to allocate memory from the stack or from the heap?


Author | Code Farmer's Deserted Island Survival

Source | Code Farmer's Deserted Island Survival

A reader asked whether it is faster to allocate memory from the stack or from the heap. It is a fairly basic question, so let's talk about it today.

Stack memory allocation and release

There is no doubt that allocating memory from the stack is faster, because a stack allocation is nothing more than moving the stack pointer. What does "moving the stack pointer" mean? Taking x86-64 as an example, allocating memory on the stack comes down to a single instruction:

sub $0x40,%rsp

This instruction is what "moving the stack pointer" refers to, and in essence it looks like this:

[Figure: the stack pointer rsp moving down by 0x40 bytes to reserve space on the stack]

Very simple: the rsp register holds the address of the current top of the stack. Because the stack grows from high addresses to low addresses, growing the stack means moving the stack pointer down, which is exactly what the sub instruction does. This instruction moves the top-of-stack pointer down by 64 bytes (0x40), so you can say that 64 bytes have been allocated on the stack.

As you can see, allocating memory on the stack really is that simple: a single machine instruction.
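To make that concrete, here is a minimal sketch (what the compiler actually emits depends on the compiler, platform, and optimization flags): a function with a 64-byte local buffer typically gets its space through nothing more than a stack-pointer adjustment like the sub above.

// Minimal sketch: 64 bytes of locals. An unoptimized x86-64 build will
// usually reserve this space in the prologue with something like
// "sub $0x40,%rsp" -- no allocator, no bookkeeping, just pointer arithmetic.
void local_buffer() {
  char buf[64];
  buf[0] = 'a';   // touch the buffer so it is not entirely dead code
}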

Releasing stack memory is just as simple and also takes only one machine instruction:

leave

The leave instruction copies the frame base pointer rbp into rsp, so the stack pointer moves back to the top of the previous stack frame, and then pops rbp, so that rbp points at the base of the previous stack frame:

[Figure: rsp and rbp restored to the previous stack frame after leave]

After leave executes, rbp and rsp both point back at the previous stack frame, which is equivalent to popping the current frame. The memory that frame occupied is no longer in use, and that is exactly what we usually mean by "reclaiming" memory: a single leave instruction is enough to release stack memory.
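Once a frame has been popped, its memory is free to be overwritten by the very next call, which is why the classic bug below is a bug; a minimal sketch:

// Minimal sketch: returning the address of a local variable. After bad()
// returns, its stack frame has been reclaimed (leave/ret), so the pointer
// dangles; most compilers warn about this.
int* bad() {
  int x = 42;
  return &x;      // WRONG: x lives in bad()'s now-reclaimed stack frame
}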


Next, let's look at how memory is allocated and released in the heap area.

Heap memory allocation and release

Heap allocation is a different story altogether. Just how complicated is it?

Allocating and releasing memory in the heap area is a relatively involved process, because the heap has to be managed in software (by the implementer of the memory allocator), whereas the stack is maintained by the compiler. Managing the heap also comes down to allocating and releasing memory, but it is nowhere near as simple as in the stack area. In a word, the heap supports allocating and releasing memory on demand: every block allocated from the heap has its own lifetime, and that lifetime is not tied to any function call; it is decided by the programmer. I like to think of dynamic allocation and release as looking for a spot in a parking lot.

[Figure: heap allocation as finding a free spot in a parking lot]

This obviously complicates things. The allocator must carefully track which memory is in use and which is free, find a suitable free block for each request, reclaim blocks the programmer no longer needs, and avoid serious memory fragmentation, a problem that simply does not exist for stack allocation and release. On top of that, when heap space runs out, the heap has to be expanded. All of this makes allocating memory from the heap far more complicated than allocating it on the stack.
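To get a feel for the bookkeeping involved, here is a toy first-fit free-list sketch. The names toy_malloc and toy_free are made up for illustration; real allocators such as ptmalloc, jemalloc, or tcmalloc are far more sophisticated (size classes, thread caches, coalescing, growing the heap via brk/mmap, and so on).

#include <cstddef>

// Each block carries a small header: its size, whether it is free,
// and a link to the next block.
struct Block {
  std::size_t size;
  bool free;
  Block* next;
};

Block* head = nullptr;   // start of the block list (populated when the heap grows; omitted here)

void* toy_malloc(std::size_t n) {
  for (Block* b = head; b != nullptr; b = b->next) {
    if (b->free && b->size >= n) {   // first fit
      b->free = false;
      return b + 1;                  // user memory sits right after the header
    }
  }
  return nullptr;   // a real allocator would grow the heap here
}

void toy_free(void* p) {
  if (p == nullptr) return;
  Block* b = static_cast<Block*>(p) - 1;   // step back to the header
  b->free = true;                          // real allocators also coalesce neighbors
}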

Having said all that, how much slower is allocating memory on the heap than on the stack?

Next, let's write a piece of code to experiment.

Code

#include <cstdlib>    // malloc/free
#include <iostream>
using namespace std;

long long GetTimeStampInUs();   // returns the current time in microseconds (not shown in the original post; see the sketch below)

void test_on_stack() {
  int a = 10;
}

void test_on_heap() {
  int* a = (int*)malloc(sizeof(int));
  *a = 10;
  free(a);
}

void test() {
  auto begin = GetTimeStampInUs();
  for (int i = 0; i < 100000000; ++i) {
    test_on_stack();
  }
  cout << "test on stack " << ((GetTimeStampInUs() - begin) / 1000000.0) << endl;

  begin = GetTimeStampInUs();
  for (int i = 0; i < 100000000; ++i) {
    test_on_heap();
  }
  cout << "test on heap " << ((GetTimeStampInUs() - begin) / 1000000.0) << endl;
}
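One note before walking through it: the original post does not show GetTimeStampInUs(). If you want to run this yourself, a plausible stand-in based on std::chrono (my assumption, not the author's code) is:

#include <chrono>

// Assumed helper: current time in microseconds, using a monotonic clock.
long long GetTimeStampInUs() {
  using namespace std::chrono;
  return duration_cast<microseconds>(
      steady_clock::now().time_since_epoch()).count();
}

The numbers below also assume one build without optimization (e.g. g++ -O0) and one with it (e.g. g++ -O2); absolute timings will vary with your machine and allocator.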

This code is very simple; there are two functions:

  • test_on_stack defines a local variable, which allocates an integer-sized piece of memory on the stack

  • test_on_heap requests an integer-sized piece of memory from the heap and then frees it

Then the test function calls each of them 100 million times and records how long each loop takes. The results are:

test on stack 0.191008
test on heap 20.0215

As you can see, the stack version takes only about 0.2 s in total, while the heap version takes about 20 s, a difference of roughly 100x.

Note that the program above was compiled without optimization. With compiler optimization turned on, the timings become:

test on stack 0.033521
test on heap 0.039294

As you can see, the two are now almost identical. But why? Common sense says the stack should be faster, so what is going on?

Since optimization is enabled, what does the optimized code actually do? Let's look at the instructions the compiler generated:

test_on_stack():
  400f85:       55                      push   %rbp
  400f86:       48 89 e5                mov    %rsp,%rbp
  400f89:       5d                      pop    %rbp
  400f8a:       c3                      retq

test_on_heap():
  400f8b:       55                      push   %rbp
  400f8c:       48 89 e5                mov    %rsp,%rbp
  400f8f:       5d                      pop    %rbp
  400f90:       c3                      retq

Aha, the compiler is smart: it notices that neither function actually does anything useful. We assign 10 to the variable a, but we never use a afterwards, so the compiler simply generates empty functions; the machine instructions above are nothing but the prologue and epilogue of an empty function.

I (Xiaofeng) tried repeatedly to fool the compiler here, making the computation of a more and more elaborate, but the compiler still cleverly reduced everything to an empty function; I never managed to outsmart it, and the generated machine code stayed extremely efficient. How to write a better benchmark, one that lets us compare the two allocation methods fairly with optimization enabled, is left open here; readers with compiler-optimization experience are welcome to leave a comment.
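For readers who want to experiment: one common trick, borrowed from benchmarking libraries such as Google Benchmark and assuming GCC or Clang, is an empty inline-asm "escape" that forces the compiler to assume the value is observed, so the work cannot be deleted. A sketch:

#include <cstdlib>

// Pretend to the compiler that p escapes and memory may be touched,
// so the store (and the allocation feeding it) cannot be optimized away.
static inline void escape(void* p) {
  asm volatile("" : : "g"(p) : "memory");
}

void test_on_stack() {
  int a = 10;
  escape(&a);
}

void test_on_heap() {
  int* a = (int*)malloc(sizeof(int));
  *a = 10;
  escape(a);
  free(a);
}

Even then, a tight malloc/free loop mostly measures the allocator's fast path, since the just-freed block is reused immediately, so treat any numbers as illustrative rather than definitive.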

Finally, let's look at where each of these two memory regions fits and how they differ.

Difference between stack memory and heap memory

First, remember that the stack is a last-in, first-out structure: the stack area grows as functions call deeper and shrinks as those calls return, so it needs essentially no "management". By the same token, the lifetime of memory allocated on the stack is bound to the function: once the call returns, the memory of its stack frame is no longer valid. The stack is also limited in size, so you cannot allocate too much memory on it, as in this C code:

void test() {
  int b[10000000];   // roughly 40 MB of locals, far more than a typical stack allows
  b[1000000] = 10;   // writing into it touches memory well past the stack limit
}

Running this code crashes with a core dump, because the stack area is quite small; allocating a block this large blows right past it. This is the proverbial Stack Overflow (no, not the website, although that is probably where you will end up searching for the error).
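If you are curious how much stack your process actually gets, you can ask the operating system. A minimal POSIX sketch (Linux or macOS assumed; "ulimit -s" in a shell reports the same limit, often around 8 MB on Linux):

#include <sys/resource.h>
#include <cstdio>

int main() {
  struct rlimit rl;
  if (getrlimit(RLIMIT_STACK, &rl) == 0 && rl.rlim_cur != RLIM_INFINITY) {
    // rlim_cur is the soft limit in bytes
    std::printf("stack limit: %llu KB\n",
                (unsigned long long)(rl.rlim_cur / 1024));
  }
  return 0;
}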

The heap is different. The lifetime of memory allocated on the heap is controlled by the programmer, who decides when to request memory and when to release it; that is why the heap has to be managed. The heap area is also much larger, and when heap space runs low the allocator asks the operating system to expand it and obtain more address space.

Of course, with that flexibility comes responsibility: the programmer must make sure heap memory is released once it is no longer needed, otherwise it leaks. Stack allocation has no such problem.
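In C++, one common way to reduce that risk is to tie the heap allocation's lifetime to a stack object (RAII); a minimal sketch using std::unique_ptr:

#include <memory>

void no_leak() {
  // The unique_ptr itself lives on the stack; the int it owns lives on the heap
  // and is freed automatically when the pointer goes out of scope, even on
  // early return or exception.
  auto a = std::make_unique<int>(10);
  *a += 1;
}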

Summary

The stack area is managed automatically, while the heap area is managed manually, and allocating memory on the stack is clearly faster than allocating it on the heap. The scenarios where stack memory can be used are more limited, though, and heap memory demands more care from the programmer. Personally, I prefer to use stack memory whenever it meets the requirements.

I hope this article helps you understand the stack and the heap.

