In-depth explanation - Rust ownership and memory management mechanism

1. Start with variables

fn main() {
    // Primitive data type
    let a = 5;
    let b = a;
    // References (pointers)
    let ptr_a = &a;
    let ptr_b = &b;
    println!("a value = {}", a);
    println!("b value = {}", b);
    println!("ptr_a value = {:p}", ptr_a);
    println!("ptr_b value = {:p}", ptr_b);
}

// Output
a value = 5
b value = 5
ptr_a value = 0x8bdecff6b0
ptr_b value = 0x8bdecff6b4

In this example, we bind the value 5 to variable a and then assign a to variable b. We then define ptr_a and ptr_b as references holding the addresses of a and b respectively, and call the println! macro to print each of them.

Think about a question: without variables, could we pass the value 5 and a string directly as arguments to the output macro?

Yes. But then only static output is possible, not dynamic output. If the user needs to input data, perform addition and subtraction, or concatenate strings, then variables are needed for dynamic processing.

How are variables bound to certain data?

While the program runs, its data lives in memory. Fetching the value 5 directly through the memory address 0x8bdecff6b0 would be very inconvenient, so we give that memory location a name: variable a. Accessing a means accessing the value stored at that address, which yields 5. Sometimes we only want the memory address of the data, and the reference operator & provides it. &a is the address of a, 0x8bdecff6b0, and that address is stored in ptr_a. Accessing ptr_a therefore gives the address of a (0x8bdecff6b0), not the value 5 stored there. To reach the value, apply the dereference operator *: *ptr_a reads the value at address 0x8bdecff6b0, which is 5.
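The relationship between a variable, its reference, and the dereferenced value can be sketched as follows. This is a minimal illustration; read_through is a hypothetical helper that does not appear in the original code, and the printed address will differ on every run.

```rust
// Hypothetical helper (not in the original article): reads a value through a reference.
fn read_through(ptr: &i32) -> i32 {
    // `*ptr` dereferences the reference and yields the i32 it points to.
    *ptr
}

fn main() {
    let a = 5;
    // `&a` creates a reference that holds the address of `a`.
    let ptr_a = &a;

    // `{:p}` prints the address stored in the reference.
    println!("address of a = {:p}", ptr_a);
    // `*ptr_a` follows the address back to the value.
    println!("value behind ptr_a = {}", *ptr_a);
    println!("read_through(ptr_a) = {}", read_through(ptr_a));
}
```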

Strictly speaking, this explanation is not rigorous, so let's bring in the concepts of stack and heap for further analysis.

2. Stack and Heap

The core purpose of the stack and heap is to provide memory space for the program to use when it is running.

——Rust Course

1. Stack

All data on the stack must occupy a known, fixed amount of memory. The stack follows the last-in-first-out principle: adding variable a to the stack is called pushing, and removing variable ptr_b from the stack is called popping. When the function main executes, its variables are pushed onto the stack in sequence. When the function returns, they are popped in reverse order and the corresponding memory is released, which is how stack memory is reclaimed.

2. Heap

In contrast to the stack, data on the heap may have a size that is unknown at compile time or that changes at run time. To store data on the heap, the program requests a block of memory large enough to hold it (from the memory allocator, which in turn obtains memory from the operating system) and receives a memory address back. This process is called allocating memory on the heap. Finally, that memory address is pushed onto the stack and bound to a variable.
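As a rough illustration of heap data whose size changes at run time, the sketch below grows a String and prints its length and capacity. The helper grow is hypothetical, and the exact capacity values depend on the allocator and the standard library's growth strategy, so no specific numbers are assumed.

```rust
// Hypothetical helper: appends text, which may force the heap buffer to be reallocated.
fn grow(mut s: String, extra: &str) -> String {
    s.push_str(extra);
    s
}

fn main() {
    // The characters live on the heap; `s` itself is a fixed-size handle on the stack.
    let s = String::from("hello");
    println!("len = {}, capacity = {}", s.len(), s.capacity());

    // After growing, the heap data is larger, but the stack handle has the same size.
    let s = grow(s, ", world");
    println!("len = {}, capacity = {}", s.len(), s.capacity());
    println!("size of the stack handle = {} bytes", std::mem::size_of::<String>());
}
```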

3. Performance comparison between stack and heap 

| | Stack | Heap |
| --- | --- | --- |
| Write performance | Fast. Pushing onto the stack needs no new allocation; the new data is simply placed on top of the stack. | Slow. Memory must be requested from the allocator before the data can be stored. |
| Read performance | Fast. Stack data is small and contiguous, so it is very likely to already be in the CPU cache. | Slow. The pointer on the stack must be read first, and the heap data is then reached through that pointer, which is less cache-friendly. |

It can be seen that processing data on the stack is generally the more efficient of the two.

4. Copy

Copying data on the stack is called a shallow copy (Copy), and copying data on the heap is called a deep copy (Clone).
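The difference between the two kinds of copy can be sketched like this. Point and duplicate are hypothetical names introduced only for illustration: Point derives Copy because all of its fields are stack-only values, while a String must be duplicated explicitly with clone.

```rust
// Hypothetical type for illustration: all fields are Copy, so the struct can be Copy too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical helper: makes an explicit deep copy of a String.
fn duplicate(s: &String) -> String {
    s.clone()
}

fn main() {
    let p1 = Point { x: 1, y: 2 };
    // Copy: the bits are copied on the stack, and p1 stays valid.
    let p2 = p1;
    println!("p1 = {:?}, p2 = {:?}, sum = {}", p1, p2, p1.x + p2.y);

    let s1 = String::from("hello");
    // Clone: a second heap buffer is allocated and filled with the same bytes.
    let s2 = duplicate(&s1);
    println!("s1 = {}, s2 = {}", s1, s2);
}
```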

3. Ownership and stack

All programs must deal with computer memory. How to request memory to hold a program's running state, and how to release it once it is no longer needed, is a top priority and one of the hard problems in the design of every programming language. As programming languages have evolved, three schools of thought have emerged:

  • Garbage collection (GC): the runtime continuously looks for memory that is no longer in use while the program runs. Typical representatives: Java, Go
  • Manual memory management: the program allocates and releases memory through explicit function calls. Typical representative: C++
  • Ownership: memory is managed through ownership, which the compiler checks against a set of rules at compile time.

Among them, Rust chose the third. Best of all, these checks happen only at compile time, so they cost nothing while the program runs.

——Rust Course

After understanding the principles of variables and stacks, we officially enter today's topic - Rust ownership and memory management mechanism. Let's look at this piece of code:

fn main() {
    // Compound data type
    let x = String::from("hello");
    let y = String::from("world");
    let ptr_x = &x;
    let ptr_y = &y;
    println!("x value = {}", x);
    println!("y value = {}", y);
    println!("ptr_x value = {:p}", ptr_x);
    println!("ptr_y value = {:p}", ptr_y);
}

// Output
x value = hello
y value = world
ptr_x value = 0x3cddcffa78
ptr_y value = 0x3cddcffa90

Following what we learned about the stack and heap, memory is allocated on the heap to store the string hello, and a memory address is returned (say 0x9bdecff001, an illustrative address). That address is stored in variable x, and the same happens for the other string, world. At this point the program runs normally. Now let's change the code slightly and assign x to the variable y. The compiler then reports:

6 |     let x = String::from("hello");
  |         - move occurs because `x` has type `String`, which does not implement the `Copy` trait
7 |     let y = x;
  |             - value moved here
8 |     println!("x value = {}", x);
  |                              ^ value borrowed here after move

After assigning x to y, printing the value of x is a compile error. The compiler reports that a move (ownership transfer) occurred on line 7 of the code because String, the type of x, does not implement the Copy trait. Let's set the Copy trait aside for now and first explain why ownership is transferred.

The Rust Course introduces the principles of ownership:

Keep in mind the following ownership principles:

  1. Every value in Rust is owned by a variable, which is called the owner of the value
  2. A value can only be owned by one variable at the same time, or a value can only have one owner.
  3. When the owner (variable) leaves the scope, the value will be dropped (drop)

——Rust Course
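The third principle can be observed with a small sketch. Tracked and drop_order are hypothetical names used only for this illustration: each Tracked value records its name in a shared log when it is dropped, which shows that values are dropped when their owners leave scope, in reverse order of declaration.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: records its name in a shared log when dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        // Runs automatically when the owner of this value leaves its scope.
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which the two values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _x = Tracked { name: "x", log: &log };
        let _y = Tracked { name: "y", log: &log };
        // Both owners leave scope here; locals are dropped in reverse declaration order.
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order()); // prints ["y", "x"]
}
```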

Therefore, this is what happens on the stack and heap:

1. First, memory is allocated on the heap to store the string hello, and its address (0x9bdecff001 in our illustration) is returned and stored in variable x;

2. After let y = x; executes, variable y is bound to the same heap address 0x9bdecff001 that x held;

3. Variable x no longer owns the data at 0x9bdecff001, because ownership has been transferred to variable y. Any later access to x is therefore rejected at compile time, since x was invalidated by the move.

Think about it: why does variable x transfer ownership to variable y? Wouldn't it work just as well to clone hello on the heap for y?

As mentioned earlier, copying data on the stack (Copy) is cheaper than copying it on the heap (Clone), so implicitly cloning hello on every assignment would be inefficient and waste memory. Moreover, memory is reclaimed as variables are popped off the stack one by one. If x and y both owned the same heap address, that memory would be freed twice (a double free), which can corrupt memory.
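A small sketch can make the cost difference visible. It compares the heap buffer address before and after a move, and between an original and its clone. The function names are hypothetical, and as_ptr is used only to observe the buffer address.

```rust
// Returns true if moving a String keeps using the same heap buffer.
fn move_keeps_buffer() -> bool {
    let x = String::from("hello");
    let before = x.as_ptr();
    // Move: only the stack handle (pointer, length, capacity) is copied to y.
    let y = x;
    before == y.as_ptr()
}

// Returns true if cloning a String allocates a separate heap buffer.
fn clone_makes_new_buffer() -> bool {
    let x = String::from("hello");
    // Clone: a new buffer is allocated and the bytes are copied into it.
    let y = x.clone();
    x.as_ptr() != y.as_ptr()
}

fn main() {
    println!("move keeps buffer: {}", move_keeps_buffer());
    println!("clone makes new buffer: {}", clone_makes_new_buffer());
}
```

Because a move leaves only one owner for the buffer, it is freed exactly once when that owner goes out of scope.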

4. Rust ownership and memory management mechanism

OK, now let us sort out the above content:

1. Rust manages memory through ownership. A value can only be owned by one variable binding.

// hello is bound to x
let x = String::from("hello");
// x transfers ownership of hello to y; hello is now bound to y
let y = x;

2. Rust variables are pushed onto the stack in sequence during execution, and when their scope ends they are popped off in reverse order to release memory.

// 1. x is pushed; 4. x is popped (it owns nothing by then, because its value moved to y)
let x = String::from("hello");
// 2. y is pushed; 3. y is popped first, and the hello it owns is freed
let y = x;

3. Rust primitive types are copied on the stack, and no ownership transfer occurs.

// 1 is bound to x
let x = 1;
// A copy of 1 is made on the stack and bound to y; no ownership transfer happens here
let y = x;

4. Rust compound types cannot be copied on the stack by default, so ownership is transferred.

// hello is bound to x
let x = String::from("hello");
// x transfers ownership of hello to y; hello is now bound to y
let y = x;

5. Rust compound types can be cloned on the heap, with the new handle pushed onto the stack, so no ownership transfer occurs.

// hello is bound to x
let x = String::from("hello");
// x clones hello on the heap; the address of the second hello is bound to y, so no ownership transfer happens here
let y = x.clone();

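6. Passing values to functions and returning them also follow the ownership and stack principles.

fn main() {
    // hello is bound to x
    let x = String::from("hello");
    // When a function is called, its variables and parameters are also pushed onto the stack,
    // and memory is released in pop order when the call ends.
    // x is a compound type and cannot be copied on the stack, so ownership of hello moves into the parameter s.
    // On return, the function transfers ownership of hello to y.
    let y = move_ownership(x);
    // x can no longer be used after this point

    // Primitive types are copied when they are pushed as arguments, so no ownership transfer occurs
    let a = 1;
    let b = plus_one(a);
    // a can still be used after this point
}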

fn move_ownership(s: String) -> String {
    s
}

fn plus_one(x: i32) -> i32 {
    x + 1
}

5. Borrowing and references

OK, we have now sorted out Rust's ownership and memory management mechanism, which runs through the whole Rust language. But moving ownership of a value around all the time is cumbersome. Is there a mechanism that lets us borrow a variable's value instead of transferring its ownership? Yes: this is Rust's borrowing and reference mechanism.

fn main() {
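    // Allocate memory on the heap to store hello and return the address to x
    // x is actually a struct whose fields are a pointer (ptr), a length (len), and a capacity (cap)
    // After binding, x's ptr points to hello, len is 5, and cap is 5
    let x = String::from("hello");
    // Variable y borrows x, which gives access to all of x's fields
    let y = &x;
    // Print the memory address of variable y; y reads the len field borrowed from x
    println!("{:p} {:?}", &y, y.len());
    // Print the memory address of variable x; x reads its own len field
    println!("{:p} {:?}", &x, x.len());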
}

// Output
0x8a94f9f4f0 5
0x8a94f9f4d8 5

Accessing variable y actually goes through the address of x stored in y, which is how the fields of x are borrowed; *y is equivalent to reading x itself, that is, the String whose ptr points to hello.

When main returns, memory is released in stack-pop order: the reference y goes first, then x is dropped, which frees hello on the heap.
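Borrowing is especially useful for function calls. The sketch below assumes a hypothetical helper, describe, that takes a reference instead of an owned String, so the caller keeps ownership and can continue to use the variable afterwards.

```rust
// Hypothetical helper: borrows the String instead of taking ownership of it.
fn describe(s: &String) -> usize {
    // Reads the len field through the reference; nothing is moved or copied on the heap.
    s.len()
}

fn main() {
    let x = String::from("hello");
    // `&x` lends x to the function for the duration of the call.
    let n = describe(&x);
    // x is still valid here because ownership never left main.
    println!("{} has length {}", x, n);
}
```

Compare this with move_ownership above, which has to return the String to hand ownership back to the caller.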

Origin blog.csdn.net/weixin_47560078/article/details/128049578