[Rust] Ownership

ownership

Ownership is Rust's most distinctive feature: it lets Rust guarantee memory safety without a garbage collector (GC). Every program has to manage the way it uses memory at runtime. Some languages use garbage collection, which continually looks for memory that is no longer in use while the program runs. In other languages, the programmer must explicitly allocate and free memory.

Rust takes a third approach: memory is managed through an ownership system, a set of rules that the compiler checks at compile time. Because the checks happen at compile time, the ownership system does not slow the program down while it runs.

stack and heap

In a systems programming language like Rust, whether a value lives on the stack or on the heap affects how the language behaves and why you have to make certain decisions. Both the stack and the heap are memory available to your code at runtime, but they are structured very differently.

  • The stack stores values in the order it receives them and removes them in the reverse order (last in, first out, LIFO). Adding data is called pushing onto the stack, and removing data is called popping off the stack.
  • All data stored on the stack must have a known, fixed size. Data whose size is unknown at compile time, or whose size may change at runtime, must be stored on the heap.
  • The heap is less organized. When you put data on the heap, you request a certain amount of space. The memory allocator finds a large enough free region in the heap, marks it as in use, and returns a pointer, that is, the address of that region. This process is called allocating on the heap, sometimes just "allocating". Pushing a value onto the stack is not considered allocation. Because a pointer has a known, fixed size, the pointer itself can be stored on the stack.
  • Pushing onto the stack is faster than allocating on the heap, because the allocator never has to search for a place to store the new data; that place is always the top of the stack. Allocating on the heap takes more work: the allocator must first find a region large enough to hold the data and then do some bookkeeping to prepare for the next allocation.
  • Accessing data on the heap is slower than accessing data on the stack, because you have to follow a pointer to reach it. Modern processors are faster when they jump around in memory less, because of caching.
  • For the same reason, a processor works faster on data that is close together (as on the stack) than on data that is spread far apart (as it can be on the heap). Allocating a large amount of space on the heap also takes time.
  • When your code calls a function, the values passed into the function (including, potentially, pointers to data on the heap) and the function's local variables are pushed onto the stack. When the function finishes, those values are popped off the stack.
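
As a rough illustration of the points above, the sketch below contrasts a fixed-size value in the stack frame with two heap-backed values. The function name stack_and_heap is made up for this example. Box and Vec are standard library types whose contents live on the heap, while the handle itself has a fixed size and can sit on the stack.

```rust
use std::mem::size_of;

// Hypothetical helper for illustration: combines a stack value, a boxed
// (heap) value, and the length of a growable (heap-backed) vector.
fn stack_and_heap() -> usize {
    let on_stack: usize = 5; // fixed size, lives in the stack frame
    let boxed: Box<usize> = Box::new(10); // pointer on the stack, the 10 on the heap
    let mut growable: Vec<u8> = Vec::new();
    growable.push(1); // pushing may ask the allocator for heap space
    growable.push(2);
    on_stack + *boxed + growable.len()
}

fn main() {
    // A Box<usize> is just a pointer, so it is pointer-sized.
    println!("size of Box<usize>: {} bytes", size_of::<Box<usize>>());
    println!("size of usize:      {} bytes", size_of::<usize>());
    println!("total: {}", stack_and_heap());
}
```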

Ownership exists for a reason

Ownership addresses three problems: tracking which parts of the code are using which data on the heap, minimizing the amount of duplicate data on the heap, and cleaning up unused data on the heap so that you don't run out of space. Once you understand ownership, you won't need to think about the stack and the heap very often, but knowing that managing heap data is why ownership exists helps explain why it works the way it does.

ownership rules

  • Each value has a variable that is its owner.
  • A value can have only one owner at a time.
  • When the owner goes out of scope, the value is dropped.

variable scope

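A scope is the range within a program in which an item is valid.

fn main() {
    // s is not valid here; it has not been declared yet
    let s = "hello"; // s is valid from this point on
                     // operations on s can be performed here
} // the scope ends here, and s is no longer valid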

String type

String is more complex than the basic scalar data types. String literals are string values written directly into the program source, and they are immutable. Rust also has a second string type, String, which is allocated on the heap and can store an amount of text that is unknown at compile time.

fn main() {
    let mut s = String::from("Hello");
    s.push_str(",World");
    println!("{}", s);
}

Why can a value of type String be modified when a string literal cannot? Because the two types handle memory differently.

memory and allocation

The content of a string literal is known at compile time, so its text is hard-coded directly into the final executable. This makes string literals fast and efficient, but it is possible only because they are immutable.

To support a mutable, growable piece of text, the String type has to allocate memory on the heap to hold content whose size is unknown at compile time. This means two things. First, the memory must be requested from the memory allocator at runtime, which is what String::from does. Second, when we are done with the String, there must be a way to return that memory to the allocator. In languages with a GC, the GC tracks and cleans up memory that is no longer used. Without a GC, it is our responsibility to identify when memory is no longer used and to call code that returns it. If we forget, we waste memory; if we do it too early, we are left with an invalid variable; if we do it twice, that is a bug too. Every allocation must be paired with exactly one deallocation.

Rust takes a different approach: the memory is returned automatically once the variable that owns it goes out of scope. When a variable goes out of scope, Rust calls a special function, drop, which frees the memory.
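
To make the moment of the drop call visible, here is a minimal sketch that implements the standard Drop trait on a made-up type. Tracked, its fields, and drop_order are hypothetical names used only for this illustration; each value records its name in a shared log when it is dropped.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: writes its name to a log when dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        // Runs automatically when the owning variable goes out of scope.
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Tracked { name: "outer", log: &log };
        {
            let _inner = Tracked { name: "inner", log: &log };
        } // _inner goes out of scope: drop runs here
        log.borrow_mut().push("between");
    } // _outer goes out of scope: drop runs here
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order()); // ["inner", "between", "outer"]
}
```

Nothing in drop_order calls drop explicitly; the compiler inserts the calls at the closing braces.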

How variables interact with data

1.Move

Multiple variables can interact with the same data in different ways.

let x = 5;
let y = x;

Integers are simple values with a known, fixed size, so x is copied into y and both 5 values are pushed onto the stack.

let s1 = String::from("hello");
let s2 = s1;

A String is made up of three parts: a pointer to the memory that holds the string's contents, a length, and a capacity. These three parts are stored on the stack, while the contents themselves are on the heap. The length (len) is the number of bytes the String's contents currently use. The capacity is the total number of bytes of memory the String has received from the allocator.
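
The three parts can be inspected with the standard as_ptr, len, and capacity methods. The sketch below is only an illustration; string_parts is a made-up helper name, and the exact capacity chosen by String::from is an implementation detail, so the code only relies on it being at least the length.

```rust
// Hypothetical helper: returns (len, capacity) of a String built from `text`.
fn string_parts(text: &str) -> (usize, usize) {
    let s = String::from(text);
    (s.len(), s.capacity())
}

fn main() {
    let mut s = String::with_capacity(16); // reserve room for at least 16 bytes
    s.push_str("hello");

    println!("ptr      = {:p}", s.as_ptr()); // address of the heap buffer
    println!("len      = {}", s.len()); // bytes in use
    println!("capacity = {}", s.capacity()); // bytes reserved, never less than len

    let (len, capacity) = string_parts("hello");
    println!("{} bytes used out of {}", len, capacity);
}
```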


When s1 is assigned to s2, the String data is copied: the pointer, length, and capacity on the stack are copied, but the heap data that the pointer refers to is not. When a variable goes out of scope, Rust automatically calls drop and frees the heap memory the variable uses. If both s1 and s2 were still valid when they went out of scope, they would both try to free the same memory, which is a double free bug.

To ensure memory safety, Rust does not copy the allocated memory. Instead, it treats s1 as invalid after the assignment, so when s1 goes out of scope, Rust does not need to free anything.


If you've heard the terms shallow copy and deep copy in other languages, then copying the pointer, length, and capacity without copying the data might sound like a shallow copy. But because Rust also invalidates the first variable, this operation is called a move, not a shallow copy. This implies a design choice: Rust never automatically creates deep copies of your data, so any automatic copying can be assumed to be cheap in terms of runtime performance.
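
One way to convince yourself that a move leaves the heap data where it is: record the buffer address before the move and compare it afterwards. This is a small sketch, and move_keeps_buffer is a made-up name for the illustration.

```rust
// Hypothetical helper: moves a String and reports whether the new owner
// points at the same heap buffer the old owner did.
fn move_keeps_buffer() -> bool {
    let s1 = String::from("hello");
    let before = s1.as_ptr(); // address of the heap buffer while s1 owns it

    let s2 = s1; // move: s1 is invalid from here on
    // println!("{}", s1); // would not compile: borrow of moved value `s1`

    before == s2.as_ptr() // same buffer, only the owner changed
}

fn main() {
    println!("same buffer after move: {}", move_keeps_buffer());
}
```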

2.Clone

If you really want to make a deep copy of the String data on the heap, not just the data on the stack, you can use the clone method.

let s1 = String::from("hello");
let s2 = s1.clone();
println!("s1 = {}, s2 = {}", s1, s2);
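
By contrast with a move, clone allocates a second buffer and copies the bytes into it. The sketch below checks both facts; clone_makes_new_buffer is a hypothetical helper name.

```rust
// Hypothetical helper: clones a String and checks that the contents are
// equal while the two heap buffers are distinct.
fn clone_makes_new_buffer() -> bool {
    let s1 = String::from("hello");
    let s2 = s1.clone(); // deep copy: new heap allocation, same bytes

    let same_text = s1 == s2;
    let different_buffers = s1.as_ptr() != s2.as_ptr();
    same_text && different_buffers
}

fn main() {
    println!("equal text, separate buffers: {}", clone_makes_new_buffer());
}
```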


3.Copy

let x = 5;
let y = x;

This code seems to contradict what we just learned: clone is not called, but x is still valid and has not been moved into y.

The reason is that types like integers whose size is known at compile time are stored entirely on the stack, so copying their actual value is fast. This means that there is no reason to invalidate x after variable y has been created. In other words, there is no difference between shallow and deep copies, so calling clone here will not be any different from the usual shallow copy.

Rust provides the Copy trait for types, such as integers, that are stored entirely on the stack. If a type implements Copy, a variable of that type is still usable after it has been assigned to another variable. If a type, or any of its parts, implements the Drop trait, Rust does not allow it to implement Copy.

As a general rule, any group of simple scalar values can be Copy, and nothing that requires allocation or is some form of resource is Copy. Some types that implement Copy:

  • All integer types, such as u32
  • bool
  • char
  • All floating point types, such as f64
  • Tuples, if all of their fields are Copy; for example, (i32, i32) is Copy but (i32, String) is not
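
User-defined types can opt in to the same behavior. The sketch below uses the standard #[derive(Clone, Copy)] attribute, which the text above does not mention, on a made-up Point struct; Point and copy_and_reuse are hypothetical names for this illustration.

```rust
// Hypothetical type: every field is Copy, so the struct may derive Copy too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn copy_and_reuse() -> (Point, Point) {
    let p1 = Point { x: 1, y: 2 };
    let p2 = p1; // bitwise copy; p1 stays valid
    (p1, p2)
}

fn main() {
    let (p1, p2) = copy_and_reuse();
    println!("{:?} {:?}", p1, p2);

    let pair = (5, 'a'); // (i32, char): all fields are Copy
    let pair2 = pair;
    println!("{:?} {:?}", pair, pair2); // both are still usable

    // #[derive(Clone, Copy)]
    // struct Named { name: String } // would not compile: String is not Copy
}
```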

Ownership and Functions

Semantically, passing a value to a function is similar to assigning a value to a variable, and passing a value to a function will result in a move or copy.

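fn main() {
    let s = String::from("Hello,World");

    take_ownership(s); // s is moved into the function and is no longer valid here

    let x = 5;

    makes_copy(x); // i32 is Copy, so x is copied and stays valid

    println!("x:{}", x);
}

fn take_ownership(some_string: String) {
    println!("{}", some_string);
} // some_string goes out of scope here and is dropped

fn makes_copy(some_number: i32) {
    println!("{}", some_number);
}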

return value and scope

The transfer of ownership also occurs when the function returns a value.

fn main() {
    let s1 = gives_ownership(); // gives_ownership moves its return value into s1

    let s2 = String::from("hello");

    // s2 is moved into takes_and_gives_back, which moves its return value into s3
    let s3 = takes_and_gives_back(s2);

    println!("{} {}", s1, s3);
}

fn gives_ownership() -> String {
    let some_string = String::from("hello");
    some_string // moved out to the caller
}

fn takes_and_gives_back(a_string: String) -> String {
    a_string // moved out to the caller
}

Ownership of a variable always follows the same pattern: assigning a value to another variable moves it. When a variable that holds heap data goes out of scope, its value is cleaned up by drop, unless ownership of the data has been moved to another variable.

How can I make a function use a value without taking ownership of it?

fn main() {
    let s1 = String::from("hello");

    let (s2, len) = calculate_length(s1);

    println!("The length of '{}' is {}.", s2, len);
}

fn calculate_length(s: String) -> (String, usize) {
    let length = s.len();

    (s, length)
}

But passing ownership in and handing it back out every time is tedious. For this, Rust has a feature called references.

References

In the following code, the type of the parameter is &String instead of String. The & symbol denotes a reference: it lets you refer to a value without taking ownership of it.

fn main() {
    let s1 = String::from("hello");

    let len = calculate_length(&s1);

    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize {
    s.len()
}


Borrowing

Using a reference as a function parameter is called borrowing. A borrowed value cannot be modified through a plain & reference: just as variables are immutable by default, so are references.

mutable reference

fn main() {
    let mut s1 = String::from("Hello");

    let len = calculate_length(&mut s1);

    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &mut String) -> usize {
    s.push_str(",World");
    s.len()
}

Mutable references have one important restriction: within a given scope, there can be only one mutable reference to a particular piece of data. The benefit is that Rust can prevent data races at compile time. A data race happens when these three conditions occur together: two or more pointers access the same data at the same time, at least one of the pointers is used to write to the data, and no mechanism is used to synchronize access to the data. We can create a new scope to allow multiple mutable references, just not simultaneous ones.

    let mut s = String::from("hello");
    {
        let r1 = &mut s;
    } // r1 goes out of scope here, so a new mutable reference can be created
    let r2 = &mut s;

Another restriction is that you cannot have a mutable reference and an immutable reference to the same value at the same time. Multiple immutable references are allowed.
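
The sketch below puts both restrictions in one place. It assumes a current Rust edition, where a reference stops counting as "in use" after its last use, so the mutable borrow is accepted once r1 and r2 are no longer needed. The function name borrow_rules is made up for this illustration.

```rust
// Hypothetical helper showing which borrows may coexist.
fn borrow_rules() -> String {
    let mut s = String::from("hello");

    let r1 = &s; // any number of immutable references is fine
    let r2 = &s;
    let combined_len = r1.len() + r2.len(); // last use of r1 and r2

    // let r3 = &mut s;
    // println!("{}", r1); // would not compile: r1 is still used after `&mut s`

    let r3 = &mut s; // allowed: the immutable borrows are no longer used
    r3.push_str(", world");

    format!("{} ({})", s, combined_len)
}

fn main() {
    println!("{}", borrow_rules()); // hello, world (10)
}
```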

Dangling References

Dangling pointer: a pointer that refers to a memory address whose memory may already have been freed and handed to someone else.

In Rust, the compiler guarantees that references are never dangling:
if you reference some data, the compiler will guarantee that the data doesn't go out of scope until the reference goes out of scope.
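
A short sketch of what the compiler rejects and one common way around it. The names dangle and no_dangle are chosen for this illustration; the rejected function is left commented out so the example compiles.

```rust
// fn dangle() -> &String {
//     let s = String::from("hello");
//     &s // s is dropped when the function ends, so this reference would dangle
// }
// The compiler rejects the function above (it reports a missing lifetime specifier).

// Returning the String itself moves ownership to the caller,
// so nothing is freed and nothing dangles.
fn no_dangle() -> String {
    let s = String::from("hello");
    s
}

fn main() {
    let s = no_dangle();
    println!("{}", s);
}
```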

Rules of references

At any given time, you can have only one of the following:

  • one mutable reference
  • any number of immutable references

References must always be valid.

slice

Slices are another data type in Rust that does not take ownership.

Write a function that takes a string of words separated by spaces and returns the first word found in the string. If the function finds no spaces in the string, the entire string is one word, so the entire string should be returned.

fn main() {
    let mut s = String::from("Hello world");
    let word_index = first_word(&s);

    s.clear();
    println!("{}", word_index);
}

fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    s.len()
}

This program compiles without errors, but word_index is not tied to the state of s in any way. After s is cleared, word_index still holds the value computed when s was passed to the function, even though that index no longer means anything. Rust's answer to this problem is the string slice.

string slice

A string slice is a reference to part of a String. It is written as:

[starting_index..ending_index]

The starting index is the position of the first byte in the slice, and the ending index is one more than the position of the last byte in the slice.

let s = String::from("Hello World");

let hello = &s[0..5];
let world = &s[6..11];

let hello2 = &s[..5];
let world2 = &s[6..];


Range indices for string slices must fall on valid UTF-8 character boundaries. If you try to create a string slice in the middle of a multibyte character, the program panics at runtime.

Rewriting first_word to return a slice:
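
When the indices come from untrusted input, the standard str::get and str::is_char_boundary methods let you check a range instead of panicking. This is a minimal sketch; safe_prefix is a made-up helper name.

```rust
// Hypothetical helper: returns the first `end` bytes of `s` as a slice,
// or None if `end` is out of range or falls inside a multibyte character.
fn safe_prefix(s: &str, end: usize) -> Option<&str> {
    s.get(..end)
}

fn main() {
    let s = "Здравствуйте"; // each of these Cyrillic letters takes 2 bytes in UTF-8

    println!("{}", &s[0..2]); // the first letter
    println!("{}", s.is_char_boundary(1)); // false: byte 1 is inside the first letter
    println!("{:?}", safe_prefix(s, 1)); // None
    println!("{:?}", safe_prefix(s, 2)); // Some("З")

    // let bad = &s[0..1]; // would panic at runtime
}
```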


fn main() {
    let mut s = String::from("Hello World");

    let word = first_word(&s);

    // s.clear(); // error: s cannot be borrowed as mutable while word still borrows it
    println!("the first word is: {}", word);
}

fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    &s[..]
}

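String literals are slices: a literal has type &str, a slice that points to the place in the binary where the text is stored. This is also why string literals are immutable, since &str is an immutable reference.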

Pass a string slice as an argument

Experienced Rust developers use &str as the parameter type, because the function then accepts both &String and &str arguments:

fn first_word(s: &str) -> &str {
    // ... same body as before ...
}

If you have a string slice, you can pass it to the function directly. If you have a String, you can pass a slice of the whole String or a reference to the String.

Using string slices instead of string references when defining functions makes our API more general without losing any functionality.

fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s); // &String is accepted where &str is expected

    let s2 = "hello world";

    let word2 = first_word(s2); // a string literal is already a &str
    // s.clear(); // error: s cannot be borrowed as mutable while word still borrows it
    println!("the first word of s is: {}", word);
    println!("the first word of s2 is: {}", word2);
}

fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

Other types of slices

let a = [1, 2, 3, 4, 5];

let slice = &a[1..3];

This slice has type &[i32]. It works the same way as a string slice: it stores a reference to the first element of the slice and a length.
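
As with &str, taking a slice as the parameter type lets one function work on arrays, vectors, and parts of either. The sketch below is an illustration only; sum is a made-up function name.

```rust
// Hypothetical helper: adds up the elements of any i32 slice.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    let slice = &a[1..3]; // elements at index 1 and 2
    assert_eq!(slice, &[2, 3]);

    let v = vec![10, 20, 30];

    println!("{}", sum(slice)); // 5
    println!("{}", sum(&a)); // 15: a reference to the whole array
    println!("{}", sum(&v[..2])); // 30: part of a Vec
}
```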


Origin blog.csdn.net/weixin_43912621/article/details/131430630