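After trying Rust, Java suddenly lost its appeal! A look at Rust's memory safety

Preface

As mentioned in the previous article, memory safety is too big a topic to squeeze in there, so it gets a chapter of its own. If you like this kind of content, feel free to follow along!

Memory safety

Rust's most important feature is that it guarantees memory safety without extra runtime cost. In traditional systems programming languages (C/C++), crashes and bugs caused by memory errors are a constant companion: null pointers, wild pointers, memory leaks, out-of-bounds access, segmentation faults, data races, iterator invalidation, and the list goes on. Memory problems are a major threat to program stability and security, and a major drag on development productivity. According to Google and Microsoft, roughly 70% of the security issues in their major products are caused by memory problems, and both companies have looked at using Rust to address memory safety. Rust made memory safety a primary goal from the very beginning of its design, and it uses a series of mechanisms to expose unsafe behavior at compile time. Below, based on my own limited understanding, is a brief tour of the mechanisms Rust uses to achieve memory safety.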

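1 Ownership rules

1) Every value or object in Rust has a variable that is called its owner.

For example:

let obj = String::from("hello");

obj is the variable that owns the String object.

2) A value or object has exactly one owner at a time.

3) When the owner goes out of scope, the value or object it owns is destroyed immediately.

4) Assignment, passing an argument to a function, and returning from a function all transfer ownership (for types that are not Copy), after which the original variable is no longer valid.

For example: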

fn main() {
    let s = String::from("hello");
    let s1 = s; // ownership moves from s to s1
    print!("{}", s); // s is no longer valid; this line fails to compile
}

fn test(s1: String) {
    print!("{}", s1);
}

fn main() {
    let s = String::from("hello");
    test(s); // passing the argument moves ownership into test
    print!("{}", s); // s is no longer valid here; compile error
}

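Rust's ownership rules guarantee that at any moment exactly one variable owns a given object, so the object is freed exactly once and can never be used after it has been moved. This single-owner model is also the foundation on which Rust rules out data races.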
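To make rules 3) and 4) concrete, here is a minimal sketch that records when values are destroyed. The `Tracked` type, its fields, and the `drop_order` function are hypothetical names invented for this illustration; the only library pieces used are the standard `Drop` trait, `Rc`, and `RefCell`.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical helper type: writes its name into a shared log when destroyed.
struct Tracked {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order in which events happened.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let a = Tracked { name: "a", log: Rc::clone(&log) };
        {
            let _b = Tracked { name: "b", log: Rc::clone(&log) };
            let _moved = a; // ownership of "a" moves into the inner scope
        } // inner scope ends: `_moved` ("a") is dropped, then `_b`
        log.borrow_mut().push("inner scope closed");
    } // nothing to drop here: "a" was already destroyed with its new owner
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("{:?}", drop_order());
}
```

The value is destroyed exactly once, at the end of the scope of whichever variable owns it at that point, which is why the old name `a` has nothing left to clean up.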

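2 Borrowing rules

You have probably spotted the problem already: what on earth, I passed s to the test function and now I can't use s anymore? What if I still need s afterwards? This is where Rust's borrowing comes in. When you own a value and want another variable or function to access it, you can "lend" it to that variable or function. At compile time Rust checks every borrow and makes sure it never lives longer than the value it borrows from.

For example, the following code compiles without any problem: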

fn test(s1: &String) {
    print!("{}", s1);
}

fn main() {
    let s = String::from("hello");
    test(&s); // only a reference is passed; s keeps ownership
    print!("{}", s); // s is still valid and can be used
}

fn main() {
    let s = String::from("hello");
    let s1 = &s; // s1 borrows s; s keeps ownership
    print!("{}", s); // s is still valid and can be used
    print!("{}", s1); // s1 refers to the same object as s
}
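As a further sketch of read-only borrowing, the hypothetical helper `count_vowels` below (not part of the original article) takes `&str`, which accepts a borrowed `String` as well, and the caller keeps using the value afterwards.

```rust
/// Borrows the text read-only and counts its ASCII vowels.
/// Hypothetical helper used only for this illustration.
fn count_vowels(text: &str) -> usize {
    text.chars().filter(|c| "aeiouAEIOU".contains(*c)).count()
}

fn main() {
    let s = String::from("hello");
    let n = count_vowels(&s); // lend s; ownership stays in main
    let r1 = &s;
    let r2 = &s; // any number of shared borrows may coexist
    println!("{} has {} vowels ({} / {})", s, n, r1, r2);
}
```

Taking `&str` rather than `&String` is a common choice because it works for string literals and for borrowed `String` values alike.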

What if we try to modify a borrowed value?

fn main() {
    let s = String::from("hello");

    change(&s);
}

fn change(some_string: &String) {
    some_string.push_str(", world");
}

Borrows are immutable by default, so the code above fails to compile:

error[E0596]: cannot borrow immutable borrowed content `*some_string` as mutable
 --> error.rs:8:5
  |
7 | fn change(some_string: &String) {
  |                        ------- use `&mut String` here to make mutable
8 |     some_string.push_str(", world");
  |     ^^^^^^^^^^^ cannot borrow as mutable

Following the compiler's hint, we turn the default borrow into a mutable borrow with the mut keyword, and the following code compiles:

fn main() {
    let mut s = String::from("hello");

    change(&mut s);
}

fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

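Mutable references come with one big restriction, though: within a given scope, a given piece of data may have only one mutable reference. The benefit is that Rust can prevent data races at compile time. Code like this fails:

let mut s = String::from("hello");
let r1 = &mut s;
let r2 = &mut s;

The error looks like this (the output below comes from an older compiler; current compilers, which use non-lexical lifetimes, report the conflict only if r1 is still used after r2 has been created):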

error[E0499]: cannot borrow `s` as mutable more than once at a time
 --> borrow_twice.rs:5:19
  |
4 |     let r1 = &mut s;
  |                   - first mutable borrow occurs here
5 |     let r2 = &mut s;
  |                   ^ second mutable borrow occurs here
6 | }
  | - first borrow ends here
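The restriction concerns overlapping mutable borrows, not their total number. A minimal sketch, using a hypothetical function `build_greeting`, shows two mutable borrows of the same String that never coexist:

```rust
/// Hypothetical example: two mutable borrows, one after the other.
fn build_greeting() -> String {
    let mut s = String::from("hello");
    {
        let r1 = &mut s;
        r1.push_str(", world");
    } // r1 goes out of scope here, which ends the first mutable borrow

    let r2 = &mut s; // a second mutable borrow is fine now
    r2.push('!');
    s
}

fn main() {
    println!("{}", build_greeting());
}
```

Because the first borrow has ended before the second begins, there is never a moment at which two writers can reach the same data.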

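In languages with pointers it is easy to create a dangling pointer by mistake: you free some memory but keep a pointer to it. A dangling pointer is one whose target memory may already have been freed or handed to someone else. In Rust, by contrast, the compiler guarantees that references never dangle: if we hold a reference to some data, the compiler makes sure the data does not go out of scope before the reference does.

Let's try to create a dangling reference, which Rust prevents with a compile-time error:

fn main() {
    let reference_to_nothing = dangle();
}

fn dangle() -> &String {
    let s = String::from("hello");

    &s
}

Here is the compile error: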

error[E0106]: missing lifetime specifier
 --> dangle.rs:5:16
  |
5 | fn dangle() -> &String {
  |                ^ expected lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but there is
  no value for it to be borrowed from
  = help: consider giving it a 'static lifetime
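One straightforward way out, sketched here under the assumption that the caller simply needs the text, is to return the String itself so that ownership moves to the caller. The name `no_dangle` is chosen for this illustration.

```rust
/// Returns the String by value instead of a reference to a local.
fn no_dangle() -> String {
    let s = String::from("hello");
    s // ownership moves out to the caller, so nothing is freed here
}

fn main() {
    let owned = no_dangle();
    println!("{}", owned);
}
```

Nothing is left dangling because the value is never destroyed inside the function; it simply gets a new owner.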

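Let's briefly recap what we have covered about references. The following three rules are checked at compile time; violate any of them and the compiler reports an error with a hint.

1) At any given time, you can have exactly one of the following:

One mutable reference.
Any number of immutable references.

2) References must always be valid.

3) A reference must never outlive the value it refers to.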
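Rule 1) is also what prevents the iterator and reference invalidation mentioned at the start of the article. A minimal sketch with a hypothetical helper `first_then_push`: holding a shared reference into a vector while pushing to it is rejected, so the element is copied out first.

```rust
/// Hypothetical helper: reads the first element, then appends `extra`.
/// Panics if `v` is empty, like any out-of-range index.
fn first_then_push(v: &mut Vec<i32>, extra: i32) -> i32 {
    // Copy the element out instead of holding `&v[0]` across the push,
    // because the push may reallocate the vector's storage.
    let first = v[0];
    v.push(extra);
    first
}

fn main() {
    let mut v = vec![1, 2, 3];

    // The commented-out version is rejected by the borrow checker:
    // let first = &v[0];       // shared borrow of v
    // v.push(4);               // mutable borrow while `first` is still used
    // println!("{}", first);

    let first = first_then_push(&mut v, 4);
    println!("first = {}, v = {:?}", first, v);
}
```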

3 Lifetime rules

The main goal of lifetime checking is to prevent dangling references. Consider the program below, which has an outer scope and an inner scope. The outer scope declares a variable r with no initial value, and the inner scope declares a variable x with the initial value 5. Inside the inner scope we try to set r to a reference to x. Then, after the inner scope has ended, we try to print the value of r:

{
    let r;

    {
        let x = 5;
        r = &x;
    }

    println!("r: {}", r);
}

Compiling this code produces an error:

error: `x` does not live long enough
   |
6  |         r = &x;
   |              - borrow occurs here
7  |     }
   |     ^ `x` dropped here while still borrowed
...
10 | }
   | - borrowed value needs to live until here

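The error says that the variable x does not "live long enough". So how does Rust work that out?

This part of the compiler is called the borrow checker. It compares scopes to make sure that every borrow is valid. Below, the lifetimes of r and x are annotated as 'a and 'b respectively: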

{
    let r;                // -------+-- 'a
                          //        |
    {                     //        |
        let x = 5;        // -+-----+-- 'b
        r = &x;           //  |     |
    }                     // -+     |
                          //        |
    println!("r: {}", r); //        |
}                         // -------+

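We have labeled the lifetime of r as 'a and the lifetime of x as 'b. As you can see, the inner 'b block is much smaller than the outer lifetime 'a. At compile time Rust compares the two lifetimes and sees that r has lifetime 'a but refers to an object with lifetime 'b. The program is rejected because 'b is shorter than 'a: the referenced object does not live as long as the reference to it.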
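The usual fix is to make the data live at least as long as the reference. A minimal sketch, wrapped in a hypothetical function `read_through_reference` so it can be called and checked:

```rust
/// Same idea as the rejected example, but x is declared in the same scope
/// as r, so it lives at least as long as the reference does.
fn read_through_reference() -> i32 {
    let x = 5;
    let r = &x; // the borrow is valid for as long as r is used
    *r
}

fn main() {
    println!("r: {}", read_through_reference());
}
```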

On top of this, Rust has a fairly involved set of lifetime annotation rules that let it detect possible dangling references at compile time; see [5] for details.
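As a small taste of what such an annotation looks like, here is a sketch of a function that returns one of its two borrowed arguments. The function name `longest` and the tie-breaking choice are assumptions made for this example; the `'a` annotation tells the compiler that the result is only valid while both inputs are.

```rust
/// Returns the longer of two string slices (the first one on a tie).
/// The lifetime 'a ties the result to both inputs.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

fn main() {
    let a = String::from("abcd");
    let b = String::from("xyz");
    println!("longest: {}", longest(&a, &b));
}
```

Without the annotation the compiler cannot tell which input the returned reference borrows from, so it refuses to guess and asks for an explicit lifetime.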

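4 Thread-safety guarantees

Memory corruption is very often caused by a data race, which happens when these three conditions hold together:

Two or more pointers access the same data at the same time.
At least one of those pointers is used to write to the data.
No mechanism is used to synchronize access to the data.

So how does Rust avoid data races in a multithreaded environment?

Let's start with a simple example and try to use, in another thread, a vector created by the main thread: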

use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    let handle = thread::spawn(|| {
        println!("Here's a vector: {:?}", v);
    });

    handle.join().unwrap();
}

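The closure uses v, so it captures v and makes it part of the closure's environment. Because thread::spawn runs the closure in a new thread, we would expect to be able to access v from that thread. However, compiling this example gives the following error: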

error[E0373]: closure may outlive the current function, but it borrows `v`,
which is owned by the current function
 --> src/main.rs:6:32
  |
6 |     let handle = thread::spawn(|| {
  |                                ^^ may outlive borrowed value `v`
7 |         println!("Here's a vector: {:?}", v);
  |                                           - `v` is borrowed here
  |
help: to force the closure to take ownership of `v` (and any other referenced
variables), use the `move` keyword
  |
6 |     let handle = thread::spawn(move || {
  |                                ^^^^^^^

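Rust "infers" how to capture v, and because println! only needs a reference to v, the closure tries to borrow v. But there is a problem: Rust cannot know how long the new thread will run, so it cannot know whether the reference to v will stay valid. Hence the compiler's message:

closure may outlive the current function, but it borrows `v`

The following code shows a scenario in which the reference to v would very likely no longer be valid: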

use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    let handle = thread::spawn(|| {
        println!("Here's a vector: {:?}", v);
    });

    drop(v); // explicitly destroy v

    handle.join().unwrap();
}

To fix the compile error above, we can follow the compiler's advice:

help: to force the closure to take ownership of `v` (and any other referenced
variables), use the `move` keyword
  |
6 |     let handle = thread::spawn(move || {

Here is the correct version:

use std::thread;

fn main() {
    let v = vec![1, 2, 3];

    // `move` forces the closure to take ownership of the values it uses,
    // so v can be used safely inside the new thread
    let handle = thread::spawn(move || {
        println!("Here's a vector: {:?}", v);
    });

    handle.join().unwrap();
}

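This simple example shows that when values are passed between threads, the compiler strictly checks their lifetimes, which guarantees that the values stay valid and rules out potential data races.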
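Moving a value into a thread does not mean it is gone for good. A minimal sketch, assuming the spawning code just needs the data back when the thread finishes: the closure can return the value, and `JoinHandle::join` hands the closure's return value to the caller. The function name `sum_in_thread` is hypothetical.

```rust
use std::thread;

/// Moves v into a worker thread, sums it there, and gets both the sum
/// and the vector itself back through join().
fn sum_in_thread(v: Vec<i32>) -> (i32, Vec<i32>) {
    let handle = thread::spawn(move || {
        let total: i32 = v.iter().sum();
        (total, v) // return ownership of v together with the result
    });
    handle.join().unwrap()
}

fn main() {
    let (total, v) = sum_in_thread(vec![1, 2, 3]);
    println!("sum of {:?} is {}", v, total);
}
```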

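Did you notice? Although the example above compiles, there is a catch: ownership of v has moved into the child thread, so main can no longer access v. How can main get hold of the data again? In C++ or Go you would have plenty of options, such as global variables, pointers, or references. Rust gives you far fewer: for safety reasons a mutable global (static mut) can only be touched inside unsafe code, and, as we just saw, thread::spawn will not accept a closure that merely borrows a local variable. One of Rust's main tools for message-passing concurrency is the channel, an idea familiar from Go's channels and used in much the same way.

Example: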

use std::thread;
use std::sync::mpsc;

fn main() {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        let val = String::from("hi");
        tx.send(val).unwrap();
    });

    let received = rx.recv().unwrap();
    println!("Got: {}", received);
}

In the example above, main receives the object val created in the child thread through the channel.
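The "mp" in mpsc stands for multiple producers: the sending half can be cloned so that several threads feed a single receiver. The sketch below uses a hypothetical function `collect_from_workers` in which each worker sends its id times ten; the results are sorted because the arrival order depends on scheduling.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns `workers` threads that each send one number, then gathers them.
fn collect_from_workers(workers: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();

    for id in 0..workers {
        let tx = tx.clone(); // each thread gets its own sender
        thread::spawn(move || {
            tx.send(id * 10).unwrap();
        });
    }
    // Drop the original sender so the receiver's iterator ends
    // once every worker has finished and dropped its clone.
    drop(tx);

    let mut received: Vec<u32> = rx.iter().collect();
    received.sort();
    received
}

fn main() {
    println!("{:?}", collect_from_workers(3));
}
```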

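Note that after tx.send(val).unwrap(); the ownership of val has moved. The child thread can no longer do anything with val, otherwise compilation fails, as in the following code: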

use std::thread;
use std::sync::mpsc;

fn main() {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        let val = String::from("hi");
        tx.send(val).unwrap();
        println!("val is {}", val); // compile error occurs here
    });

    let received = rx.recv().unwrap();
    println!("Got: {}", received);
}

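Here we try to print val after sending it into the channel with tx.send. Allowing this would be a bad idea: once the value has been sent to another thread, that thread could modify or drop it before we use it again, which would lead to errors or unexpected results caused by inconsistent or missing data. For the code above the compiler reports: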

error[E0382]: use of moved value: `val`
  --> src/main.rs:10:31
   |
9  |         tx.send(val).unwrap();
   |                 --- value moved here
10 |         println!("val is {}", val);
   |                               ^^^ value used here after move
   |
   = note: move occurs because `val` has type `std::string::String`, which does
   not implement the `Copy` trait
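If the sending thread really does need the value afterwards, one option (a sketch, assuming an extra copy is acceptable) is to send a clone and keep the original. The function name `send_with_copy` is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends a copy of the string through the channel and keeps the original
/// in the worker thread, which returns it through join().
fn send_with_copy() -> (String, String) {
    let (tx, rx) = mpsc::channel();

    let handle = thread::spawn(move || {
        let val = String::from("hi");
        tx.send(val.clone()).unwrap(); // the clone is moved into the channel
        val // the original is still owned here and can be used or returned
    });

    let received = rx.recv().unwrap();
    let kept = handle.join().unwrap();
    (received, kept)
}

fn main() {
    let (received, kept) = send_with_copy();
    println!("Got: {}, sender kept: {}", received, kept);
}
```

Each thread then owns its own independent String, so neither can affect the other's data.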

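Channels let threads hand data to one another, but one problem remains: once a value or object has been sent through a channel, the sender can no longer use it. Now consider this requirement: pass a counter to 10 threads, have each thread add 1 to it, and finally print the total in main. In C++, Go, or most other languages this is trivial: one global variable, one lock, a few lines of code. In Rust it is not quite that simple, and as a beginner you may go through the following "ordeal".

The first version comes naturally: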

use std::sync::Mutex;
use std::thread;

fn main() {
    let counter = Mutex::new(0);
    let mut handles = vec![];

    for _ in 0..10 {
        let handle = thread::spawn(move || {
            let mut num = counter.lock().unwrap();

            *num += 1;
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Result: {}", *counter.lock().unwrap());
}

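We have threads, we have a Mutex to make sure each increment happens under mutual exclusion, and the code looks fine. But the compiler mercilessly rejects it: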

error[E0382]: capture of moved value: `counter`
  --> src/main.rs:10:27
   |
9  |         let handle = thread::spawn(move || {
   |                                    ------- value moved (into closure) here
10 |             let mut num = counter.lock().unwrap();
   |                           ^^^^^^^ value captured here after move
   |
   = note: move occurs because `counter` has type `std::sync::Mutex<i32>`,
   which does not implement the `Copy` trait

error[E0382]: use of moved value: `counter`
  --> src/main.rs:21:29
   |
9  |         let handle = thread::spawn(move || {
   |                                    ------- value moved (into closure) here
...
21 |     println!("Result: {}", *counter.lock().unwrap());
   |                             ^^^^^^^ value used here after move
   |
   = note: move occurs because `counter` has type `std::sync::Mutex<i32>`,
   which does not implement the `Copy` trait

error: aborting due to 2 previous errors

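The error says that ownership of counter was moved and that we then tried to use it again. By the ownership rules, a value cannot be accessed after it has been moved. But why does this happen here?

Let's simplify the program to analyze it. Instead of creating 10 threads in a for loop, we create just two threads and see what happens. Replace the first for loop in the example with the following code: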

let handle = thread::spawn(move || {
    let mut num = counter.lock().unwrap();

    *num += 1;
});
handles.push(handle);

let handle2 = thread::spawn(move || {
    let mut num2 = counter.lock().unwrap();

    *num2 += 1;
});
handles.push(handle2);

Here we create two threads and rename the variables used for the second thread to handle2 and num2. Compiling gives the following errors:

error[E0382]: capture of moved value: `counter`
  --> src/main.rs:16:24
   |
8  |     let handle = thread::spawn(move || {
   |                                ------- value moved (into closure) here
...
16 |         let mut num2 = counter.lock().unwrap();
   |                        ^^^^^^^ value captured here after move
   |
   = note: move occurs because `counter` has type `std::sync::Mutex<i32>`,
   which does not implement the `Copy` trait

error[E0382]: use of moved value: `counter`
  --> src/main.rs:26:29
   |
8  |     let handle = thread::spawn(move || {
   |                                ------- value moved (into closure) here
...
26 |     println!("Result: {}", *counter.lock().unwrap());
   |                             ^^^^^^^ value used here after move
   |
   = note: move occurs because `counter` has type `std::sync::Mutex<i32>`,
   which does not implement the `Copy` trait

error: aborting due to 2 previous errors

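Aha! The first error message says that ownership of counter was moved into the closure of the thread represented by handle. That is why we cannot capture counter again in the second thread: Rust is telling us that we cannot move ownership of counter into more than one thread. So the cause is clear. We create several threads in the loop, and once the first thread has taken ownership of counter, the later threads can never get it. How can several threads own one object at the same time, indirectly (and note, only indirectly)? Oh, right: reference counting!

We can create a reference-counted value with the smart pointer Rc<T>, so let's try using Rc<T> to let multiple threads own the Mutex<T>. That gives the second version: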

use std::rc::Rc;
use std::sync::Mutex;
use std::thread;

fn main() {
    let counter = Rc::new(Mutex::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Rc::clone(&counter);
        let handle = thread::spawn(move || {
            let mut num = counter.lock().unwrap();

            *num += 1;
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Result: {}", *counter.lock().unwrap());
}

We compile again and... get a different error! The compiler really is teaching us a lot!

error[E0277]: the trait bound `std::rc::Rc<std::sync::Mutex<i32>>:
std::marker::Send` is not satisfied in `[closure@src/main.rs:11:36: 15:10
counter:std::rc::Rc<std::sync::Mutex<i32>>]`
  --> src/main.rs:11:22
   |
11 |         let handle = thread::spawn(move || {
   |                      ^^^^^^^^^^^^^ `std::rc::Rc<std::sync::Mutex<i32>>`
cannot be sent between threads safely
   |
   = help: within `[closure@src/main.rs:11:36: 15:10
counter:std::rc::Rc<std::sync::Mutex<i32>>]`, the trait `std::marker::Send` is
not implemented for `std::rc::Rc<std::sync::Mutex<i32>>`
   = note: required because it appears within the type
`[closure@src/main.rs:11:36: 15:10
counter:std::rc::Rc<std::sync::Mutex<i32>>]`
   = note: required by `std::thread::spawn`

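The key sentence in this error message is:

`std::rc::Rc<std::sync::Mutex<i32>>` cannot be sent between threads safely

Unfortunately, Rc<T> cannot be shared safely across threads. When Rc<T> manages the reference count, it has to increase the count on every call to clone and decrease it whenever a clone is dropped. Rc<T> does not use any concurrency primitives to make sure that changes to the count cannot be interrupted by another thread. A wrong count can lead to subtle bugs, such as a memory leak or a value being dropped before we are done with it. What we need is a type exactly like Rc<T> that changes the reference count in a thread-safe way. Fortunately, Arc<T> is just such a type: it behaves like Rc<T> but is safe to use in concurrent situations. The "a" stands for atomic, so this is an atomically reference counted type.

So we rewrite the code as the third version: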

use std::sync::{Mutex, Arc};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Arc::clone(&counter);
        let handle = thread::spawn(move || {
            let mut num = counter.lock().unwrap();

            *num += 1;
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Result: {}", *counter.lock().unwrap());
}

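This time the code compiles and prints the correct result. Under the strict, step-by-step guidance and "earnest teaching" of the compiler, we have finally written correct code.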
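For completeness, here is a sketch of an alternative for the same counter that needs no Arc. It assumes a toolchain that provides `std::thread::scope` (stable since Rust 1.63). Scoped threads are guaranteed to finish before the scope returns, so they are allowed to borrow the Mutex directly. The function name `count_with_scoped_threads` is hypothetical.

```rust
use std::sync::Mutex;
use std::thread;

/// Increments a shared counter from `threads` scoped threads.
fn count_with_scoped_threads(threads: usize) -> usize {
    let counter = Mutex::new(0usize);

    thread::scope(|s| {
        for _ in 0..threads {
            // The closure only borrows `counter`; no move and no Arc needed.
            s.spawn(|| {
                *counter.lock().unwrap() += 1;
            });
        }
    }); // every scoped thread has been joined by the time scope() returns

    counter.into_inner().unwrap()
}

fn main() {
    println!("Result: {}", count_with_scoped_threads(10));
}
```

The compiler still enforces the same rules; the scope simply gives it the proof that the borrowed counter outlives all the threads.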

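The Rust compiler applies extremely strict checks and restrictions to sharing data between threads and passing data between threads, the areas where memory-safety accidents happen most often, so that potential problems are found at compile time. In safe Rust the standard toolbox is deliberately small: to pass data to another thread you move it, typically through a channel, and to share mutable data between threads you wrap it in Arc plus a lock such as Mutex (if all you share is a simple type such as an integer or a bool, you can also use atomic types). Rust gives up a good deal of flexibility here, but that is also its strength: the compiler keeps pushing you to write correct code, which greatly reduces the cost of maintaining the program.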
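The atomic option mentioned above can be sketched as follows, using the same ten-thread counter. The function name `count_with_atomic` is hypothetical; `AtomicUsize`, `fetch_add`, and `Ordering::SeqCst` come from `std::sync::atomic`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Increments a shared atomic counter from `threads` threads, without a lock.
fn count_with_atomic(threads: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // fetch_add is a single indivisible read-modify-write
                counter.fetch_add(1, Ordering::SeqCst);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    counter.load(Ordering::SeqCst)
}

fn main() {
    println!("Result: {}", count_with_atomic(10));
}
```

Sharing across threads still goes through Arc; only the Mutex is replaced by a type whose operations are themselves atomic.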

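That is my understanding of how Rust guarantees memory safety. Rust uses some features that look odd at first glance to draw a very clear safety boundary and then checks it thoroughly, so that your code stays inside it. Rust achieves memory safety without garbage collection and concurrency safety without data races. The flip side is that a newcomer to Rust spends most of their time fixing compile errors. A novice C++ programmer may write a lot of unsafe code early on and leave plenty of traps behind, but a novice Rust programmer will not, because the unsafe code they write is stopped at compile time and never gets the chance to become a trap. Rust promises that a program that compiles has no memory-safety problems (note: if the safety checks are bypassed with the unsafe keyword, memory-safety problems are still possible).

To finish, one soul-searching question:

Memory safety without garbage collection, concurrency safety without data races, low resource usage with strong performance, high development productivity, and good cross-platform support: doesn't that make Rust appealing? Care to give it a hug?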