[Learn Rust Together: Project Practice] Command-line I/O project minigrep: refactoring modules and error handling


Foreword

After the first two sections, our minigrep can open the specified text file and read its contents.

As we add more features to the program, some structural problems will start to appear. For example, we have been using expect to print an error message, but that message does not tell us why the error occurred. A read can fail for many reasons, such as the file not existing or the user lacking permission to read it. We therefore need to refactor the project to improve its module structure and its error handling.


1. Task purpose

At present the project has four problems that will get in the way of later work.

  1. main currently parses the arguments and opens the file. That is fine for a small function, but as the program grows, main will become more complicated and harder to read, debug, and modify. It should be split into several functions, each responsible for one task.
  2. query and filename are configuration variables, while contents is used to carry out the program's logic. As main grows, more variables will come into scope and it will become harder to tell what each one is for. The configuration variables should therefore be grouped into a structure that makes their purpose clear.
  3. If the file cannot be read, the program always prints Something went wrong reading the file. The read can fail for many reasons, such as the file not existing or the user lacking permission, so we should report as much detail about the error as we can.
  4. We know the program takes two arguments, but a user who does not know this and passes fewer will only see Rust's own index-out-of-bounds panic, so the program is not robust enough. The error-handling code should be gathered in one place and its messages improved.

To do this, we need to refactor our project.
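As a rough illustration of the third problem, std::io::Error has a kind() method that says why an operation failed. The sketch below is not part of minigrep: the helper name describe_io_error, the message wording, and the file name are assumptions made for this example.

```rust
use std::fs;
use std::io::{self, ErrorKind};

/// Hypothetical helper: turn an I/O error into a more specific message.
fn describe_io_error(err: &io::Error) -> String {
    match err.kind() {
        ErrorKind::NotFound => String::from("file does not exist"),
        ErrorKind::PermissionDenied => String::from("no permission to read the file"),
        // Fall back to the error's own Display text for every other case.
        _ => format!("could not read the file: {}", err),
    }
}

fn main() {
    // Assumed path, chosen only because it is unlikely to exist.
    match fs::read_to_string("no_such_file_for_minigrep_demo.txt") {
        Ok(contents) => println!("{}", contents),
        Err(err) => println!("{}", describe_io_error(&err)),
    }
}
```

The refactoring in this section takes a simpler route and prints the error's own text, which already names the cause in most cases.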

2. Project split

The Rust community follows a common set of guidelines for splitting up a growing binary project:

  1. Split the program into main.rs and lib.rs, and put the program's logic in lib.rs.
  2. While the command-line parsing logic is small, it can stay in main.rs.
  3. When the command-line parsing logic starts to get complicated, extract it from main.rs into lib.rs as well.

After these steps, the responsibilities left to main should be:

  • Call the command-line parsing logic with the argument values
  • Set up any other configuration
  • Call the run function in lib.rs
  • Handle the error if run returns one

The point of this split is that main.rs is responsible only for running the program, while lib.rs holds all of the program's logic.

3. Refactor the project

Next, we will follow the above principles to split the project.

Extract the argument parser

Create a new function, parse_config, whose only job is to pick the query and filename out of the arguments and return them.

fn parse_config(args: &[String]) -> (&str, &str) {
    let query = &args[1];
    let filename = &args[2];

    (query, filename)
}

Next, we call parse_config in main to obtain the values the program needs. The two commented-out lines below are how query and filename were obtained before; they are replaced by let (query, filename) = parse_config(&args);.

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("{:#?}", args);
    let (query, filename) = parse_config(&args);
    // let query = &args[1];
    // let filename = &args[2];
    // other code
}

This may look like overkill for such a small program, but it is not: with each step isolated in its own function, it becomes much easier to locate a problem.

Group the configuration values

Next, we continue to improve parse_config. The function currently returns a tuple, which the caller immediately splits back into separate variables. To make the relationship between the two values explicit and the code easier to maintain, we group them in a named structure.

Create a new struct, Config, whose fields are our two parameters:

struct Config {
    query: String,
    filename: String,
}

Then modify the parse_config function:

fn parse_config(args: &[String]) -> Config {
    let query = args[1].clone();
    let filename = args[2].clone();

    Config { query, filename }
}

Here we use the clone method to make a full copy of each argument so that the Config instance owns its data. This keeps the code straightforward because there are no reference lifetimes to manage, but it costs more time and memory than storing references to the string data. In this case, giving up a small amount of performance for simplicity is a worthwhile trade-off.
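For comparison, here is a minimal sketch of the borrowing alternative that the clone-based version avoids. BorrowedConfig and parse_borrowed are hypothetical names used only for this illustration; the article's own code keeps the owned String fields.

```rust
/// Hypothetical alternative to the article's Config: it borrows from the
/// argument vector instead of cloning, so it needs a lifetime parameter.
struct BorrowedConfig<'a> {
    query: &'a str,
    filename: &'a str,
}

fn parse_borrowed(args: &[String]) -> BorrowedConfig<'_> {
    BorrowedConfig {
        query: &args[1],
        filename: &args[2],
    }
}

fn main() {
    let args = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];
    // `config` cannot outlive `args`, because it only holds references into it.
    let config = parse_borrowed(&args);
    println!("{} {}", config.query, config.filename);
}
```

This version copies nothing, but every function that stores or returns the config now has to carry the lifetime as well, which is the bookkeeping the clone calls spare us.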

Next, we change Config once more. Many types in the standard library are created with an associated function named new, such as String::new. To follow that convention we write a constructor for Config: first rename the parse_config function to new, then move it into an impl block.

impl Config {
    fn new(args: &[String]) -> Config {
        let query = args[1].clone();
        let filename = args[2].clone();

        Config { query, filename }
    }
}

Then update the call in main. From now on the values are accessed through the config variable with dot syntax, for example config.query and config.filename.

    // other code
    let config = Config::new(&args);
    // let query = &args[1];
    // let filename = &args[2];
    // other code

Improve error handling

When the program is run with fewer than two arguments, it panics. With no arguments at all, the message is:

index out of bounds: the len is 1 but the index is 1

This message is aimed at the programmer and means little to a user. We therefore check the number of arguments before reading them and report a message that states plainly what went wrong.
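For reference, the panic comes from the indexing expression args[1] itself. The slice method get returns an Option instead of panicking, as the sketch below shows; arg_at is a hypothetical helper written for this illustration, and the article itself uses a length check.

```rust
/// Hypothetical helper: fetch the argument at `index` without panicking.
fn arg_at(args: &[String], index: usize) -> Option<&str> {
    // `get` returns None when the index is out of bounds,
    // where `args[index]` would panic.
    args.get(index).map(|s| s.as_str())
}

fn main() {
    // Simulates a run with no arguments: only the program name is present.
    let args = vec![String::from("minigrep")];
    match arg_at(&args, 1) {
        Some(query) => println!("query: {}", query),
        None => println!("no query was given"),
    }
}
```

Either approach avoids the out-of-bounds panic; a single length check up front is simply shorter when several arguments are required.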

The check goes in Config's new function:

    // other code
    fn new(args: &[String]) -> Config {
        if args.len() < 3 {
            panic!("not enough arguments");
        }
        // other code

Calling panic! here makes the program exit immediately. The message is clearer than before, but it is still not ideal: panic! also prints debugging details such as the source location, which is not friendly for users, and a panic is better suited to programming errors than to usage errors. We therefore return a Result instead and change the impl block as follows:

impl Config {
    fn new(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let filename = args[2].clone();

        Ok(Config { query, filename })
    }
}

Now modify the main function:

use std::env;
use std::process;

fn main() {
    let args: Vec<String> = env::args().collect();
    let config = Config::new(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {}", err);
        process::exit(1);
    });

    // other code

If we now run the program with too few arguments, the error message is specific and free of debugging output.

unwrap_or_else is defined on Result<T, E> in the standard library and lets us write custom error handling that does not call panic!. When the Result is Ok, it behaves like unwrap and returns the value wrapped inside Ok. When the Result is Err, it calls the closure, an anonymous function that we define and pass to unwrap_or_else as an argument, and hands it the value wrapped inside Err.
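Both branches of unwrap_or_else can be seen in a small sketch. The helper require_query is hypothetical and only mimics the shape of Config::new; the closure here returns a placeholder string instead of exiting so that the Err branch stays observable.

```rust
/// Hypothetical helper in the spirit of Config::new:
/// returns the query or an error message.
fn require_query(args: &[String]) -> Result<String, &'static str> {
    if args.len() < 2 {
        return Err("not enough arguments");
    }
    Ok(args[1].clone())
}

fn main() {
    let with_query = vec![String::from("minigrep"), String::from("needle")];
    let without_query = vec![String::from("minigrep")];

    // Ok case: the closure is never run and the wrapped value is returned.
    let found = require_query(&with_query).unwrap_or_else(|err| format!("<{}>", err));
    println!("{}", found);

    // Err case: the closure receives the error value and produces the result instead.
    let fallback = require_query(&without_query).unwrap_or_else(|err| format!("<{}>", err));
    println!("{}", fallback);
}
```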

To end the program we use the process module from the standard library: we import std::process and call process::exit. This stops the program immediately and uses the number passed to it as the exit status code.

Extract the file-reading logic

Next, we move the code that reads the file into a function named run, which takes the Config instance as its parameter:

fn run(config: Config) {
    let contents = fs::read_to_string(config.filename)
        .expect("Failed to read the file");

    println!("File contents:\n{}", contents);
}

To make the error message friendlier, we go on to change run so that it returns a Result.

We make three significant changes here. First, the return type of run becomes Result<(), Box<dyn Error>>. The function previously returned the unit type (), and that is still the value it returns in the Ok case.

For the error type we use the trait object Box<dyn Error>, which will be explained later. For now, all you need to know is that Box<dyn Error> means the function returns some type that implements the Error trait, without having to specify which concrete type that is.

The second change is to remove the call to expect and replace it with the ? operator. Rather than calling panic! on an error, ? returns the error value from the current function so that the caller can handle it.

The third change is that the function now returns Ok(()) on success.

// Needed for the Box<dyn Error> return type.
use std::error::Error;

fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.filename)?;
    println!("File contents:\n{}", contents);
    Ok(())
}
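The benefit of Box<dyn Error> is easiest to see when a function can fail in more than one way. The sketch below is an assumption-laden example rather than minigrep code: parse_count is a hypothetical function in which two unrelated error types pass through ? into the same return type.

```rust
use std::error::Error;

/// Hypothetical function: two different error types flow through `?`
/// into the same Box<dyn Error> return type.
fn parse_count(bytes: &[u8]) -> Result<usize, Box<dyn Error>> {
    // Fails with std::str::Utf8Error if the bytes are not valid UTF-8.
    let text = std::str::from_utf8(bytes)?;
    // Fails with std::num::ParseIntError if the text is not a number.
    let count = text.trim().parse::<usize>()?;
    Ok(count)
}

fn main() {
    for input in [&b"42\n"[..], &b"abc"[..], &[0xff, 0xfe][..]] {
        match parse_count(input) {
            Ok(n) => println!("parsed {}", n),
            Err(e) => println!("error: {}", e),
        }
    }
}
```

In each case ? converts the concrete error into a Box<dyn Error>, so the signature does not have to name either error type.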

Then handle this error in main. run returns no meaningful value on success, so only the error case matters here and if let is enough:

fn main() {
    // other code
    if let Err(e) = run(config) {
        println!("Application error: {}", e);
        process::exit(1);
    }
}
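The same pattern can be tried in isolation. In this sketch run is only a stand-in with a hypothetical fail switch, and exit_status is a hypothetical function that returns the status code instead of calling process::exit, so that both paths can be observed in one run.

```rust
/// Stand-in for the article's run function; `fail` is a hypothetical switch
/// used here to trigger the error path on demand.
fn run(fail: bool) -> Result<(), String> {
    if fail {
        Err(String::from("could not read the file"))
    } else {
        Ok(())
    }
}

/// Returns the exit status main would use, rather than exiting.
fn exit_status(result: Result<(), String>) -> i32 {
    // Ok(()) carries nothing worth unwrapping, so only the Err case is matched.
    if let Err(e) = result {
        println!("Application error: {}", e);
        return 1;
    }
    0
}

fn main() {
    println!("{}", exit_status(run(false)));
    println!("{}", exit_status(run(true)));
}
```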

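Split the code into a library crate

Create a new file, src/lib.rs, and move Config and run from main.rs into it.

Note: the functions and structs in lib.rs that main.rs uses must be marked with the pub keyword. Struct fields are private by default as well; they can stay private here because only run, which lives in the same file, reads them.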
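The effect of pub can be tried in a single file. In the sketch below an inline module named minigrep_demo stands in for src/lib.rs, and the constructor signature and the query accessor are assumptions made for this example rather than part of the article's code.

```rust
// An inline module stands in for src/lib.rs so the sketch fits in one file.
mod minigrep_demo {
    pub struct Config {
        // Fields stay private unless they are marked `pub` as well.
        query: String,
        pub filename: String,
    }

    impl Config {
        pub fn new(query: &str, filename: &str) -> Config {
            Config {
                query: query.to_string(),
                filename: filename.to_string(),
            }
        }

        // A public accessor is one way to expose a private field read-only.
        pub fn query(&self) -> &str {
            &self.query
        }
    }
}

use minigrep_demo::Config;

fn main() {
    let config = Config::new("needle", "haystack.txt");
    println!("{}", config.filename); // allowed: the field is `pub`
    println!("{}", config.query()); // `config.query` would not compile here
}
```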

use std::error::Error;
use std::fs;

pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.filename)?;
    println!("File contents:\n{}", contents);
    Ok(())
}

pub struct Config {
    query: String,
    filename: String,
}

impl Config {
    pub fn new(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }
        let query = args[1].clone();
        let filename = args[2].clone();

        Ok(Config { query, filename })
    }
}

Then update main.rs. The main change is adding use minigrep::Config, which brings Config into scope so that it can be used directly; run is called through its full path, minigrep::run. This assumes the package is named minigrep in Cargo.toml.

use std::{env, process};

use minigrep::Config;

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("{:#?}", args);
    let config = Config::new(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {}", err);
        process::exit(1);
    });
    if let Err(e) = minigrep::run(config) {
        println!("Application error: {}", e);
        process::exit(1);
    }
}

Summary

In this section you learned how to separate a program's responsibilities, how to report errors gracefully, and how to split a project into a binary crate and a library crate. This section involves a lot of work, but it pays off in later development and lays the foundation for the features still to come.

Full code

main.rs

use std::{env, process};

use minigrep::Config;

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("{:#?}", args);
    let config = Config::new(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {}", err);
        process::exit(1);
    });
    if let Err(e) = minigrep::run(config) {
        println!("Application error: {}", e);
        process::exit(1);
    }
}

lib.rs

use std::error::Error;
use std::fs;

pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.filename)?;
    println!("File contents:\n{}", contents);
    Ok(())
}

pub struct Config {
    query: String,
    filename: String,
}

impl Config {
    pub fn new(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }
        let query = args[1].clone();
        let filename = args[2].clone();

        Ok(Config { query, filename })
    }
}

Origin blog.csdn.net/weixin_47754149/article/details/125730175