Jianzhi Offer (C++) - JZ67: Convert a string to an integer (atoi) (algorithm: simulation)

Author: Zhai Tianbao Steven
Copyright Statement: The copyright belongs to the author. For commercial reprinting, please contact the author for authorization. For non-commercial reprinting, please indicate the source.

Problem description:

Write a function StrToInt to convert a string into an integer. You cannot use atoi or other similar library functions. The incoming string may consist of the following parts:

1. Several spaces

2. (Optional) A sign character ('+' or '-')

3. A string expression composed of digits, letters, symbols, and spaces

4. Several spaces

The conversion algorithm is as follows:
1. Remove useless leading spaces
2. When the first non-space character is a + or - sign, it is used as the sign of the integer. If there is no sign, the integer defaults to positive
3. Determine the valid part of the integer:
3.1 After determining the sign, combine it with as many consecutive digits as possible to form a valid integer. If there is no valid integer part, return 0 directly.
3.2 After the integer part at the front of the string is taken out, there may be extra characters after it (letters, symbols, spaces, etc.). These characters can be ignored; they should not affect the function.
3.3 If the integer exceeds the 32-bit signed integer range [−2^31, 2^31 − 1], it must be truncated to stay within this range. Specifically, integers less than −2^31 are adjusted to −2^31, and integers greater than 2^31 − 1 are adjusted to 2^31 − 1.
4. Remove useless trailing spaces
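
Rule 3.3 amounts to saturating a wider value into the 32-bit range. A minimal sketch, assuming the value has already been parsed into a 64-bit integer; the helper name clampToInt32 is hypothetical and not part of the original problem:

```cpp
#include <climits>

// Hypothetical helper: saturate a 64-bit value to the 32-bit signed range.
// Values above INT_MAX become INT_MAX, values below INT_MIN become INT_MIN.
int clampToInt32(long long value) {
    if (value > INT_MAX) return INT_MAX;
    if (value < INT_MIN) return INT_MIN;
    return static_cast<int>(value);
}
```

For instance, clampToInt32(2147483648LL) yields INT_MAX and clampToInt32(-2147483649LL) yields INT_MIN, while in-range values pass through unchanged.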

Data range:

1. 0 <= string length <= 100

2. The string consists of English letters (uppercase and lowercase), numbers (0-9), ' ', '+', '-' and '.'

Example:

Input:

"4396 clearlove"

Return value:

4396

Explanation:

The characters after 6 do not belong to the valid integer part and are removed, but the valid part extracted earlier is returned.

Problem-solving ideas:

This problem tests the simulation of a procedure. There are two ways to solve it.

1) Traversal method

       First, skip the leading spaces; then check for a plus or minus sign; then read the consecutive digits, checking against the positive and negative limits along the way. Each time a new digit is found, multiply the accumulated value by 10 and add the digit; the answer is ready once the traversal ends. Complexity O(n).
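
The limit check is the subtle part: it has to run before the multiplication, otherwise the int has already overflowed. Below is a minimal sketch of that idea. It assumes the magnitude is accumulated as a non-negative int and the sign is applied at the end, which is a slight variation on the listing later in this article; wouldOverflow and parseDigits are hypothetical names. Since INT_MAX / 10 is 214748364 and INT_MAX % 10 is 7, appending a digit overflows when the accumulator exceeds 214748364, or equals it and the digit exceeds 7.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical helper: true if acc * 10 + digit would exceed INT_MAX.
// Assumes acc is non-negative and digit is in [0, 9].
bool wouldOverflow(int acc, int digit) {
    return acc > INT_MAX / 10 ||
           (acc == INT_MAX / 10 && digit > INT_MAX % 10);
}

// Parses the run of digits starting at s[pos] and saturates on overflow.
// sign is +1 or -1; parsing stops at the first non-digit character.
int parseDigits(const std::string& s, std::size_t pos, int sign) {
    int acc = 0;
    for (; pos < s.size() && s[pos] >= '0' && s[pos] <= '9'; ++pos) {
        int digit = s[pos] - '0';
        if (wouldOverflow(acc, digit)) {
            // The magnitude no longer fits, so clamp according to the sign.
            return sign > 0 ? INT_MAX : INT_MIN;
        }
        acc = acc * 10 + digit;
    }
    return sign * acc;
}
```

Because the magnitude of INT_MIN (2147483648) does not fit in an int, the input "2147483648" with sign -1 takes the overflow branch and returns INT_MIN, which is also the exact answer.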

2) State machine

       Analyze the state of the string traversal process based on the state transition matrix.

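       There are 4 states: spaces, sign, digits, and invalid, numbered 0, 1, 2, and 3. Each row of the matrix is the current state and each column is the type of the next character, in the same order (space, sign, digit, other). Based on the problem conditions, the matrix is set up as follows: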

\begin{bmatrix} 0 & 1 & 2 & 3\\ 3 & 3& 2 & 3\\ 3& 3& 2 & 3 \end{bmatrix}

  1. The starting state is 0, so look at the first row: if a space is encountered, the next state is still 0; if a sign is encountered, the state changes to 1; if a digit is encountered, the state changes to 2; if an invalid character is encountered, the state changes to 3.
  2. Suppose the state is 1 and look at the second row: if a space is encountered, i.e. a sign followed by a space, the input is invalid, so the first column of the second row is 3; if another sign is encountered, such as +-, it is also invalid, so the second column of the second row is 3; if a digit is encountered, such as -3, the state changes to 2; if an invalid character is encountered, the state changes to 3.
  3. Suppose the state is 2 and look at the third row: if a space is encountered, such as +8 or 8 followed by a space, everything after it is ignored, so the first column of the third row is 3; if a sign is encountered, such as +8+ or 8+, the rest is also ignored, so the second column of the third row is 3; if a digit is encountered, such as +89 or 89, the number continues, so the third column of the third row is 2; an invalid character ends the number in the same way.
  4. When the state is 2, the digit is accumulated and the out-of-range check is performed; when the state is 3, simply break out of the loop.
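
To make the row-by-row analysis above concrete, here is a small sketch that records the state reached after each character. It only traces states and does not build the number; columnOf and traceStates are hypothetical names introduced for illustration.

```cpp
#include <string>
#include <vector>

// Maps a character to its matrix column: 0 space, 1 sign, 2 digit, 3 other.
int columnOf(char c) {
    if (c == ' ') return 0;
    if (c == '+' || c == '-') return 1;
    if (c >= '0' && c <= '9') return 2;
    return 3;
}

// Returns the state reached after each character, stopping as soon as
// state 3 (invalid) is reached.
std::vector<int> traceStates(const std::string& s) {
    static const int kTransitions[3][4] = {
        {0, 1, 2, 3},  // from state 0: still in leading spaces
        {3, 3, 2, 3},  // from state 1: a sign has been seen
        {3, 3, 2, 3},  // from state 2: inside the digit run
    };
    std::vector<int> trace;
    int state = 0;
    for (char c : s) {
        state = kTransitions[state][columnOf(c)];
        trace.push_back(state);
        if (state == 3) break;  // no row exists for state 3, so stop here
    }
    return trace;
}
```

For the input "  -42a" (two leading spaces) the trace is 0, 0, 1, 2, 2, 3, and for "+-2" it is 1, 3: the second sign sends the machine straight to the invalid state before any digit is read.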

       In general, the state machine represents the possible situations and state transitions in matrix form based on the requirements of the question, and then solves the problem. Complexity O(n).

Test code:

1) Traversal method

#include <climits>
#include <string>
using namespace std;

class Solution {
public:
    // Convert a string to an integer
    int StrToInt(string s) {
        int sign = 1;
        int idx = 0;
        int size = int(s.size());
        // Skip leading spaces; return 0 if nothing follows
        while(idx < size){
            if(s[idx] == ' ')
                idx++;
            else
                break;
        }
        if(idx == size)
            return 0;
        // Read the optional sign; return 0 if nothing follows
        if(s[idx] == '+')
            idx++;
        else if(s[idx] == '-'){
            idx++;
            sign = -1;
        }
        if(idx == size)
            return 0;
        // Keep traversing to collect the digits
        int result = 0;
        while(idx < size){
            // Stop at the first non-digit character
            if(s[idx] < '0' || s[idx] > '9')
                break;
            // Check the limits before multiplying, so result never overflows
            if(result > INT_MAX / 10 || (result == INT_MAX / 10 && (s[idx] - '0') >= (INT_MAX % 10)))
                return INT_MAX;
            if(result < INT_MIN / 10 || (result == INT_MIN / 10 && (s[idx] - '0') >= -(INT_MIN % 10)))
                return INT_MIN;
            // Append the digit; result carries the sign as it grows
            result = result * 10 + sign * (s[idx] - '0');
            idx++;
        }
        return result;
    }
};

2) State machine

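#include <algorithm>
#include <climits>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    // Convert a string to an integer
    int StrToInt(string s) {
        // State transition matrix: rows are states 0-2,
        // columns are space, sign, digit, other
        vector<vector<int>> states = {
            {0,1,2,3},
            {3,3,2,3},
            {3,3,2,3},
        };
        // Use long long so the accumulator is 64-bit on every platform
        long long result = 0;
        long long top = INT_MAX;
        long long bottom = INT_MIN;
        int sign = 1;
        int size = int(s.length());
        // Start in state 0
        int state = 0;
        for(int i = 0; i < size; ++i){
            // Space
            if(s[i] == ' '){
                state = states[state][0];
            }
            // Plus or minus sign
            else if(s[i] == '-' || s[i] == '+'){
                state = states[state][1];
                if(state == 1){
                    sign = (s[i] == '-') ? -1 : 1;
                }
            }
            // Digit
            else if(s[i] >= '0' && s[i] <= '9'){
                state = states[state][2];
            }
            // Invalid character
            else{
                state = states[state][3];
            }
            // State 2 means we are inside the digit run, so accumulate
            if(state == 2){
                // Append the digit to the magnitude
                result = result * 10 + s[i] - '0';
                // Clamp the magnitude: at most INT_MAX if positive, -INT_MIN if negative
                result = (sign == 1) ? min(result, top) : min(result, -bottom);
            }
            // State 3 means the rest of the string is ignored, so exit
            else if(state == 3)
                break;
        }
        // Cast the signed product, not just the sign
        return (int)(sign * result);
    }
};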
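
Either solution can be cross-checked against an independent reference. The sketch below is for testing only, since the problem itself forbids library conversion functions in the answer. It leans on std::strtoll and then clamps the result; for the character set allowed here (letters, digits, ' ', '+', '-', '.') that appears to match the rules above, but treat that equivalence as an assumption. referenceStrToInt is a hypothetical name.

```cpp
#include <climits>
#include <cstdlib>
#include <string>

// Test-only reference: std::strtoll skips leading whitespace, reads an
// optional sign and a run of decimal digits, and returns 0 when no digits
// are found. On overflow it saturates to LLONG_MAX or LLONG_MIN, which the
// checks below then narrow to the 32-bit range.
int referenceStrToInt(const std::string& s) {
    long long value = std::strtoll(s.c_str(), nullptr, 10);
    if (value > INT_MAX) return INT_MAX;
    if (value < INT_MIN) return INT_MIN;
    return static_cast<int>(value);
}
```

A test driver could then compare Solution().StrToInt(s) with referenceStrToInt(s) over edge cases such as "", "   ", "+-2", "+ 1", "2147483648", and "-2147483649".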

Origin blog.csdn.net/zhaitianbao/article/details/132844648