[LeetCode Algorithm 20] 28. Find the Index of the First Occurrence in a String (Python)

Problem Description

Given two strings haystack and needle, find the subscript of the first occurrence of needle in haystack (subscripts start from 0). Return -1 if needle is not part of haystack.

Example 1

Input: haystack = "sadbutsad", needle = "sad"
Output: 0
Explanation: "sad" matches at subscripts 0 and 6.
The index of the first match is 0, so return 0.

Example 2

Input: haystack = "leetcode", needle = "leeto"
Output: -1
Explanation: "leeto" does not appear in "leetcode", so return -1.

Constraints

  • 1 <= haystack.length, needle.length <= 10^4
  • haystack and needle consist of lowercase English characters only

Idea analysis

This problem can be solved with the two-pointer idea. First, we set two pointers to the starting positions of haystack and needle, respectively. Then we traverse haystack and compare the character at the current position with the corresponding character in needle. If they are the same, we continue with the next character, until either a mismatch occurs or all of needle has been matched.

Proceed as follows:

  1. If needle is an empty string, return subscript 0.
  2. Calculate the lengths of haystack and needle, denoted n and m, respectively.
  3. Traverse the starting positions in haystack from the first character; there are n - m + 1 candidate positions.
  4. For each starting position i, iterate over needle with the pointer j and compare haystack[i+j] with needle[j]. If they are equal, continue with the next character; if they are not equal, jump out of the loop.
  5. If j reaches the end of needle, that is j == m, the first match has been found; return the current starting position i.
  6. If no match is found after traversing haystack, return -1, indicating that needle is not part of haystack.

This way, we can find the subscript of the first occurrence of needle in haystack.
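To make the steps above concrete, here is a small tracing sketch. The helper name trace_search is hypothetical (it is not part of the original solution); it records, for each starting position i, how many characters of needle matched before the comparison stopped.

```python
def trace_search(haystack, needle):
    """Return (start, matched) pairs for each start position tried.

    Stops at the first full match, mirroring steps 3 to 5 above.
    """
    n, m = len(haystack), len(needle)
    steps = []
    for i in range(n - m + 1):
        j = 0
        # Advance j while the characters line up.
        while j < m and haystack[i + j] == needle[j]:
            j += 1
        steps.append((i, j))
        if j == m:  # all m characters matched at start position i
            break
    return steps


print(trace_search("sadbutsad", "sad"))
print(trace_search("mississippi", "issip"))
```

In the second call, the attempt at position 1 matches four characters before failing, and the search then restarts from position 2; this restart is what makes the approach a brute-force one.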

code analysis

  1. First, check the special case: if needle is an empty string, directly return the subscript 0, because the empty string is a substring of any string.

  2. Calculate the lengths of haystack and needle, and assign them to the variables n and m, respectively. This avoids calling len() repeatedly inside the loops.

  3. Use the outer loop for i in range(n - m + 1) to traverse every possible starting position in haystack. The range is n - m + 1 because once fewer than m characters remain, a match is impossible.

  4. In the inner loop for j in range(m), use the pointer j to traverse each character of needle and compare it with the character at the corresponding position in haystack. If the characters are equal, continue with the next character; if they are not equal, exit the inner loop.

  5. If the inner loop ends normally (without break), j has reached the end of needle, which means the first match has been found, and the current starting position i is returned.

  6. If no match is found after the outer loop ends, -1 is returned, indicating that needle is not a substring of haystack.
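Step 5 relies on Python's for ... else construct: the else branch runs only when the loop finishes without hitting break. The following is a minimal sketch of that behavior in isolation; first_mismatch is a hypothetical helper used only for illustration.

```python
def first_mismatch(a, b):
    """Return the first index where equal-length strings a and b differ, or None."""
    for k in range(len(a)):
        if a[k] != b[k]:
            break  # leaving via break skips the else branch
    else:
        return None  # loop ran to completion: no mismatch found
    return k


print(first_mismatch("sad", "sad"))
print(first_mismatch("sad", "sat"))
```

The first call returns None (the loop completed), and the second returns 2 (the loop was left through break).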

The idea of this algorithm is to compare characters one by one until a match is found or all of haystack has been traversed. In the worst case (no match, or a match only at the last starting position, with most of needle matching at every attempt), approximately (n - m + 1) * m character comparisons are required, so the time complexity is O(n * m).
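The comparison count can be checked with a small instrumented sketch. The function count_comparisons is a hypothetical helper that runs the same brute-force search and also counts how many character comparisons it performs.

```python
def count_comparisons(haystack, needle):
    """Run the brute-force search; return (index, number of character comparisons)."""
    n, m = len(haystack), len(needle)
    comparisons = 0
    for i in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if haystack[i + j] != needle[j]:
                break
        else:
            return i, comparisons
    return -1, comparisons


# Easy case: the match is found immediately.
print(count_comparisons("sadbutsad", "sad"))
# Hard case: every attempt matches three characters before failing on the fourth.
print(count_comparisons("a" * 10, "aaab"))
```

With n = 10 and m = 4, the second call tries 10 - 4 + 1 = 7 starting positions and spends 4 comparisons on each, which gives 28 comparisons in total.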

full code

class Solution(object):
    def strStr(self, haystack, needle):
        if not needle:  # special case: needle is an empty string
            return 0

        n = len(haystack)  # length of haystack
        m = len(needle)  # length of needle

        for i in range(n - m + 1):  # try every possible starting position in haystack
            for j in range(m):  # compare each character of needle
                if haystack[i + j] != needle[j]:  # mismatch: leave the inner loop
                    break
            else:
                return i  # inner loop finished without break: match found at i

        return -1  # no match found
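As a side note, the inner character loop can also be written as a slice comparison. This is a variant sketch, not the article's solution; the name str_str_slicing is hypothetical. It performs the same brute-force search, but lets Python compare the m-character window in one expression.

```python
def str_str_slicing(haystack, needle):
    """Brute-force search that compares a whole window of haystack at once."""
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        # haystack[i:i + m] is the window of length m starting at i.
        if haystack[i:i + m] == needle:
            return i
    return -1


print(str_str_slicing("sadbutsad", "sad"))
print(str_str_slicing("leetcode", "leeto"))
```

Each slice creates a temporary string of length m, so the worst-case cost is still proportional to (n - m + 1) * m.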

Detailed analysis

class Solution(object):
    def strStr(self, haystack, needle):
        """
        :type haystack: str
        :type needle: str
        :rtype: int
        """

This code defines a class named Solution and a method named strStr in that class. The method accepts two parameters, haystack and needle, which represent the string to be searched and the substring to search for, respectively. The docstring documents the return type as int.

        if not needle:
            return 0

This code first checks the special case: if needle is an empty string, it returns 0 directly, because the empty string is a substring of any string.
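Python's built-in str.find follows the same convention, which makes it a convenient reference point. The short sketch below only calls the built-in; the variable name results is chosen for illustration.

```python
# str.find returns the lowest index of the substring, or -1 if it is absent.
results = [
    "sadbutsad".find(""),       # empty needle
    "sadbutsad".find("sad"),    # Example 1
    "leetcode".find("leeto"),   # Example 2
]
print(results)
```

This prints [0, 0, -1], matching the return values described for the empty needle and for the two examples.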

        n = len(haystack)
        m = len(needle)

This code uses the len() function to get the lengths of the strings haystack and needle, and stores them in the variables n and m, respectively. This avoids calling len() repeatedly in the subsequent loops.

        for i in range(n - m + 1):
            j = 0
            while j < m and haystack[i+j] == needle[j]:
                j += 1
            
            if j == m:
                return i

This code uses two loops to match strings. The outer loop is a for loop that iterates over every possible starting position in haystack, using the range n - m + 1, because when fewer characters remain than the length of needle, no match can be made.

The inner loop is a while loop that compares the characters in haystack with the characters in needle. The loop condition is j < m and haystack[i+j] == needle[j], which means that the loop continues as long as the pointer j is less than the length of needle and the current characters match.

If the inner loop ends with j equal to m, all of needle has been traversed, which means a matching substring has been found, and the current starting position i is returned.
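The bound n - m + 1 also covers the case where needle is longer than haystack: the range is then empty and the outer loop body never runs. The sketch below lists the starting positions visited for a few length pairs; start_positions is a hypothetical helper.

```python
def start_positions(n, m):
    """Start indices the outer loop visits for haystack length n and needle length m."""
    return list(range(n - m + 1))


print(start_positions(9, 3))  # "sadbutsad" and "sad"
print(start_positions(3, 3))  # equal lengths: only position 0
print(start_positions(3, 5))  # needle longer than haystack: no positions
```

A range with a non-positive stop value yields nothing, so no explicit check for m > n is needed before the loop.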

        return -1

If no match is found after the outer loop ends, needle is not a substring of haystack, and the method returns -1.
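One way to gain confidence in the walkthrough above is to compare the method against Python's built-in str.find on random inputs. The sketch below repeats the while-loop version of strStr so that it is self-contained; cross_check, its parameters, and the two-letter alphabet are assumptions made for this illustration.

```python
import random


class Solution(object):
    def strStr(self, haystack, needle):
        if not needle:
            return 0
        n = len(haystack)
        m = len(needle)
        for i in range(n - m + 1):
            j = 0
            while j < m and haystack[i + j] == needle[j]:
                j += 1
            if j == m:
                return i
        return -1


def cross_check(trials=1000, seed=0):
    """Return True if strStr agrees with str.find on every random trial."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    solver = Solution()
    for _ in range(trials):
        # A two-letter alphabet makes partial and full matches frequent.
        haystack = "".join(rng.choice("ab") for _ in range(rng.randint(1, 12)))
        needle = "".join(rng.choice("ab") for _ in range(rng.randint(1, 4)))
        if solver.strStr(haystack, needle) != haystack.find(needle):
            return False
    return True


print(cross_check())
```

A result of True means no disagreement was found in the sampled cases; it is a sanity check, not a proof of correctness.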

Running the code

call example

solution = Solution()
haystack = "sadbutsad"
needle = "sad"
haystack1 = "leetcode"
needle1 = "leeto"
print(solution.strStr(haystack, needle))
print(solution.strStr(haystack1, needle1))

Output

0
-1

Origin blog.csdn.net/qq_33681891/article/details/131830578