Histogram + Sliding Window Method

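I recently came across a method that combines a histogram with a sliding window to extract a region of interest (ROI). In my experience it works very well for locating the ROI.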

1. How the sliding window works

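First, use the histogram to find the approximate position of the target region and take it as the starting point. Define a rectangular region called a "window" (the brown part in the figure) whose bottom edge is centered on this starting point, and store the x-coordinates of all white pixels that fall inside the rectangle. Then take the mean of the stored x-coordinates: the column of that mean, at the height of the first window's top edge, becomes the midpoint of the next window's bottom edge, and the search continues from there. Repeat until every row has been searched.
[Video: demonstration of the principle]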
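The first step, finding the starting point from the histogram, can be sketched on a tiny synthetic image. The 6x8 array and its stripe are made up purely for illustration; the histogram here is the column-wise sum over the bottom half of the image, which is how the source code in section 3 computes it.

```python
import numpy as np

# Hypothetical 6x8 binary image: a slanted white stripe on a black background.
binary = np.zeros((6, 8), dtype=np.uint8)
binary[5, 2] = binary[4, 2] = binary[3, 3] = 1   # bottom half of the stripe
binary[2, 3] = binary[1, 4] = binary[0, 4] = 1   # top half of the stripe

# Column-wise histogram of the bottom half of the image.
bottom_half = binary[binary.shape[0] // 2:, :]
histogram = bottom_half.sum(axis=0)

# The column with the most white pixels is the starting x position.
base = int(np.argmax(histogram))
print(histogram.tolist(), base)
```

Only the bottom half is summed, so the starting point reflects where the target is at the bottom of the image rather than an average over its whole, possibly slanted, extent.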

2. numpy.nonzero()

The program relies mainly on this NumPy function, which returns the indices of the nonzero elements of a multidimensional array.

numpy.nonzero(a)
Return the indices of the elements that are non-zero.
Returns a tuple of arrays, one for each dimension of a, containing the indices of the non-zero elements in that dimension. The values in a are always tested and returned in row-major, C-style order.
Parameters:
a : array_like
Input array.
Returns:
tuple_of_arrays : tuple
Indices of elements that are non-zero.

>>> x = np.array([[1,0,0], [0,2,0], [1,1,0]])
>>> x
array([[1, 0, 0],
       [0, 2, 0],
       [1, 1, 0]])
>>> np.nonzero(x)
(array([0, 1, 2, 2]), array([0, 1, 0, 1]))
######################################
>>> x[np.nonzero(x)]
array([1, 2, 1, 1])
>>> np.transpose(np.nonzero(x))
array([[0, 0],
       [1, 1],
       [2, 0],
       [2, 1])
######################################
>>> a = np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> a > 3
array([[False, False, False],
       [ True,  True,  True],
       [ True,  True,  True]])
>>> np.nonzero(a > 3)
(array([1, 1, 1, 2, 2, 2]), array([0, 1, 2, 0, 1, 2]))
###################################
>>> (a > 3).nonzero()
(array([1, 1, 1, 2, 2, 2]), array([0, 1, 2, 0, 1, 2]))
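To connect these examples to the sliding window, here is a minimal sketch of how nonzero() can pick out the white pixels inside one window and recenter on them. The image and the window bounds are hypothetical values chosen for illustration.

```python
import numpy as np

binary = np.zeros((6, 8), dtype=np.uint8)
binary[4, 2] = binary[5, 2] = binary[5, 3] = 1   # pixels near the start column
binary[5, 7] = 1                                  # a stray pixel far away

# nonzero() on a 2-D image returns (row indices, column indices), i.e. (y, x).
nonzeroy, nonzerox = binary.nonzero()

# Hypothetical window: rows [4, 6), columns [1, 5).
win_y_low, win_y_high = 4, 6
win_x_low, win_x_high = 1, 5

# Boolean mask over the nonzero pixels; its own nonzero() gives their positions
# in the 1-D coordinate arrays.
inside = ((nonzeroy >= win_y_low) & (nonzeroy < win_y_high) &
          (nonzerox >= win_x_low) & (nonzerox < win_x_high))
good_inds = inside.nonzero()[0]

# The mean x of the pixels inside the window is the next window's center.
new_center = int(np.mean(nonzerox[good_inds]))
print(good_inds.tolist(), new_center)
```

Note that nonzero() is used twice with different meanings: once on the image to get pixel coordinates, and once on a 1-D boolean mask to get positions within those coordinate arrays.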

3. Source code

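binary_warped is the binary image, nwindows is the maximum number of windows to slide, margin is half the window width, and minpix is the pixel threshold: a window counts only if it contains more than minpix nonzero pixels.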
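Before reading the full function, it may help to see how the window rows are laid out. The sketch below uses the same formulas for win_y_low and win_y_high as the function, with a hypothetical image height of 10 and 3 windows.

```python
height, nwindows = 10, 3             # hypothetical values for illustration
window_height = height // nwindows   # integer division

bounds = []
for window in range(nwindows):
    # Window 0 sits at the bottom of the image; later windows stack upwards.
    win_y_low = height - (window + 1) * window_height
    win_y_high = height - window * window_height
    bounds.append((win_y_low, win_y_high))

print(bounds)
```

When the height is not a multiple of nwindows, the rows left over at the top of the image (row 0 in this example) are never covered by any window.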

import cv2
import numpy as np


def find_lane_pixels(binary_warped, nwindows, margin, minpix):
    # Column histogram of the bottom half; its peak is the starting x position
    histogram = np.sum(binary_warped[binary_warped.shape[0]//2:, :], axis=0)
    out_img = np.dstack((binary_warped, binary_warped, binary_warped))
    base = int(np.argmax(histogram))

    # Row (y) and column (x) indices of all nonzero pixels, as 1-D arrays
    nonzero = binary_warped.nonzero()
    nonzeroy = np.array(nonzero[0])
    nonzerox = np.array(nonzero[1])
    current = base
    # Plain integer division; np.int was removed in NumPy 1.24
    window_height = binary_warped.shape[0] // nwindows

    x_left, y_left, x_right, y_right = [], [], [], []
    for window in range(nwindows):
        # Windows are stacked from the bottom of the image upwards
        win_y_low = binary_warped.shape[0] - (window+1)*window_height
        win_y_high = binary_warped.shape[0] - window*window_height
        win_x_low = current - margin
        win_x_high = current + margin
        # Positions (within nonzeroy/nonzerox) of the pixels inside this window
        good_inds = ((nonzeroy >= win_y_low) & (nonzeroy < win_y_high) &
                     (nonzerox >= win_x_low) & (nonzerox < win_x_high)).nonzero()[0]
        if len(good_inds) > minpix:
            cv2.rectangle(out_img, (win_x_low, win_y_low), (win_x_high, win_y_high), (0, 255, 0), 2)
            # Recenter the next window on the mean x of the pixels found
            current = int(np.mean(nonzerox[good_inds]))
            x_left.append(win_x_low)
            y_left.append(win_y_low)
            x_right.append(win_x_high)
            y_right.append(win_y_high)

    # If no window exceeded minpix, min()/max() below would fail on empty lists
    if not x_left:
        return None, out_img

    # Bounding box enclosing every accepted window
    x1, y1, x2, y2 = min(x_left), min(y_left), max(x_right), max(y_right)
    print(x1, y1, x2, y2)
    return (x1, y1, x2, y2), out_img
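The returned box can then be used to cut the ROI out of the image. One thing to watch for: a window's left edge is computed as current - margin, so x1 can be negative near the image border, and a negative slice index in NumPy counts from the end of the axis instead of stopping at 0. The helper below, crop_roi, is a hypothetical name and not part of the original program; it is a minimal sketch that clips the box to the image bounds before slicing.

```python
import numpy as np

def crop_roi(image, box):
    """Crop image to box = (x1, y1, x2, y2), clipped to the image bounds."""
    x1, y1, x2, y2 = box
    h, w = image.shape[:2]
    # Clip so that negative or oversized coordinates cannot wrap around.
    x1, x2 = max(0, x1), min(w, x2)
    y1, y2 = max(0, y1), min(h, y2)
    return image[y1:y2, x1:x2]

# Hypothetical 6x8 image and a box whose left edge lies outside the image.
image = np.arange(6 * 8).reshape(6, 8)
roi = crop_roi(image, (-2, 1, 5, 6))
print(roi.shape)
```

The result is a view into the original array, so call .copy() on it if the ROI will be modified independently.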
