〖Python Web Crawler in Practice ㊶〗- Getting Started with the Geetest Slider (3)


Verification codes (CAPTCHAs)

Many websites now take measures against crawlers, and verification codes (CAPTCHAs) are one of them. As the technology has developed, the variety has grown. The earliest were simple image codes made up of a few digits; later, English letters and distracting curves were added, and some sites even use Chinese characters, which makes recognition harder still. Today we look at the Geetest slider verification code.

Preface

The previous article covered how to restore the shuffled image. This one covers how to calculate the distance the slider has to move. You can locate the gap with OpenCV or another recognition library, or you can compare the pixels of two images. Working code for both approaches already exists and can be used directly, so I won't go into the theory in detail.

Slider distance calculation

The first approach uses cv2. I found a good implementation written by someone else and have tested that it works. For installing the required libraries, please look up a tutorial yourself.

from PIL import Image
import cv2
import numpy as np

# Convert a PIL Image to an OpenCV array; the flag controls the color conversion.
# convert('RGB') makes sure palette or grayscale PNGs also arrive as 3-channel data.
def pilImgToCv2(img: Image.Image, flag=cv2.COLOR_RGB2BGR):
    return cv2.cvtColor(np.asarray(img.convert('RGB')), flag)

# Show an image in a pop-up window
def showImg(bg: np.ndarray, name='test', delay=0):
    cv2.imshow(name, bg)
    cv2.waitKey(delay)
    cv2.destroyAllWindows()


def getDistance(img: Image.Image, slice: Image.Image):
    # Convert to grayscale with pilImgToCv2 (PIL images are RGB, hence RGB2GRAY)
    # The background image and the slider image need the same processing
    grayImg = pilImgToCv2(img, cv2.COLOR_RGB2GRAY)
    # showImg(grayImg) # uncomment to inspect the processed image
    graySlice = pilImgToCv2(slice, cv2.COLOR_RGB2GRAY)
    # Run edge detection to reduce noise further; the thresholds can be tuned
    grayImg = cv2.Canny(grayImg, 255, 255)
    # showImg(grayImg) # uncomment to inspect the processed image
    graySlice = cv2.Canny(graySlice, 255, 255)
    # Template-match the two images to find the position of the gap
    result = cv2.matchTemplate(grayImg, graySlice, cv2.TM_CCOEFF_NORMED)
    maxLoc = cv2.minMaxLoc(result)[3]
    # The matched slide distance
    distance = maxLoc[0]
    # The code below draws a rectangle on the image to mark the matched position,
    # which makes the result easy to check visually; it can be removed
    sliceHeight, sliceWidth = graySlice.shape[:2]
    # Top-left corner
    x, y = maxLoc
    # Bottom-right corner
    x2, y2 = x + sliceWidth, y + sliceHeight
    resultBg = pilImgToCv2(img, cv2.COLOR_RGB2BGR)
    cv2.rectangle(resultBg, (x, y), (x2, y2), (0, 0, 255), 2)
    # showImg(resultBg) # uncomment to inspect the processed image
    print(distance)
    return distance, resultBg


sliceimgpath = './slice.png'
# Background image with the gap
imgpath = './缺口背景图片.png'
getDistance(Image.open(imgpath), Image.open(sliceimgpath))
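
To make the idea behind template matching concrete, here is a minimal, dependency-free sketch. It is not what cv2.matchTemplate does internally: TM_CCOEFF_NORMED scores each position with a normalized correlation coefficient and the highest score wins, whereas this sketch uses the simpler sum of absolute differences, where the lowest score wins. The names match_offset, background and patch are made up for illustration.

```python
def match_offset(image, template):
    """Return (x, y) of the top-left corner where template best fits image.

    Both arguments are 2-D lists of grayscale values (rows of ints).
    The score is the sum of absolute differences; lower is better.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best_score, best_pos = None, (0, 0)
    # Slide the template over every position where it fits entirely
    for y in range(img_h - tpl_h + 1):
        for x in range(img_w - tpl_w + 1):
            score = sum(
                abs(image[y + dy][x + dx] - template[dy][dx])
                for dy in range(tpl_h)
                for dx in range(tpl_w)
            )
            if best_score is None or score < best_score:
                best_score, best_pos = score, (x, y)
    return best_pos


# Synthetic 6x10 "background" of zeros with a 2x3 bright patch at x=5, y=2
background = [[0] * 10 for _ in range(6)]
for row in (2, 3):
    for col in (5, 6, 7):
        background[row][col] = 200
patch = [[200] * 3 for _ in range(2)]

x, y = match_offset(background, patch)
print(x, y)  # prints: 5 2
```

The x component of the result plays the same role as distance in getDistance above.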

Now for the second approach, which is the one used in this case. Because screen resolutions differ between computers, the result can be off by a few pixels, so you will need to tune the correction value in the code yourself. The value I use here is -6.

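def is_pixel_equal(image1, image2, x, y):
    """
    Check whether two pixels are the same
    :param image1: image without the gap
    :param image2: image with the gap
    :param x: x position
    :param y: y position
    :return: True if the pixels at the same position match within the threshold
    """
    pixel1 = image1.load()[x, y]
    pixel2 = image2.load()[x, y]
    threshold = 60
    # The pixels count as equal when every RGB channel differs by less than the threshold
    return (abs(pixel1[0] - pixel2[0]) < threshold
            and abs(pixel1[1] - pixel2[1]) < threshold
            and abs(pixel1[2] - pixel2[2]) < threshold)


def get_gap(image1, image2):
    """
    Get the offset of the gap
    :param image1: path of the image without the gap
    :param image2: path of the image with the gap
    :return: position of the gap, or None if the images do not differ
    """
    image1 = Image.open(image1)
    image2 = Image.open(image2)
    # Start scanning at x = 60, skipping the left part of the image
    left = 60
    print('CAPTCHA image width and height:', image1.size)
    for i in range(left, image1.size[0]):
        for j in range(image1.size[1]):
            if not is_pixel_equal(image1, image2, i, j):
                # Correction value; tune it for your own environment
                left = i - 6
                return left
    # No differing pixel was found
    return None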

In the code above, left = i - 6 applies the correction. 6 is the value that worked in my tests; you may need a different value, or no adjustment at all.
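
Since the correction has to be tuned per environment, it can help to expose it as a parameter instead of hard-coding it. The following is a minimal sketch of the same column scan on synthetic pixel grids rather than real image files. The function names, the start and offset parameters, and the test data are all assumptions made for illustration; only the threshold of 60 and the default correction of 6 come from the code above.

```python
def first_diff_column(full, gapped, start=0, threshold=60):
    """Return the first column index >= start where the two images differ.

    `full` and `gapped` are equally sized 2-D lists of (r, g, b) tuples.
    A pixel counts as different when any channel differs by `threshold` or more.
    Returns None when no column differs.
    """
    height, width = len(full), len(full[0])
    for x in range(start, width):
        for y in range(height):
            if any(abs(a - b) >= threshold for a, b in zip(full[y][x], gapped[y][x])):
                return x
    return None


def gap_distance(full, gapped, start=0, offset=6):
    """Subtract a calibration offset from the first differing column."""
    column = first_diff_column(full, gapped, start)
    return None if column is None else column - offset


# Synthetic 4x20 images: the gap darkens columns 12..14 on rows 1..2
full = [[(200, 200, 200)] * 20 for _ in range(4)]
gapped = [row[:] for row in full]
for y in (1, 2):
    for x in (12, 13, 14):
        gapped[y][x] = (40, 40, 40)

print(first_diff_column(full, gapped))  # prints: 12
print(gap_distance(full, gapped))       # prints: 6
```

With real images, start would correspond to left = 60 in get_gap, and offset is the value to calibrate.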

When I run it, the result is 55, which will be used later when reversing w. Note the difference between the two approaches: here we compare the complete image with the image that has the gap, whereas the cv2 approach takes the slider piece and matches it against the image with the gap. Because the gap is in a different position each time, I personally prefer the second approach. If you prefer the first, adapt the code accordingly.

Slider track code

I have read a lot of code written by experienced developers, and here is a slider track generator that one of them wrote. It currently works without problems.

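import random

def __ease_out_expo(sep):
    '''
        Easing function for the track (exponential ease-out)
    '''
    if sep == 1:
        return 1
    else:
        return 1 - pow(2, -10 * sep)

def get_slide_track(distance):
    """
    Generate a slide track for the given slide distance
    :param distance: distance to slide
    :return: slide track <type 'list'>: [[x,y,t], ...]
        x: horizontal distance slid so far
        y: vertical distance slid so far; 0 for every point except the starting point
        t: time elapsed during the slide, in milliseconds
    """

    if not isinstance(distance, int) or distance < 0:
        raise ValueError(f"distance must be an integer greater than or equal to 0: distance: {distance}, type: {type(distance)}")
    # Initialize the track list
    slide_track = [
        [random.randint(-50, -10), random.randint(-50, -10), 0],
        [0, 0, 0],
    ]
    # Sample the slider position count times in total
    count = 40 + int(distance / 2)
    # Initialize the slide time
    t = random.randint(50, 100)
    # Previously recorded x and y values
    _x = 0
    _y = 0
    for i in range(count):
        # Horizontal distance slid so far
        x = round(__ease_out_expo(i / count) * distance)
        # y = round(__ease_out_expo(i / count) * 14)
        # Time elapsed during the slide
        t += random.randint(10, 50)
        # Skip samples where the slider has not moved horizontally
        if x == _x:
            continue
        slide_track.append([x, _y, t])
        _x = x
    # Repeat the last point to mark the end of the slide
    slide_track.append(slide_track[-1])
    return slide_track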

This track has some minor flaws, but it only needs to pass the platform's detection. If you run it, you will see that every y value after the starting point is 0, whereas tracks actually produced in the browser have fluctuating y values. The function takes a single argument, distance, which is the distance calculated in the recognition step above. The track also gives us passtime; this value is important too and will be discussed later.
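
If the flat y values ever become a problem, one option is to add a small random drift to an existing track. The sketch below is only an illustration and has not been tested against the platform. The helper names add_y_jitter and passtime_of, the drift limits, and the sample track are all assumptions. Reading passtime as the timestamp of the last track point is also an assumption here, not something established above.

```python
import random


def add_y_jitter(track, max_step=1, max_abs=5, seed=None):
    """Return a copy of `track` whose y values drift by small random steps.

    `track` is a list of [x, y, t] points in the format used by get_slide_track.
    The first two points (the start offset and the origin) are left unchanged.
    """
    rng = random.Random(seed)
    jittered = [point[:] for point in track[:2]]
    y = 0
    for x, _, t in track[2:]:
        y += rng.randint(-max_step, max_step)
        y = max(-max_abs, min(max_abs, y))  # keep the drift within bounds
        jittered.append([x, y, t])
    return jittered


def passtime_of(track):
    """Assumed definition: passtime is the timestamp of the last point."""
    return track[-1][2]


# Hand-written sample in the [x, y, t] format, not real output
sample = [[-30, -20, 0], [0, 0, 0], [10, 0, 80], [25, 0, 120], [40, 0, 170], [40, 0, 170]]
jittered = add_y_jitter(sample, seed=1)
print(jittered)
print(passtime_of(jittered))  # prints: 170
```

The x and t values are copied unchanged, so the slide distance and timing of the original track are preserved.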

Summary

Today we mainly covered calculating the slider distance and generating the slider track. The next article will focus on reversing the value of w. One clarification first: it is often said that the first two w values can be left empty, and most of the tutorial videos I watched say the same. In my tests that is wrong, at least for me. With the first two w values left empty the server returned the response below, and once both were filled in properly the error went away.

geetest_1701956570696({"success": 0, "message": "forbidden"})

Even when the third w value was correct, the response was still forbidden. At first I thought the w value was wrong, and I also suspected the slider track. This cost me a week, so I hope you can avoid the same pitfall. Another pitfall: the code that generates w uses a random number, and that number must stay consistent from one step to the next.
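
The forbidden response shown above is in JSONP form, that is, a JSON object wrapped in a callback call. When debugging, it is convenient to unwrap it so the success and message fields can be checked in code. This is a minimal sketch; parse_jsonp is a helper name made up here, and the regular expression assumes a simple callback(...) wrapper like the one shown.

```python
import json
import re


def parse_jsonp(text):
    """Extract the JSON payload from a JSONP response such as callback({...})."""
    match = re.fullmatch(r"\s*[\w$.]+\((.*)\)\s*;?\s*", text, re.DOTALL)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


response = 'geetest_1701956570696({"success": 0, "message": "forbidden"})'
result = parse_jsonp(response)
print(result["success"], result["message"])  # prints: 0 forbidden
```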

Preview of the next article:

Next I plan to focus on how w is generated. Many bloggers have posted similar walkthroughs, but function names may have changed since then and the encrypted content differs slightly. I will publish the analysis process for reversing all three w values. For copyright reasons I will not publish the complete JS code, but following the approach should be enough. First, here are screenshots of my successful run.

I also tested the slider used for sending SMS verification codes to a mobile phone on another platform. That site only needs the last w. Once you have worked through reversing the official site, the JS for other sites can be handled with only minor changes.


Solemn statement: This article is for communication and learning purposes only and may not be used for illegal purposes.



Origin blog.csdn.net/BROKEN__Y/article/details/134863809