test site
Test website: https://www.geetest.com/demo/slide-float.html
My Gitee: Boss Qin (qin-laoda) - Gitee.com, which contains the code I wrote
Author's note: This article turned out rather long, which is my own fault. In future I will split each topic into several articles so that it is easier and clearer to read.
There are two ways to approach this verification code: one is to drive the browser with selenium, which is simple but runs slowly; the other is to do it in js, which is more troublesome but fast.
Let me explain how to do it with selenium:
The first step is to locate the part of the webpage we want to operate on, the HTML canvas tag.
Idea 1: Get two pictures, one with the gap and one without, and compare them to find the gap. Later we can use a ratio to turn that into the distance the slider needs to slide, because the picture shown on the web page is scaled up or down from the actual picture size.
Idea 2: Get the gap picture and the slider picture and compare them to get the sliding distance of the slider.
If you look carefully, you will find that there is no link for downloading the slider picture.
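The ratio mentioned in Idea 1 can be sketched as below. This is only an illustration: the function name and the numbers (a 520px-wide picture shown at 260px) are made up for the example and are not measurements from the test site.

```python
def scale_distance(gap_x_in_image, image_width, displayed_width):
    """Convert a gap offset measured in image pixels into on-page pixels."""
    if image_width <= 0:
        raise ValueError("image_width must be positive")
    # Ratio between the size shown on the page and the real picture size
    ratio = displayed_width / image_width
    return gap_x_in_image * ratio


# Hypothetical numbers: gap found at x=130 in a 520px-wide picture,
# which the page displays at 260px wide.
print(scale_distance(130, 520, 260))  # 65.0
```

When the pictures are taken from a screenshot of the page, as in the rest of this article, they are already at the displayed size, so the ratio is usually 1.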
Let's solve these one at a time. First we need to get the picture.
We need to change the HTML of the page as shown above:
However, the HTML the server actually sends back has not been changed, so we have to change it ourselves with Python code:
Code:
# Modify the CSS style
def get_css(self):
    time.sleep(2)
    # Run JavaScript to change the CSS
    # canvas indexes start at 0, so [2] is the third canvas; clear its style
    self.driver.execute_script("document.querySelectorAll('canvas')[2].style=''")
result:
The code to get the picture is as follows:
def crop_image(self,img_name):
    # The picture's size can be taken from the size of its div box
    img=self.driver.find_element_by_xpath('//div[@class="geetest_canvas_img geetest_absolute"]')
    # Maximize the window
    # self.driver.maximize_window()
    # Position (x,y) of the box's upper left corner
    img_left_up=img.location
    print(img_left_up)
    # Size of the div box (a dictionary)
    print(img.size)
    # Coordinates of the picture's lower right corner
    img_x,img_y=img_left_up["x"],img_left_up["y"]
    img_right_below_x,img_right_below_y=img_x + img.size["width"],img_y + img.size["height"]
    # Take a screenshot (binary)
    screen_shot=self.driver.get_screenshot_as_png()
    # Read the picture (binary data held in memory)
    by_io=BytesIO(screen_shot)
    # Open the picture
    op_img=Image.open(by_io)
    # Crop the picture; the box is a tuple (left, upper, right, lower)
    cap_img=op_img.crop((int(img_x),int(img_y),int(img_right_below_x),int(img_right_below_y)))
    # Save the picture to a file
    cap_img.save(img_name)
    return cap_img
Let me explain this step by step.
First, to be clear: because there is no link for downloading the pictures, we can only take screenshots, and so we need to know which region of the screenshot to keep.
First we need the size of the div box:
img=self.driver.find_element_by_xpath('//div[@class="geetest_canvas_img geetest_absolute"]')
This locates the div box.
img_left_up=img.location
This line of code is important. location returns a coordinate, but which one?
It returns the coordinates of the upper left corner of the box we located (because the box and the picture overlap, you can think of it as the upper left corner of the picture).
However, one coordinate is not enough to crop a screenshot. We also need the coordinates of the lower right corner. How do we get them? We take the size of the box and add it to the upper left corner.
print(img.size) prints a dictionary holding the width and height of the box.
img_x,img_y=img_left_up["x"],img_left_up["y"]
img_right_below_x,img_right_below_y=img_x + img.size["width"],img_y + img.size["height"]
These two lines of code get the coordinates of the lower right corner
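The corner arithmetic can be tried without a browser. The sketch below uses plain dictionaries shaped like the values of location and size; the helper name crop_box and its scale parameter are assumptions added for illustration (on high-DPI screens a screenshot can contain more pixels than the page reports, in which case the coordinates would need scaling).

```python
def crop_box(location, size, scale=1.0):
    """Build a (left, upper, right, lower) box from an element's location and size.

    scale is a hypothetical extra: the ratio of screenshot pixels to page
    pixels (1.0 on an ordinary display).
    """
    left = location["x"] * scale
    upper = location["y"] * scale
    right = (location["x"] + size["width"]) * scale
    lower = (location["y"] + size["height"]) * scale
    return (int(left), int(upper), int(right), int(lower))


# Made-up values in the same shape as img.location and img.size
print(crop_box({"x": 100, "y": 200}, {"width": 260, "height": 160}))
# (100, 200, 360, 360)
```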
Now let's take the screenshot:
screen_shot=self.driver.get_screenshot_as_png()
As you can see, this screenshot method is a little different from the one we used before:
driver.save_screenshot("Baidu.png")
That one saves the screenshot to a file, whereas get_screenshot_as_png() returns it in memory (as binary data).
Next we need to read this picture (binary), which requires an import: from io import BytesIO
by_io=BytesIO(screen_shot)
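To see what BytesIO does here, the sketch below wraps some bytes in a BytesIO object and reads them back as if they came from a file. The bytes are a stand-in rather than a real screenshot; only the 8-byte PNG signature at the start is genuine, and the helper name looks_like_png is made up for the example.

```python
from io import BytesIO

# Every PNG file starts with these 8 bytes
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(data):
    """Read the first 8 bytes through a file-like wrapper and check them."""
    stream = BytesIO(data)   # in-memory bytes, usable like an open file
    header = stream.read(8)  # same call you would make on a real file
    return header == PNG_SIGNATURE


# Stand-in for the bytes returned by get_screenshot_as_png()
fake_screenshot = PNG_SIGNATURE + b"rest of the image data"
print(looks_like_png(fake_screenshot))  # True
print(looks_like_png(b"not an image"))  # False
```

Because the wrapper behaves like a file, a library that expects a file can read the screenshot without it ever being written to disk.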
After reading it, we also need to open the picture.
This requires the Pillow module: pip install pillow
Import the module:
from PIL import Image
Open the picture:
op_img=Image.open(by_io)
Next we crop the picture:
cap_img=op_img.crop((x1,y1,x2,y2))
Save the picture:
cap_img.save("file path")
We need to save two pictures: the original picture and the one with the slider.
Original picture:
Picture with the slider:
Comparing the pictures:
We compare the region to the right of the slider (using the color difference of the three RGB channels).
The following two pictures show how the coordinate axes of the page run:
The code is as follows:
# Compare pixels
def compare_pixel(self,img1,img2,i,j):
    pixel1=img1.load()[i,j]# Read the pixel at (i,j) from img1
    pixel2=img2.load()[i,j]# Read the pixel at (i,j) from img2
    # Allowed difference per channel
    threshold=60
    # The three RGB channels; abs() so that a difference in either direction counts
    if abs(pixel1[0]-pixel2[0])<=threshold and abs(pixel1[1]-pixel2[1])<=threshold and abs(pixel1[2]-pixel2[2])<=threshold:
        return True
    return False

def pixel(self,img1,img2):
    left=60# Start from pixel column 60
    has_find=False # The gap has not been found yet
    # Size of the cropped picture, a tuple (width, height)
    print(img1.size)
    # Compare pixel by pixel
    for i in range(left,img1.size[0]):# width
        if has_find:
            break
        for j in range(img2.size[1]):# height
            if not self.compare_pixel(img1,img2,i,j):# compare pixel (i,j) of img1 with pixel (i,j) of img2
                distance=i
                print(distance)
                has_find=True
                # Stop as soon as a difference is found
                break
    return distance
Let me explain the second instance method first:
left=60 starts the comparison at pixel column 60, which skips the slider on the left
has_find=False controls the loops: once the gap is found, both loops stop
return returns the x coordinate of the first pixel that touches the gap (this is only approximate and needs to be fine-tuned later)
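The column-by-column scan can be tried on tiny hand-made pictures. The sketch below is an assumption-laden stand-in: it represents each picture as a list of rows of (r, g, b) tuples instead of a Pillow image, and the function name first_differing_column is made up for the example.

```python
def first_differing_column(img1, img2, left=0, threshold=60):
    """Return the x of the first column (from `left`) where the pictures differ.

    img1 and img2 are lists of rows; each row is a list of (r, g, b) tuples.
    Returns None when no pixel differs by more than `threshold`.
    """
    height = len(img1)
    width = len(img1[0])
    for x in range(left, width):      # walk the columns left to right
        for y in range(height):       # walk down each column
            p1, p2 = img1[y][x], img2[y][x]
            if any(abs(a - b) > threshold for a, b in zip(p1, p2)):
                return x              # stop at the first difference
    return None


white, dark = (255, 255, 255), (40, 40, 40)
full = [[white] * 10 for _ in range(4)]  # 10x4 picture without a gap
gapped = [row[:] for row in full]        # copy, then paint a small "gap"
gapped[2][7] = dark
gapped[1][8] = dark
print(first_differing_column(full, gapped, left=3))  # 7
```

Returning as soon as a difference is found replaces the has_find flag, and returning None makes the "no gap found" case explicit.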
The first instance method:
pixel1=img1.load()[i,j]# Read the pixel to compare from img1
pixel2=img2.load()[i,j]# Read the pixel to compare from img2
load() gives access to the pixels, and [i,j] picks the pixel at that position
threshold=60
This is the allowed color difference.
RGB has three channels. Each channel is compared, and two pixels count as the same when no channel differs by more than 60.
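As a minimal sketch of the per-channel check, the function below compares two pixel tuples. The name pixels_match is made up; abs() is used so that a difference counts in both directions, and the slice [:3] is an assumption that ignores an alpha channel if the screenshot has one.

```python
def pixels_match(p1, p2, threshold=60):
    """True when every RGB channel differs by at most `threshold`."""
    # [:3] drops a fourth (alpha) value if the pixel is RGBA
    return all(abs(a - b) <= threshold for a, b in zip(p1[:3], p2[:3]))


print(pixels_match((200, 200, 200), (180, 210, 190)))  # True
# Dark gap pixel against a bright original: the signed difference is -160,
# which would pass a plain "<= 60" test, so abs() matters here.
print(pixels_match((40, 40, 40), (200, 200, 200)))     # False
```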
Now that we have the approximate distance to move, we have to deal with detection: to avoid being detected, we simulate the way a person slides.
Simulating a human sliding trajectory
The code is as follows:
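# Movement track
def trajectory(self,distance):
    # d=vt+1/2at**2 ==> displacement = initial velocity*time + 1/2*acceleration*time squared
    # v1=v+at
    # Idea: accelerate first, then accelerate harder
    # Fine-tune the measured distance
    distance-=6
    # Movement track
    track=[]
    # Distance moved so far
    current=0
    # Past this point the acceleration changes
    mid=distance*3/4
    # Time interval (acceleration time)
    t=0.1
    # Initial velocity
    v=0
    while current<distance:
        if(current<mid):
            a=random.randint(2,3)
        else:
            a=random.randint(4,5)
        v0=v
        v=v0+a*t
        move=v0*t+1/2*a*t**2
        current+=move
        track.append(round(move))
    return track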
We can borrow the displacement formula from physics:
d=vt+1/2at**2 ==> displacement = initial velocity*time + 1/2*acceleration*time squared
We choose the initial velocity, the time interval and the acceleration ourselves.
Idea: as long as the displacement so far is less than the distance to move, keep moving.
We also set a point past which the acceleration is increased.
Finally, we use the displacement formula to calculate the distance moved in each interval of t seconds, and collect these distances into a movement track.
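One detail worth checking is rounding: each step is rounded separately, so the steps do not have to add up to the requested distance. The sketch below is a variant rather than the method used in this article. It keeps the same accelerations and time interval but rounds the running position instead of each step, so the integer steps sum to the distance. The name build_track and the seed parameter are assumptions for the example, and distance is assumed to be a whole number.

```python
import random


def build_track(distance, t=0.1, seed=None):
    """Build integer steps that sum exactly to `distance` (a whole number)."""
    rng = random.Random(seed)  # seed only makes runs repeatable
    track = []
    current = 0.0              # exact position from the physics formula
    emitted = 0                # sum of the integer steps produced so far
    v = 0.0                    # initial velocity
    mid = distance * 3 / 4     # past this point the acceleration is raised
    while current < distance:
        a = rng.randint(2, 3) if current < mid else rng.randint(4, 5)
        move = v * t + 0.5 * a * t ** 2            # d = v*t + 1/2*a*t**2
        v = v + a * t                              # v1 = v + a*t
        current = min(current + move, distance)    # never overshoot
        step = round(current) - emitted            # carry the remainder forward
        if step:                                   # zero-length steps are dropped
            track.append(step)
            emitted += step
    return track


track = build_track(120, seed=1)
print(sum(track))  # 120
```

Dropping the zero-length steps shortens the track compared with the version above; whether that matters depends on how the steps are replayed.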
Moving the slider with selenium
The code is as follows:
def selenium(self,track):
    # Get past the JS check
    js = "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
    self.driver.execute_script(js)
    # Use an action chain
    ac=ActionChains(self.driver)
    # Locate the slider button
    div=self.driver.find_element_by_class_name("geetest_slider_button")
    # Press and hold
    ac.click_and_hold(div)
    # Move
    for i in track:
        ac.move_by_offset(xoffset=i,yoffset=0)
        time.sleep(0.1)
    # Release
    ac.release()
    # Perform the queued actions
    ac.perform()
If you have read the selenium article I wrote before, you will recognize the idea.
Let me explain step by step:
We use an action chain to operate the page.
Import:
from selenium.webdriver.common.action_chains import ActionChains
Create an action chain object:
ac=ActionChains(self.driver)
To operate:
# Locate
div=self.driver.find_element_by_class_name("geetest_slider_button")
# Press and hold
ac.click_and_hold(div)
# Move
for i in track:
    ac.move_by_offset(xoffset=i,yoffset=0)
    time.sleep(0.1)
# Release
ac.release()
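Note that ac.move_by_offset(xoffset=i,yoffset=0) moves by i pixels relative to the current position each time, not to an absolute coordinate. The actions are only queued on the chain until ac.perform() runs them.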
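Because every offset is relative, the position of the slider after each step is the running total of the track. A small sketch with a made-up track (the helper name positions_from_offsets is hypothetical):

```python
from itertools import accumulate


def positions_from_offsets(track, start=0):
    """Turn relative offsets into the absolute x position after each step."""
    return [start + p for p in accumulate(track)]


print(positions_from_offsets([1, 2, 3, 4]))  # [1, 3, 6, 10]
```

The last value is where the slider ends up, so it is the number to compare against the measured distance.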
The whole code is as follows:
from selenium import webdriver
import time
from PIL import Image
from io import BytesIO
import random
from selenium.webdriver.common.action_chains import ActionChains


class Radar(object):
    def __init__(self):
        self.url="https://www.geetest.com/demo/slide-float.html"
        # Create a browser
        self.driver=webdriver.Chrome()
        # Open the page
        self.driver.get(self.url)
        # Implicit wait
        self.driver.implicitly_wait(5)

    # Locate the verification button
    # Note: the find_element_by_* methods were removed in Selenium 4.3;
    # newer versions use find_element(By.XPATH, ...) instead
    def gps(self):
        return self.driver.find_element_by_xpath('//span[@class="geetest_radar_tip_content"]')

    # Click it
    def click1(self):
        self.gps().click()
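    # Modify the CSS style
    def get_css(self):
        # Wait a little so that the page can load
        time.sleep(2)
        # Run JavaScript to change the CSS (show the full picture, hiding the one with the gap)
        # canvas indexes start at 0, so [2] is the third canvas; clear its style
        self.driver.execute_script("document.querySelectorAll('canvas')[2].style=''")

    # Restore the picture with the gap
    def restore_img(self):
        # Wait a little so that the page can load
        time.sleep(2)
        # Show the picture with the gap again
        self.driver.execute_script("document.querySelectorAll('canvas')[2].style='display:none'")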
    # Crop the verification code out of a screenshot
    def crop_image(self,img_name):
        # The picture's size can be taken from the size of its div box
        img=self.driver.find_element_by_xpath('//div[@class="geetest_canvas_img geetest_absolute"]')
        # Maximize the window
        # self.driver.maximize_window()
        # Position (x,y) of the box's upper left corner
        img_left_up=img.location
        print(img_left_up)
        # Size of the div box (a dictionary)
        print(img.size)
        # Coordinates of the picture's lower right corner
        img_x,img_y=img_left_up["x"],img_left_up["y"]
        img_right_below_x,img_right_below_y=img_x + img.size["width"],img_y + img.size["height"]
        # Take a screenshot (binary)
        screen_shot=self.driver.get_screenshot_as_png()
        # Read the picture (binary data held in memory)
        by_io=BytesIO(screen_shot)
        # Open the picture
        op_img=Image.open(by_io)
        # Crop the picture; the box is a tuple (left, upper, right, lower)
        cap_img=op_img.crop((int(img_x),int(img_y),int(img_right_below_x),int(img_right_below_y)))
        # Save the picture to a file
        cap_img.save(img_name)
        return cap_img
    # Compare pixels
    def compare_pixel(self,img1,img2,i,j):
        pixel1=img1.load()[i,j]# Read the pixel at (i,j) from img1
        pixel2=img2.load()[i,j]# Read the pixel at (i,j) from img2
        # Allowed difference per channel
        threshold=60
        # The three RGB channels; abs() so that a difference in either direction counts
        if abs(pixel1[0]-pixel2[0])<=threshold and abs(pixel1[1]-pixel2[1])<=threshold and abs(pixel1[2]-pixel2[2])<=threshold:
            return True
        return False

    def pixel(self,img1,img2):
        left=60# Start from pixel column 60
        has_find=False # The gap has not been found yet
        # Size of the cropped picture, a tuple (width, height)
        print(img1.size)
        # Compare pixel by pixel
        for i in range(left,img1.size[0]):# width
            if has_find:
                break
            for j in range(img2.size[1]):# height
                if not self.compare_pixel(img1,img2,i,j):# compare pixel (i,j) of img1 with pixel (i,j) of img2
                    distance=i
                    print(distance)
                    has_find=True
                    # Stop as soon as a difference is found
                    break
        return distance
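    # Movement track
    def trajectory(self,distance):
        # d=vt+1/2at**2 ==> displacement = initial velocity*time + 1/2*acceleration*time squared
        # v1=v+at
        # Idea: accelerate first, then accelerate harder
        # Fine-tune the measured distance
        distance-=6
        # Movement track
        track=[]
        # Distance moved so far
        current=0
        # Past this point the acceleration changes
        mid=distance*3/4
        # Time interval (acceleration time)
        t=0.1
        # Initial velocity
        v=0
        while current<distance:
            if(current<mid):
                a=random.randint(2,3)
            else:
                a=random.randint(4,5)
            v0=v
            v=v0+a*t
            move=v0*t+1/2*a*t**2
            current+=move
            track.append(round(move))
        return track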
    def selenium(self,track):
        # Get past the JS check
        js = "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
        self.driver.execute_script(js)
        # Use an action chain
        ac=ActionChains(self.driver)
        # Locate the slider button
        div=self.driver.find_element_by_class_name("geetest_slider_button")
        # Press and hold
        ac.click_and_hold(div)
        # Move
        for i in track:
            ac.move_by_offset(xoffset=i,yoffset=0)
            time.sleep(0.1)
        # Release
        ac.release()
        # Perform the queued actions
        ac.perform()


def main(radar):
    radar.gps()
    radar.click1()
    # Capture the picture without the gap
    radar.get_css()
    img1=radar.crop_image("./截图.png")
    # Capture the picture with the gap
    radar.restore_img()
    img2=radar.crop_image("./截图1.png")
    # Get the distance
    distance=radar.pixel(img1,img2)
    # Simulate a human sliding trajectory
    track=radar.trajectory(distance)
    print(track)
    # Slide with selenium
    radar.selenium(track)


if __name__ == '__main__':
    radar=Radar()
    main(radar)
Summary: The most important part of solving a slider verification code is calculating the distance to move. Moving the slider with selenium is only a helper, so focus on the code that calculates the distance, because many anti-scraping systems can now detect selenium's fingerprints. I will write another article on slider verification later; if you are interested, come and take a look.