How to restore an image to its upright position? A collection of image rotation correction methods, plus Rotate-Captcha-Crack, the RotNet-based deep learning rotation captcha recognition model by the author of ddddocr

Image rotation correction model based on edge detection:

This approach first runs an edge detection algorithm on the image, then finds straight lines along the edges and computes their angle. Finally, the image is rotated by that angle to correct its orientation.

import cv2
import numpy as np

# Load the image
img = cv2.imread('skewed_image.jpg')

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Find edges with the Canny operator
edges = cv2.Canny(gray, 50, 150, apertureSize=3)

# Detect lines and estimate the rotation angle
# (HoughLines returns theta in radians; average it and convert to degrees)
lines = cv2.HoughLines(edges, 1, np.pi/180, 100)
angle = np.mean(lines[:, 0, 1]) * 180 / np.pi - 90

# Rotate the image to correct it
(rows, cols) = img.shape[:2]
center = (cols / 2, rows / 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
result = cv2.warpAffine(img, M, (cols, rows), flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

# Display the images
cv2.imshow('Original', img)
cv2.imshow('Corrected', result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Image rotation correction model based on Hough transform:

This approach uses the Hough transform to detect straight lines in the image, computes the angle of the dominant line, and then uses a rotation matrix to rotate the image back to the correct angle.

import cv2
import numpy as np

# Load the image
img = cv2.imread('skewed_image.jpg')

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Find edges with the Canny operator
edges = cv2.Canny(gray, 50, 150, apertureSize=3)

# Run the Hough transform to detect lines
lines = cv2.HoughLines(edges, 1, np.pi/180, 200)

# Find the longest line
max_len = 0
for line in lines:
    rho, theta = line[0]
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a * rho
    y0 = b * rho
    x1 = int(x0 + 1000 * (-b))
    y1 = int(y0 + 1000 * (a))
    x2 = int(x0 - 1000 * (-b))
    y2 = int(y0 - 1000 * (a))
    length = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)
    if length > max_len:
        max_len = length
        longest_line = [x1, y1, x2, y2]

# Compute the rotation angle of the longest line
dx = longest_line[2] - longest_line[0]
dy = longest_line[3] - longest_line[1]
angle = np.degrees(np.arctan2(dy, dx))

# Rotate the image to correct it
rows, cols = img.shape[:2]
rotation_matrix = cv2.getRotationMatrix2D((cols/2, rows/2), angle, 1)
result = cv2.warpAffine(img, rotation_matrix, (cols, rows), flags=cv2.INTER_CUBIC)

# Display the images
cv2.imshow('Original Image', img)
cv2.imshow('Rotated Image', result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Image rotation correction model based on template matching:

This approach uses local image features to find the optimal rotation angle. The main idea is to match the image against a predefined template under rotation and take the angle that minimizes the matching difference as the correction angle.

import cv2
import numpy as np

# Load the image and the template
img = cv2.imread('skewed_image.jpg')
template = cv2.imread('template.jpg', 0)

# Build a rotation matrix for the template
(h, w) = template.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, 45, 1.0)

# Rotate the template and compute its SIFT keypoints
template = cv2.warpAffine(template, M, (w, h))
sift = cv2.SIFT_create()
(kps, descs) = sift.detectAndCompute(template, None)

# Compute SIFT keypoints for the image and match descriptors
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(kps2, descs2) = sift.detectAndCompute(gray, None)
bf = cv2.BFMatcher()
matches = bf.match(descs, descs2)

# Estimate the rotation angle from the best matches via a homography
matches = sorted(matches, key=lambda x:x.distance)
src_pts = np.float32([kps[m.queryIdx].pt for m in matches])
dst_pts = np.float32([kps2[m.trainIdx].pt for m in matches])
M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
theta = -np.degrees(np.arctan2(M[0, 1], M[0, 0]))

# Rotate the image to correct it
(rows, cols) = img.shape[:2]
center = (cols / 2, rows / 2)
M = cv2.getRotationMatrix2D(center, theta, 1.0)
result = cv2.warpAffine(img, M, (cols, rows), flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

# Display the images
cv2.imshow('Original', img)
cv2.imshow('Corrected', result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Image rotation correction model based on feature point matching:

This approach first uses a feature point detection algorithm to find keypoints in the image and compute their descriptors. The rotation angle is then estimated by matching feature points between the two images based on descriptor distance, and the image is corrected accordingly.

import cv2
import numpy as np

# Load the image
img = cv2.imread('skewed_image.jpg')

# Detect keypoints in the image with SIFT
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img,None)

# Rotate the image and detect keypoints in the same way
[h, w] = img.shape[:2]
M = cv2.getRotationMatrix2D((w/2,h/2),45,1.0)
img_rot = cv2.warpAffine(img,M,(w,h))
kp2, des2 = sift.detectAndCompute(img_rot,None)

# Match the keypoints and compute the rotation angle
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1, des2, k=2)
good = []
for m, n in matches:
    if m.distance < 0.5 * n.distance:
        good.append(m)
src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
theta = -np.degrees(np.arctan2(M[0, 1], M[0, 0]))

# Rotate the image to correct it
(rows, cols) = img.shape[:2]
center = (cols / 2, rows / 2)
M = cv2.getRotationMatrix2D(center, theta, 1.0)
result = cv2.warpAffine(img, M, (cols, rows), flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

# Display the images
cv2.imshow('Original', img)
cv2.imshow('Corrected', result)
cv2.waitKey(0)
cv2.destroyAllWindows()

The latter two examples are implemented using OpenCV functions and libraries. The template matching approach rotates the image to the angle that best matches a template, while the feature point matching approach detects keypoint differences between the image and a rotated copy of it and estimates the best rotation angle for correction. Both methods can correct image rotation to some extent, but note that if the rotation angle is very large or there is severe non-linear distortion, their effectiveness may drop. Therefore, when selecting an image rotation correction method, consider the specific application scenario as well as factors such as the degree of rotation and the required correction accuracy.
All four models above can be used for image rotation correction, but each has different strengths and weaknesses, so the appropriate model should be chosen according to the specific application scenario.


Rotate-Captcha-Crack: the rotation captcha recognition model by the author of ddddocr

Rotate-Captcha-Crack

https://github.com/Starry-OvO/rotate-captcha-crack


A CNN predicts the rotation angle of a picture, which can be used to crack Baidu's rotation captcha

Test results:

(see the test_result image in the repository)

This repository implements three types of models:

Name          Backbone            Loss function             Cross-domain test error (lower is better)   Size (MB)
RotNet        ResNet50            cross entropy             1.1548°                                      92.7
RotNetR       RegNetY 3.2GFLOPs   cross entropy             1.2825°                                      69.8
RCCNet_v0_5   RegNetY 3.2GFLOPs   MSE + cosine correction   42.7774°                                     68.7

RotNet is a PyTorch implementation of d4nst/RotNet. RotNetR only replaces the backbone of RotNet and reduces the number of classes to 180; its average prediction error after training for 64 epochs (about 2 hours) on the Google Street View dataset is 1.2825°. The current RCCNet_v0_5 performs poorly, so RotNetR is recommended.
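To make the classification idea behind RotNet concrete, here is a minimal PyTorch sketch, not the repository's actual code: the ResNet50 backbone, the cross-entropy loss, and the 180 angle classes follow the table above, while the input size, the training step, and all variable names are assumptions for illustration.

import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 180  # one class per 2-degree bucket, matching RotNetR in the table

class AngleClassifier(nn.Module):
    """Predict the rotation angle of an image as a classification problem."""
    def __init__(self, num_classes: int = NUM_CLASSES):
        super().__init__()
        backbone = models.resnet50(weights=None)  # ResNet50 backbone, as in RotNet
        backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)
        self.backbone = backbone

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.backbone(x)  # logits over the rotation buckets

model = AngleClassifier()
criterion = nn.CrossEntropyLoss()  # cross-entropy loss, as listed in the table

# Dummy training step: images rotated by a known angle, labels are bucket indices
images = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (4,))
loss = criterion(model(images), labels)
loss.backward()

# At inference time, the predicted bucket is mapped back to degrees
pred_deg = model(images).argmax(dim=1).float() * (360.0 / NUM_CLASSES)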

The cross-domain test uses Google Street View / Landscape-Dataset as the training set and Baidu captcha images as the test set (special thanks to @xiangbei1997)

The Baidu captcha image used in the demo comes from RotateCaptchaBreak

Try out the existing models

Prepare the environment

  • A GPU supporting CUDA 10+ (for training, at least 4 GB of video memory is required)

  • Make sure your Python version is >= 3.8 and < 3.11

  • Make sure your PyTorch version is >= 1.11

  • Pull code and install dependencies

git clone --depth=1 https://github.com/Starry-OvO/rotate-captcha-crack.git
cd ./rotate-captcha-crack
pip install .

Be careful not to leave out the . that follows install.

  • Alternatively, use a virtual environment
git clone --depth=1 https://github.com/Starry-OvO/rotate-captcha-crack.git
python -m venv ./rotate-captcha-crack --system-site-packages
cd ./rotate-captcha-crack
# Pick the activation script matching your shell type, e.g. ./Scripts/Activate.ps1
python -m pip install -U pip
pip install .

Download the pretrained model

Download the archive from the Releases page and extract it into the ./models folder

The resulting directory structure looks like ./models/RCCNet_v0_5/230228_20_07_25_000/best.pth

This project is still in the beta stage and model names change frequently. If you run into a FileNotFoundError, please try using git to roll back to the corresponding tag first.

Enter a captcha image and see the rotation

If your system does not have a GUI, try changing the debug method from displaying images to saving images

python test_captcha.py

Use the HTTP server

  • Install additional dependencies
pip install aiohttp httpx[cli]
  • run server
python server.py
  • Open another command line window to send images
 httpx -m POST http://127.0.0.1:4396 -f img ./test.jpg
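Equivalently, the image can be posted from Python. The sketch below assumes only the multipart field name img and the port 4396 that appear in the command above; the response format depends on server.py, so it is printed as-is.

import httpx

# Post a captcha image to the local server; the field name "img" and port 4396
# mirror the httpx CLI command shown above.
with open('./test.jpg', 'rb') as f:
    resp = httpx.post('http://127.0.0.1:4396', files={'img': f})

# The response format depends on server.py, so just print whatever comes back.
print(resp.status_code)
print(resp.text)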

Train a new model

Prepare dataset

  • I used Google Street View and Landscape-Dataset directly here. You can also collect some landscape photos yourself and put them in a folder; there is no size requirement for the images

  • In train.py, set the dataset_root variable to point to the folder containing the images

  • No manual labeling is needed; the dataset automatically performs rectangular cropping, scaling and rotation while reading each picture (a sketch of this on-the-fly labeling idea follows this list)
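For illustration only (this is not the repository's dataset code), a minimal sketch of such an on-the-fly rotation dataset might look as follows; the class name, folder layout, crop size, and the 180 angle buckets are assumptions.

import random
from pathlib import Path

import torch
from torch.utils.data import Dataset
from torchvision import transforms
from torchvision.transforms import functional as TF
from PIL import Image

class RotationDataset(Dataset):
    """Load landscape photos and label each one with a random rotation on the fly."""
    def __init__(self, root: str, size: int = 224, num_classes: int = 180):
        self.paths = sorted(Path(root).glob('*.jpg'))
        self.num_classes = num_classes
        self.crop_and_tensor = transforms.Compose([
            transforms.CenterCrop(size),  # crop to a fixed square
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = Image.open(self.paths[idx]).convert('RGB')
        img = TF.resize(img, 256)                    # scale down first
        label = random.randrange(self.num_classes)   # pick a rotation bucket
        angle = label * (360.0 / self.num_classes)   # bucket index -> degrees
        img = TF.rotate(img, angle)                  # rotate the picture
        return self.crop_and_tensor(img), torch.tensor(label)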

Train

python train_RotNetR.py

Validate the model on the test set

python test_RotNetR.py

Related articles

52pojie (吾爱破解) - A brief discussion of the attack and defense of rotation captchas


Origin blog.csdn.net/weixin_45934622/article/details/130381325