pytesseract PSM option parameters - 代码天地

Category: Other. Posted 2018-10-30 20:35:34

Copyright notice: https://blog.csdn.net/qq_26877377/article/details/81775000

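I have recently been writing a crawler for *车之家 and ran into dynamic, distorted custom characters. My old approach of directly comparing the parts of each character that never change no longer works. After thinking about it for a long time, and not knowing much about manipulating glyphs, I decided to simply recognize the characters with OCR.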

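The problem: when calling pytesseract I had set the Chinese language option, but without a psm parameter nothing was recognized. After some searching I found the fix.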

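Add --psm 8, so that the whole image is handled as a single unit, which here is one Chinese character. Note that in the list below mode 8 is documented as "single word" and mode 10 as "single character", so --psm 10 is worth trying as well.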
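As a rough illustration of the fix above, here is a minimal sketch of calling pytesseract on an image that contains one glyph. The function names, the image path argument, and the choice of `chi_sim` (the Simplified Chinese traineddata) are my own assumptions; the post only says that a Chinese language option was used.

```python
def glyph_ocr_kwargs(psm=8, lang="chi_sim"):
    """Keyword arguments for pytesseract.image_to_string on a one-glyph image.

    lang="chi_sim" is an assumption; use whichever Chinese traineddata
    is installed on your machine.
    """
    return {"lang": lang, "config": f"--psm {psm}"}


def recognize_glyph(image_path, psm=8, lang="chi_sim"):
    """Hypothetical helper: OCR a single-glyph image and return the text."""
    # Imported lazily so glyph_ocr_kwargs can be used without the OCR stack.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        text = pytesseract.image_to_string(image, **glyph_ocr_kwargs(psm, lang))
    # Tesseract output usually ends with whitespace or a form feed.
    return text.strip()
```

Calling `recognize_glyph(path, psm=10)` switches to single-character mode without touching anything else, which makes it easy to compare the two modes on the same image.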

Page segmentation modes:
0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
11 Sparse text. Find as much text as possible in no particular order.
12 Sparse text with OSD.
13 Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
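Since it is not always obvious which mode suits a given image, one option is to run the same image through several modes and compare the output. The sketch below is only an illustration: `compare_psm_modes`, the candidate tuple, and the `ocr` callable are names I made up, and the set of candidate modes is a guess at what is useful for tiny single-glyph images.

```python
# Modes from the list above that plausibly fit a tiny, single-glyph image
# (an assumption, not a recommendation from Tesseract itself).
SINGLE_GLYPH_CANDIDATES = (7, 8, 10, 13)


def compare_psm_modes(ocr, image, modes=SINGLE_GLYPH_CANDIDATES, extra_config=""):
    """Run `ocr(image, config)` once per PSM mode and collect the results.

    `ocr` is any callable taking (image, config_string) and returning text,
    for example a thin wrapper around pytesseract.image_to_string.
    """
    results = {}
    for psm in modes:
        if not 0 <= psm <= 13:
            raise ValueError(f"psm must be between 0 and 13, got {psm}")
        config = f"--psm {psm} {extra_config}".strip()
        results[psm] = ocr(image, config).strip()
    return results


# Possible wiring with pytesseract (not executed here):
#   import pytesseract
#   ocr = lambda img, cfg: pytesseract.image_to_string(img, lang="chi_sim", config=cfg)
#   print(compare_psm_modes(ocr, image))
```

Passing the OCR call in as a parameter keeps the loop independent of pytesseract, so it can be tried with a stub before running real recognition.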

Here is a sample usage of image_to_string with multiple parameters.

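import pytesseract  # `image` is assumed to be an already loaded PIL image
# The old `boxes` argument is omitted: newer pytesseract releases no longer accept it.
target = pytesseract.image_to_string(image, lang='eng',
                                     config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')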
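The config string in the sample packs three settings into one argument: the page segmentation mode, the OCR engine mode, and a character whitelist passed with `-c`. A small builder like the one below can make that easier to vary. `build_tesseract_config` is a hypothetical helper, not part of pytesseract, and rejecting whitespace in the whitelist is a conservative choice of this sketch.

```python
def build_tesseract_config(psm, oem=None, whitelist=None):
    """Assemble a Tesseract config string such as '--psm 10 --oem 3 -c ...'."""
    if psm not in range(14):  # PSM modes 0..13, as listed above
        raise ValueError(f"psm must be between 0 and 13, got {psm!r}")
    parts = [f"--psm {psm}"]

    if oem is not None:
        if oem not in range(4):  # Tesseract defines OCR engine modes 0..3
            raise ValueError(f"oem must be between 0 and 3, got {oem!r}")
        parts.append(f"--oem {oem}")

    if whitelist:
        # Whitespace would break the value into separate command-line tokens.
        if any(ch.isspace() for ch in whitelist):
            raise ValueError("whitelist must not contain whitespace")
        parts.append(f"-c tessedit_char_whitelist={whitelist}")

    return " ".join(parts)


# Reproduces the config string used in the sample above.
print(build_tesseract_config(10, oem=3, whitelist="0123456789"))
```

The result can be passed straight to the `config` argument of `image_to_string`.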

Reposted from blog.csdn.net/qq_26877377/article/details/81775000
