Text-related image processing is an area with many practical applications right now. I read up on it from time to time and collect the links below for future reference. My overall impression is that Baidu's PaddleOCR is one of the better open-source OCR toolkits to come out of China. On the research side, new papers are published every year and are worth following to keep up with the state of the art.
Search for document image | Papers With Code
Automatic recognition of water meter readings, based on DB and CRNN methods
An algorithm for extracting tables from PDF to Excel is now open source, and this open-source project is worth millions
Major open-source release! Ping An Property & Casualty proposes TableMASTER: Table Recognition Master
DB: A New Network for Scene Text Detection | AAAI2020
TrOCR: A New Generation of Optical Character Recognition Based on Transformer
GitHub - open-mmlab/mmocr: OpenMMLab Text Detection, Recognition and Understanding Toolbox