Mega Collection: OCR and Text Detection Resources (Papers, Source Code, Demos, and More)

Author: handong1587
Source: GitHub
Link:
https://github.com/handong1587/handong1587.github.io/blob/master/_posts/deep_learning/2015-10-09-ocr.md


This article is long, so consider bookmarking it for later. The table of contents is as follows:
Papers
Text Detection
Text Recognition
Text Detection + Recognition
Breaking Captcha
Handwritten Recognition
Plate Recognition
Blogs
Projects
Videos
Resources


# Papers

Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

End-to-End Text Recognition with Convolutional Neural Networks

Word Spotting and Recognition with Embedded Attributes

Reading Text in the Wild with Convolutional Neural Networks

Deep structured output learning for unconstrained text recognition

  • intro: “propose an architecture consisting of a character sequence CNN and an N-gram encoding CNN which act on an input image in parallel and whose outputs are utilized along with a CRF model to recognize the text content present within the image.”
  • arxiv: http://arxiv.org/abs/1412.5903

Deep Features for Text Spotting

Reading Scene Text in Deep Convolutional Sequences

DeepFont: Identify Your Font from An Image

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images

End-to-End Interpretation of the French Street Name Signs Dataset

End-to-End Subtitle Detection and Recognition for Videos in East Asian Languages via CNN Ensemble with Near-Human-Level Performance

Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading

  • arxiv: https://arxiv.org/abs/1611.07385

Improving Text Proposals for Scene Images with Fully Convolutional Networks

  • intro: Universitat Autonoma de Barcelona (UAB) & University of Florence
  • intro: International Conference on Pattern Recognition (ICPR) - DLPR (Deep Learning for Pattern Recognition) workshop
  • arxiv: https://arxiv.org/abs/1702.05089

Scene Text Eraser

Attention-based Extraction of Structured Information from Street View Imagery

Implicit Language Model in LSTM for OCR

# Text Detection

Object Proposals for Text Extraction in the Wild

Text-Attentional Convolutional Neural Networks for Scene Text Detection

Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network

Synthetic Data for Text Localisation in Natural Images

Scene Text Detection via Holistic, Multi-Channel Prediction

Detecting Text in Natural Image with Connectionist Text Proposal Network

TextBoxes: A Fast Text Detector with a Single Deep Neural Network

TextBoxes++: A Single-Shot Oriented Scene Text Detector

Arbitrary-Oriented Scene Text Detection via Rotation Proposals

Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection

Detecting Oriented Text in Natural Images by Linking Segments

Deep Direct Regression for Multi-Oriented Scene Text Detection

Cascaded Segmentation-Detection Networks for Word-Level Text Spotting

https://arxiv.org/abs/1704.00834

Text-Detection-using-py-faster-rcnn-framework

SSD-text detection: Text Detector

R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection

R-PHOC: Segmentation-Free Word Spotting using CNN

Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks

EAST: An Efficient and Accurate Scene Text Detector

Deep Scene Text Detection with Connected Component Proposals

Single Shot Text Detector with Regional Attention

Fused Text Segmentation Networks for Multi-oriented Scene Text Detection

https://arxiv.org/abs/1709.03272

Deep Residual Text Detection Network for Scene Text

  • intro: IAPR International Conference on Document Analysis and Recognition (ICDAR) 2017. Samsung R&D Institute of China, Beijing
  • arxiv: https://arxiv.org/abs/1711.04147

Feature Enhancement Network: A Refined Scene Text Detector

ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene

https://arxiv.org/abs/1711.11249

Detecting Curve Text in the Wild: New Dataset and New Solution

FOTS: Fast Oriented Text Spotting with a Unified Network

https://arxiv.org/abs/1801.01671

PixelLink: Detecting Scene Text via Instance Segmentation

Sliding Line Point Regression for Shape Robust Scene Text Detection

https://arxiv.org/abs/1801.09969

Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation

Single Shot TextSpotter with Explicit Alignment and Attention

Rotation-Sensitive Regression for Oriented Scene Text Detection

Detecting Multi-Oriented Text with Corner-based Region Proposals

An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches

https://arxiv.org/abs/1804.09003

IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection

Boosting up Scene Text Detectors with Guided CNN

https://arxiv.org/abs/1805.04132

Shape Robust Text Detection with Progressive Scale Expansion Network

A Single Shot Text Detector with Scale-adaptive Anchors

https://arxiv.org/abs/1807.01884

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes

Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping

TextContourNet: a Flexible and Effective Framework for Improving Scene Text Detection Architecture with a Multi-task Cascade

https://arxiv.org/abs/1809.03050

Correlation Propagation Networks for Scene Text Detection

https://arxiv.org/abs/1810.00304

Scene Text Detection with Supervised Pyramid Context Network

Improving Rotated Text Detection with Rotation Region Proposal Networks

https://arxiv.org/abs/1811.07031

Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks

https://arxiv.org/abs/1811.07432

Mask R-CNN with Pyramid Attention Network for Scene Text Detection

TextField: Learning A Deep Direction Field for Irregular Scene Text Detection

Detecting Text in the Wild with Deep Character Embedding Network


# Text Recognition

Sequence to sequence learning for unconstrained scene text recognition

Drawing and Recognizing Chinese Characters with Recurrent Neural Network

Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition

Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition

Visual attention models for scene text recognition

https://arxiv.org/abs/1706.01487

Focusing Attention: Towards Accurate Text Recognition in Natural Images

Scene Text Recognition with Sliding Convolutional Character Models

https://arxiv.org/abs/1709.01727

AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition

https://arxiv.org/abs/1710.03425

A New Hybrid-parameter Recurrent Neural Networks for Online Handwritten Chinese Character Recognition

https://arxiv.org/abs/1711.02809

AON: Towards Arbitrarily-Oriented Text Recognition

Arbitrarily-Oriented Text Recognition

SEE: Towards Semi-Supervised End-to-End Scene Text Recognition

https://arxiv.org/abs/1712.05404

Edit Probability for Scene Text Recognition

SCAN: Sliding Convolutional Attention Network for Scene Text Recognition

https://arxiv.org/abs/1806.00578

Adaptive Adversarial Attack on Scene Text Recognition

ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification

https://arxiv.org/abs/1812.05824


# Text Detection + Recognition

STN-OCR: A single Neural Network for Text Detection and Text Recognition

Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework

FOTS: Fast Oriented Text Spotting with a Unified Network

https://arxiv.org/abs/1801.01671

Single Shot TextSpotter with Explicit Alignment and Attention

An end-to-end TextSpotter with Explicit Alignment and Attention

Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes

Scene Text Detection and Recognition: The Deep Learning Era

A Novel Integrated Framework for Learning both Text Detection and Recognition


# Breaking Captcha

Using deep learning to break a Captcha system

Breaking reddit captcha with 96% accuracy

I’m not a human: Breaking the Google reCAPTCHA

Neural Net CAPTCHA Cracker

Recurrent neural networks for decoding CAPTCHAS

Reading irctc captchas with 95% accuracy using deep learning

End-to-End OCR: A CNN-Based Implementation (端到端的OCR:基于CNN的实现)

I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs

SimGAN-Captcha


# Handwritten Recognition

High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps

Recognize your handwritten numbers

https://medium.com/@o.kroeger/recognize-your-handwritten-numbers-3f007cbe46ff#.jllz62xgu

Handwritten Digit Recognition using Convolutional Neural Networks in Python with Keras

MNIST Handwritten Digit Classifier

How to Recognize a Handwritten Digit Dataset with a Convolutional Neural Network (CNN)? (如何用卷积神经网络CNN识别手写数字集?)

LeNet – Convolutional Neural Network in Python

Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention

MLPaint: the Real-Time Handwritten Digit Recognizer

Training a Computer to Recognize Your Handwriting

https://medium.com/@annalyzin/training-a-computer-to-recognize-your-handwriting-24b808fb584#.gd4pb9jk2

Using TensorFlow to create your own handwriting recognition engine

Building a Deep Handwritten Digits Classifier using Microsoft Cognitive Toolkit

Hand Writing Recognition Using Convolutional Neural Networks

Design of a Very Compact CNN Classifier for Online Handwritten Chinese Character Recognition Using DropWeight and Global Pooling

Handwritten digit string recognition by combination of residual network and RNN-CTC

https://arxiv.org/abs/1710.03112


# Plate Recognition

Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs

Number plate recognition with Tensorflow

end-to-end-for-plate-recognition

Segmentation-free Vehicle License Plate Recognition using ConvNet-RNN

  • intro: International Workshop on Advanced Image Technology, January, 8-10, 2017. Penang, Malaysia. Proceeding IWAIT2017
  • arxiv: https://arxiv.org/abs/1701.06439

License Plate Detection and Recognition Using Deeply Learned Convolutional Neural Networks

Adversarial Generation of Training Examples for Vehicle License Plate Recognition

https://arxiv.org/abs/1707.03124

Towards End-to-End Car License Plates Detection and Recognition with Deep Neural Networks

Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline

High Accuracy Chinese Plate Recognition Framework

LPRNet: License Plate Recognition via Deep Neural Networks

  • intro: Intel IOTG Computer Vision Group
  • intro: works in real-time with recognition accuracy up to 95% for Chinese license plates: 3 ms/plate on NVIDIA® GeForce™ GTX 1080 and 1.3 ms/plate on Intel® Core™ i7-6700K CPU.
  • arxiv: https://arxiv.org/abs/1806.10447

How many labeled license plates are needed?


# Blogs

Applying OCR Technology for Receipt Recognition

Hacking MNIST in 30 lines of Python

Optical Character Recognition Using One-Shot Learning, RNN, and TensorFlow

https://blog.altoros.com/optical-character-recognition-using-one-shot-learning-rnn-and-tensorflow.html

Creating a Modern OCR Pipeline Using Computer Vision and Deep Learning

https://blogs.dropbox.com/tech/2017/04/creating-a-modern-ocr-pipeline-using-computer-vision-and-deep-learning/


# Projects

ocropy: Python-based tools for document analysis and OCR

Extracting text from an image using Ocropus

CLSTM : A small C++ implementation of LSTM networks, focused on OCR

OCR text recognition using tensorflow with attention

Digit Recognition via CNN: digital meter numbers detection

Attention-OCR: Visual Attention based OCR

umaru: An OCR-system based on torch using the technique of LSTM/GRU-RNN, CTC and referred to the works of rnnlib and clstm

Tesseract.js: Pure Javascript OCR for 62 Languages

DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet (With CaffeModel)

deep ocr: make a better chinese character recognition OCR than tesseract

https://github.com/JinpengLI/deep_ocr

Practical Deep OCR for scene text using CTPN + CRNN

https://github.com/AKSHAYUBHAT/DeepVideoAnalytics/blob/master/notebooks/OCR/readme.md

Tensorflow-based CNN+LSTM trained with CTC-loss for OCR

https://github.com/weinman/cnn_lstm_ctc_ocr

SSD_scene-text-detection


# Videos

LSTMs for OCR

# Resources

Deep Learning for OCR

https://github.com/hs105/Deep-Learning-for-OCR

Scene Text Localization & Recognition Resources

awesome-ocr: A curated list of promising OCR resources

https://github.com/wanghaisheng/awesome-ocr


Reposted from blog.csdn.net/Extremevision/article/details/86362066