Andrew Ng "Machine Learning" Course Summary (18): Photo OCR

18.1 Problem description and pipeline

(1) Photo OCR (optical character recognition) means recognizing the text in a given picture.

(2) The pipeline consists of:

1. Text detection

2. Character segmentation (nowadays explicit segmentation is often unnecessary)

3. Character classification

18.2 Sliding window

Take pedestrian detection as an example. First, train a classifier on fixed-size image patches to decide whether a patch contains a pedestrian. Then crop a patch of that size from the image and feed it to the classifier, shift the crop window and repeat until the whole image has been covered. Next, enlarge the crop window, scale each cropped patch back down to the classifier's input size before feeding it in, and so on for further window sizes.
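The sliding-window procedure can be sketched roughly as follows. This is a minimal pure-Python illustration, not the course's implementation: the names `sliding_windows`, `crop_and_resize`, and `detect`, the `step_frac` parameter, and the nearest-neighbour resizing are all assumptions made for the example, and `classify` stands for whatever trained patch classifier is available.

```python
from typing import Callable, Iterator, List, Tuple

Image = List[List[float]]  # grayscale image as rows of pixel values


def sliding_windows(height: int, width: int, win_h: int, win_w: int,
                    step: int) -> Iterator[Tuple[int, int]]:
    """Yield the top-left (row, col) of every window that fits in the image."""
    for top in range(0, height - win_h + 1, step):
        for left in range(0, width - win_w + 1, step):
            yield top, left


def crop_and_resize(image: Image, top: int, left: int, win_h: int, win_w: int,
                    out_h: int, out_w: int) -> Image:
    """Crop a win_h x win_w patch and resize it to out_h x out_w
    using nearest-neighbour sampling (an assumption; any resize would do)."""
    return [[image[top + r * win_h // out_h][left + c * win_w // out_w]
             for c in range(out_w)]
            for r in range(out_h)]


def detect(image: Image, classify: Callable[[Image], bool],
           base_h: int, base_w: int, scales=(1, 2),
           step_frac: float = 0.25) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, height, width) of every window the classifier accepts."""
    height, width = len(image), len(image[0])
    hits = []
    for scale in scales:
        # Larger scales use a larger window that is resized back to base size.
        win_h, win_w = base_h * scale, base_w * scale
        step = max(1, int(win_w * step_frac))
        for top, left in sliding_windows(height, width, win_h, win_w, step):
            patch = crop_and_resize(image, top, left, win_h, win_w,
                                    base_h, base_w)
            if classify(patch):
                hits.append((top, left, win_h, win_w))
    return hits
```

A smaller step gives finer localization at the cost of more classifier calls, which is the usual trade-off when choosing the stride.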

Sliding windows work the same way for text detection. First, a classifier separates character patches from non-character patches. The detected character regions are then expanded slightly so that neighbouring regions overlap and merge, and the merged regions are filtered by aspect ratio: regions that are taller than they are wide are discarded. The figure for this step shows the result.
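The expand, merge, and filter steps might look like the sketch below. It assumes the classifier output has already been turned into a binary mask (1 = "text here"); the function names, the square expansion neighbourhood, and the use of 4-connectivity for merging are illustrative choices, not something the course prescribes.

```python
from collections import deque
from typing import List, Tuple

Mask = List[List[int]]  # 1 where the classifier reported text, else 0


def expand(mask: Mask, radius: int = 1) -> Mask:
    """Mark every cell within `radius` rows and columns of a positive cell."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                for rr in range(max(0, r - radius), min(h, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(w, c + radius + 1)):
                        out[rr][cc] = 1
    return out


def bounding_boxes(mask: Mask) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, height, width) of each 4-connected positive region."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                # Breadth-first search over one connected region.
                seen[r][c] = True
                queue = deque([(r, c)])
                top, bottom, left, right = r, r, c, c
                while queue:
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom - top + 1, right - left + 1))
    return boxes


def text_regions(mask: Mask, radius: int = 1) -> List[Tuple[int, int, int, int]]:
    """Expand, merge overlapping regions, and keep boxes wider than tall."""
    return [box for box in bounding_boxes(expand(mask, radius))
            if box[3] > box[2]]
```

The aspect-ratio test here is the simplest possible one (width strictly greater than height); a real system would likely tune that threshold.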

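Next, the detected text is segmented into characters. A classifier is trained for this step as well: positive examples are patches whose centre falls on the gap between two characters, and negative examples are everything else. The figure for this step shows examples from the training set.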
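Segmentation can be viewed as a one-dimensional sliding window along the text line. The sketch below is only an illustration under that assumption: `find_splits` and `cut_characters` are hypothetical helper names, and `is_gap` stands for the trained "is the centre of this patch a gap between characters?" classifier.

```python
from typing import Callable, List

Image = List[List[float]]  # one text line, as rows of pixel values


def find_splits(line: Image, is_gap: Callable[[Image], bool],
                win_w: int, step: int = 1) -> List[int]:
    """Slide a window left to right; return centre columns flagged as gaps."""
    width = len(line[0])
    splits = []
    for left in range(0, width - win_w + 1, step):
        patch = [row[left:left + win_w] for row in line]
        if is_gap(patch):
            splits.append(left + win_w // 2)
    return splits


def cut_characters(line: Image, splits: List[int]) -> List[Image]:
    """Cut the line at the split columns and drop empty pieces.
    Runs of adjacent split columns are treated as a single gap."""
    width = len(line[0])
    pieces = []
    start = 0
    for col in sorted(splits) + [width]:
        if col > start:
            pieces.append([row[start:col] for row in line])
        start = col + 1  # the split column itself is discarded
    return pieces
```

Each piece returned by `cut_characters` would then be passed to the character classifier.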

Once the text has been split into single characters, a neural network, a support vector machine, or a logistic regression classifier can be trained to recognize them.

18.3 Getting lots of data and artificial data

(1) Download fonts from the Internet and paste the characters onto random backgrounds to create new examples from scratch;

(2) Apply rotation, distortion, blur, and similar transformations to existing data to generate new data;
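Both ideas can be illustrated with a small pure-Python sketch. It assumes a glyph is given as a grid where 1.0 marks ink, uses uniform random noise as a stand-in for a "random background", and shows only blur as the transformation; rotation and distortion would be added in the same way. All function names are hypothetical.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale, 0.0 = black, 1.0 = white


def random_background(h: int, w: int, rng: random.Random) -> Image:
    """Uniform noise as a crude stand-in for a real background crop."""
    return [[rng.random() for _ in range(w)] for _ in range(h)]


def paste_glyph(glyph: Image, background: Image) -> Image:
    """Overlay a glyph mask (1.0 = ink) so that ink pixels become dark."""
    return [[bg * (1.0 - g) for g, bg in zip(glyph_row, bg_row)]
            for glyph_row, bg_row in zip(glyph, background)]


def box_blur(image: Image) -> Image:
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            vals = [image[rr][cc]
                    for rr in range(max(0, r - 1), min(h, r + 2))
                    for cc in range(max(0, c - 1), min(w, c + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def synthesize(glyph: Image, n: int, seed: int = 0) -> List[Image]:
    """Create n synthetic examples of one glyph; about half are blurred."""
    rng = random.Random(seed)
    h, w = len(glyph), len(glyph[0])
    examples = []
    for _ in range(n):
        img = paste_glyph(glyph, random_background(h, w, rng))
        if rng.random() < 0.5:
            img = box_blur(img)
        examples.append(img)
    return examples
```

Every synthetic example keeps the label of the glyph it was built from, which is what makes this kind of data cheap to produce.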

Ways to get more data:

(1) Artificial data synthesis;

(2) Collecting and labelling data by hand;

(3) Crowdsourcing.

18.4 Ceiling analysis: which part of the pipeline to work on next

For the pipeline above, suppose the overall system is 72% accurate. If the text detection stage is replaced with hand-labelled, fully correct output and passed on to the remaining stages, system accuracy rises to 89%, which shows that text detection is worth working on.

The ceiling-analysis table lists, for each stage, how much overall accuracy would improve if that stage were made perfectly correct; the larger the gain, the more effort the stage deserves. In this example text detection deserves the most effort, followed by character recognition, while character segmentation is already done well.
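The arithmetic behind ceiling analysis is just a series of differences, as in the sketch below. The 72% baseline and the 89% figure come from the text; the 90% and 100% values for the later stages are assumptions chosen to match the ranking described here, and `ceiling_analysis` is a hypothetical helper name.

```python
from typing import List, Tuple


def ceiling_analysis(baseline: int,
                     stages: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """stages: (name, accuracy in % once this stage and all earlier stages
    are replaced by perfect output). Return (name, gain in percentage
    points) sorted by gain, largest first."""
    gains = []
    previous = baseline
    for name, accuracy in stages:
        gains.append((name, accuracy - previous))
        previous = accuracy
    return sorted(gains, key=lambda item: item[1], reverse=True)


stages = [
    ("text detection", 89),          # from the text
    ("character segmentation", 90),  # assumed value for illustration
    ("character recognition", 100),  # assumed: all stages perfect
]
ranking = ceiling_analysis(72, stages)
for name, gain in ranking:
    print(f"{name}: +{gain} points")
```

With these numbers, text detection offers a gain of 17 points, character recognition 10, and character segmentation only 1.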

Origin www.cnblogs.com/henuliulei/p/11290260.html