OCR labeling method

Although labeling is not the algorithm engineer's job, the labeling requirements are defined by the algorithm engineer and carried out by the labeling staff. If the labeled data is flawed, model training may fail to converge and cause many follow-on problems, so the labeling requirements are very important. They must be worked out and stated clearly at the start, to avoid repeated, wasted labeling work.

Next, let's talk about the specific labeling specifications:

Box specification

  1. Text on the same line is best labeled with a single box.

  2. Depending on the shape of the target, choose a horizontal, vertical, or oblique box as appropriate, and frame the text using the four-point drawing method.

  3. The box should fit the text closely, without too much empty space.

  4. Slanted or italic text should also be fitted as closely as possible.

Do not improvise a clever-looking box for curved text; that is a completely wrong way to label it.

Instead, make the box fit the text as closely as possible.

Labeling order

1. When the text has semantic meaning: mark the four points of the box in the order given by the meaning of the text.

2. When the text has no semantic meaning: mark the four points of the rectangle in reading order, that is, left to right for horizontal text, top to bottom for vertical text, and top to bottom for oblique text.
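The ordering rule for horizontal text can be checked mechanically. The following is a minimal sketch, assuming a common convention that the source does not state explicitly: image coordinates with y growing downward, and the four corners listed clockwise starting from the top-left corner. The function name `order_quad_points` is hypothetical. For vertical, oblique, or semantically ordered text, the starting corner would still have to follow the first character of the text.

```python
import math


def order_quad_points(points):
    """Order four (x, y) corners clockwise, starting from the top-left one.

    Assumption (not from the source): image coordinates are used,
    with x growing to the right and y growing downward.
    """
    if len(points) != 4:
        raise ValueError("expected exactly four points")
    cx = sum(x for x, _ in points) / 4.0
    cy = sum(y for _, y in points) / 4.0
    # With y pointing down, an increasing atan2 angle is clockwise on screen.
    by_angle = sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    # Start from the corner with the smallest x + y,
    # which is the top-left corner for an upright box.
    start = min(range(4), key=lambda i: by_angle[i][0] + by_angle[i][1])
    return by_angle[start:] + by_angle[:start]


box = [(10, 0), (0, 0), (0, 5), (10, 5)]
print(order_quad_points(box))
```

A script like this could flag boxes whose stored point order differs from the expected order, so they can be sent back for review rather than silently corrected.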

Blurred and deformed text

1. Blurred text must be discarded (do not guess blurred characters from the context).

2. As long as the overall outline of a character is clear, it qualifies for transcription even if some strokes are blurred.

3. Rules for replacing blurred or deformed characters with spaces

(1) If normal text contains 3 or more blurred or deformed characters (≥ 3), the entire line is discarded.

(2) A single run of blurred or deformed characters in the middle of normal text (one character, or 2 or 3 consecutive characters) is replaced by exactly one space.

(3) Blurred or discarded characters at the beginning or end of a sentence may be boxed and discarded separately, or replaced by a space.

(4) Other cases that may be treated as blurred or deformed characters and replaced by a space: characters made unrecognizable by overlapping, occlusion that does not conform to the transcription rules, missing text, and so on.
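The space-replacement rules can be sketched as a small function. This is an illustration under stated assumptions, not a definitive implementation. The source does not say whether the threshold of 3 counts individual characters or separate blurred spots; because rule (2) allows up to 3 consecutive characters to collapse into one space, this sketch counts separate spots (runs), which is an assumption. The names `collapse_blurred` and `max_blurred_runs` are hypothetical.

```python
from itertools import groupby


def collapse_blurred(chars, blurred, max_blurred_runs=2):
    """Return the transcription, or None if the whole line should be discarded.

    chars   -- the characters of the line, in reading order
    blurred -- one boolean per character, True if it is blurred or deformed
    Assumption: the discard threshold counts runs of blurred characters.
    """
    if len(chars) != len(blurred):
        raise ValueError("chars and blurred must have the same length")
    # Group consecutive characters that share the same blurred flag.
    runs = [
        (flag, "".join(c for c, _ in group))
        for flag, group in groupby(zip(chars, blurred), key=lambda pair: pair[1])
    ]
    if sum(1 for flag, _ in runs if flag) > max_blurred_runs:
        return None  # too many blurred spots: discard the entire line
    # Each blurred run becomes exactly one space, whatever its length.
    return "".join(" " if flag else text for flag, text in runs)


print(collapse_blurred("HELLO", [False, False, True, True, False]))
```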

Occluded text

1. Characters that are not occluded must be transcribed, and the occluded part replaced with a space; the entire line must not be discarded.

2. Occlusion width and boxing rules

(1) When the occluded gap is narrower than 3 characters (< 3), the unoccluded parts must be framed together in a single box for the whole line, as shown in Figure 2.

(2) When the occluded gap is 3 characters or wider (≥ 3), the unoccluded parts must be framed and transcribed in separate boxes.

(3) In pure English text, the gap is measured in units of the widest letter.

(4) In mixed Chinese and English text, the gap is measured in units of the widest Chinese character.
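The width rule above reduces to one comparison. Here is a minimal sketch, assuming the gap and the reference character are both measured in pixels along the reading direction; the function and parameter names are hypothetical. The reference width is the widest letter for pure English text and the widest Chinese character for mixed text.

```python
def should_split_box(gap_width_px, ref_char_width_px, threshold=3):
    """Return True if the unoccluded parts need separate boxes.

    gap_width_px      -- width of the occluded gap
    ref_char_width_px -- width of the widest letter (pure English text)
                         or the widest Chinese character (mixed text)
    """
    if ref_char_width_px <= 0:
        raise ValueError("reference character width must be positive")
    # A gap of at least `threshold` reference characters means separate boxes;
    # a narrower gap keeps the whole line in a single box.
    return gap_width_px / ref_char_width_px >= threshold


print(should_split_box(gap_width_px=50, ref_char_width_px=20))  # gap of 2.5 characters
print(should_split_box(gap_width_px=60, ref_char_width_px=20))  # gap of 3 characters
```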

3. Missing text

(1) Rules ① to ③ below apply both at the edge of the image and in the middle of the image.

(2) ① If at least 1/2 of the text remains and a person can objectively recognize it, frame and transcribe it as normal.

(3) If at least 1/2 remains but the text cannot be objectively recognized, or a simple character is missing a single stroke (a horizontal, a vertical, or a left-falling stroke), or the truncated text is ambiguous, it must be discarded.

(4) ② If less than 1/2 remains and the text is so small that a discard box is hard to draw, it need not be processed (it may be discarded); if less than 1/2 remains, but ...

(5) ③ If less than 1/5 remains, even larger text may be left unprocessed.

(6) ④ Transcription rules for occluded characters with a left-right or top-bottom structure:

The occlusion does not affect recognition of the whole character: the whole character must be transcribed.

The occlusion affects recognition of the whole character, and the uncovered half cannot be recognized as a character on its own: the whole character must be discarded.

The occlusion affects recognition of the whole character, the uncovered half can be recognized as a character on its own, but at least 1/5 of the covered half remains visible: the whole character must be discarded.

The occlusion affects recognition of the whole character, the uncovered half can be recognized as a character on its own, and less than 1/5 of the covered half remains visible: the recognizable half must be transcribed, and the rest may be discarded or left unprocessed.
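The four cases of rule ④ form a small decision table, which can be sketched as follows. The function name, parameter names, and return labels are hypothetical; the three inputs are judgments made by the annotator, not values computed from the image.

```python
from fractions import Fraction

ONE_FIFTH = Fraction(1, 5)


def decide_compound_char(affects_recognition, visible_half_is_char, covered_half_remaining):
    """Apply rule ④ to one left-right or top-bottom structured character.

    affects_recognition    -- True if the occlusion prevents reading the whole character
    visible_half_is_char   -- True if the uncovered half is a character on its own
    covered_half_remaining -- visible fraction of the covered half, from 0 to 1
    """
    if not affects_recognition:
        return "transcribe_whole"
    if not visible_half_is_char:
        return "discard_whole"
    if covered_half_remaining >= ONE_FIFTH:
        return "discard_whole"
    # Less than 1/5 of the covered half is visible: keep only the readable half.
    return "transcribe_half"


print(decide_compound_char(True, True, Fraction(1, 10)))
```

Using `Fraction` keeps the 1/5 boundary exact, so a value of exactly 1/5 falls on the discard side as the rule states.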

Origin blog.csdn.net/wangmengmeng99/article/details/129064255