Personal notes on data processing (continuously updated)

These are just notes to myself, so please forgive any incoherence!

1. Data cleaning must be done before data processing

Collected data tends to have all kinds of problems in its format and labels. If you feed it straight into the algorithm, the program will go wrong. An explicit error is easy to handle: it gets reported and you fix it. The hidden errors are the ones to fear, because nothing tells you the training data is wrong, and the loss keeps decreasing anyway. A real trap!

For example: invalid labels (wrong or out of bounds), wrongly formatted pictures, and wrongly formatted xml (a file with no object gives an empty label list [], or the label name is wrong).
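A minimal sketch of this kind of check, assuming the annotations are Pascal VOC-style xml (a `<size>` element plus `<object>` entries, each with a `<name>` and a `<bndbox>`). The function name `check_annotation` and the `valid_labels` set are made up for illustration, and picture validation is not covered here.

```python
import xml.etree.ElementTree as ET


def check_annotation(xml_text, valid_labels):
    """Return a list of problems found in one VOC-style annotation string.

    Hypothetical helper: the xml layout (size/object/name/bndbox) is an
    assumption, not something the notes above specify.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"malformed xml: {exc}"]

    # Image size is needed to test whether boxes are out of bounds.
    size = root.find("size")
    try:
        width = int(size.findtext("width"))
        height = int(size.findtext("height"))
    except (AttributeError, TypeError, ValueError):
        return ["missing or invalid <size>"]

    problems = []
    objects = root.findall("object")
    if not objects:
        # This is the silent case: parsing succeeds but the label list is [].
        problems.append("no <object>: label list would be empty")

    for i, obj in enumerate(objects):
        name = obj.findtext("name")
        if name not in valid_labels:
            problems.append(f"object {i}: unknown label {name!r}")

        box = obj.find("bndbox")
        try:
            xmin, ymin, xmax, ymax = (
                int(float(box.findtext(tag)))
                for tag in ("xmin", "ymin", "xmax", "ymax")
            )
        except (AttributeError, TypeError, ValueError):
            problems.append(f"object {i}: missing or invalid <bndbox>")
            continue

        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            problems.append(f"object {i}: box out of bounds or degenerate")

    return problems
```

Running something like this over every annotation file before training, and printing any non-empty result together with the file name, turns the hidden errors into explicit ones.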

