Building a Salient Object Detection model with the Keras framework: a record of pitfalls

Copyright: https://blog.csdn.net/Dorothy_Xue/article/details/89006358

With a direction set, time to get to work!

The next stretch of this road will take a long time: starting from scratch, reading papers, writing code, and recording every pit I fall into so I don't fall in again.

To be continually updated...


1. Things to sort out before writing code:

As a beginner in CV I've made up my mind: use the Keras framework with a TensorFlow backend, and PyCharm as the IDE. It's the best; no arguments accepted, haha.

Let me also plug the book "Deep Learning with Python" and the Keras documentation (Chinese version: https://keras-cn.readthedocs.io/en/latest/ ). Together they are pretty much enough to build a simple model without much head-scratching.

Also, if you intend to do research for a long time, I recommend setting up a separate Python environment for each project, so that when your projects pile up their environments don't conflict. I use virtualenv, for example; I wrote a dead-simple tutorial for it myself: https://blog.csdn.net/Dorothy_Xue/article/details/84111775 . You can also set this up in PyCharm when creating a project, in the interpreter settings.

Read more papers, accumulate knowledge, and get a feel for where the field currently stands; at the very least, get familiar with the recent work that is attracting the most attention.


2. The Long March of coding pitfalls begins (manually smashes a giant Enter key)

1. The first problem I ran into is the dataset. There are lots of datasets out there, and I grabbed a pile of them. Here's the thing: once all those datasets are downloaded, every one of them organizes its folders differently. See for yourself:

(Screenshots of several datasets' folder layouts omitted; no two are organized alike.)
In short, they're inconsistent, and you need to bring them into a uniform layout. Splitting them by hand, one folder at a time, is unrealistic; this is where your coding skills come in, and if you're out of practice, a small splitting script is quick to write anyway. What should the result look like? Something like this:

  • train_img, train_gt
  • validation_img, validation_gt
  • test_img, test_gt

Clean and tidy, and easy for the model to load the data from. Nice.
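A splitting script along those lines can be sketched as follows. It is only a sketch under assumptions: images and their GT maps share file names, `img_dir`/`gt_dir`/`out_dir` are hypothetical paths, and the 8:1:1 split ratio is my own choice.

```python
import os
import random
import shutil

def split_dataset(img_dir, gt_dir, out_dir, ratios=(0.8, 0.1, 0.1), seed=42):
    """Copy image/GT pairs into train/validation/test folders."""
    names = sorted(os.listdir(img_dir))
    random.Random(seed).shuffle(names)  # fixed seed -> reproducible split
    n_train = int(len(names) * ratios[0])
    n_val = int(len(names) * ratios[1])
    splits = {
        "train": names[:n_train],
        "validation": names[n_train:n_train + n_val],
        "test": names[n_train + n_val:],
    }
    for split, files in splits.items():
        img_out = os.path.join(out_dir, split + "_img")
        gt_out = os.path.join(out_dir, split + "_gt")
        os.makedirs(img_out, exist_ok=True)
        os.makedirs(gt_out, exist_ok=True)
        for name in files:
            shutil.copy(os.path.join(img_dir, name), img_out)
            shutil.copy(os.path.join(gt_dir, name), gt_out)
    return {k: len(v) for k, v in splits.items()}
```

Shuffling with a fixed seed keeps the split reproducible across runs, so the same images always land in the same folders.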


2. Building the model by following the book. It all makes sense while you're reading, but as soon as you adapt it to your own task everything goes wrong: head-splitting pain, pits everywhere, lots of things you never noticed. For example, the model's input. Following the book, you just pass in train_data and train_labels, because the Keras folks have already prepared those datasets for you; what you really need to know is that the model expects its input tensors as NumPy arrays (mark this as a key point: if you've never fed a model your own data you may not realize it, and it's easy to overlook in the book; see the next item).

Also! Before the data is fed to the model for training, remember: normalize it! (A pit this beginner stepped right into.) Image data reads in with pixel values in the 0-255 range, and the GT values are 0 and 255, but to make the model easier to train and speed up convergence you should normalize. Otherwise, take my un-normalized model as a warning: the loss came out negative, in the negative hundreds, and the accuracy was abysmal, barely above zero. So inspect your data before training! Dividing by 255 to map everything into [0, 1] is enough. Easy.
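A minimal sketch of that preprocessing step (the function name `preprocess` and the float32 cast are my own choices):

```python
import numpy as np

def preprocess(img, gt):
    """Scale 0-255 pixel data into [0, 1] before it reaches the model."""
    img = img.astype("float32") / 255.0  # images: 0-255 -> [0, 1]
    gt = gt.astype("float32") / 255.0    # GT: 0/255 -> 0/1
    return img, gt
```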


3. On optimizers: every book pushes RMSprop. It's good, but a senior labmate told me that RMSprop doesn't seem to be used for image saliency detection; it performs better in models with LSTM-like structures. For saliency images, Adam is fine.


4. About Conv2DTranspose (deconvolution): the kernel size should be divisible by the stride! Otherwise you're prone to checkerboard artifacts. Here's a link for reading up on the checkerboard effect: https://blog.csdn.net/Dorothy_Xue/article/details/79844990 . Since I haven't studied the effect formally, my understanding is just enough to build a simple model, and I haven't analyzed the cause carefully, but papers I've read do mention this caveat.
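The rule of thumb boils down to a one-line divisibility check; a hypothetical helper (the Keras layer settings in the comments are illustrative only):

```python
def deconv_avoids_checkerboard(kernel_size, stride):
    """True if the transposed convolution's kernel tiles the output evenly,
    i.e. kernel_size is divisible by stride (uneven overlap -> checkerboard)."""
    return kernel_size % stride == 0

# e.g. Conv2DTranspose(64, kernel_size=4, strides=2): 4 % 2 == 0, safe
#      Conv2DTranspose(64, kernel_size=3, strides=2): 3 % 2 != 0, checkerboard risk
```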


5. Sometimes the code stalls mid-run and the machine grinds to a halt. What happened? The memory blew up! When training on the GPU, I figure it's best to monitor GPU usage, which you can do from the terminal:

$ watch -n 1 nvidia-smi

I set it to refresh the memory usage display once per second; if you find that interval too short, just change the number in the command. (While the model is running, utilization sits at 100%.)


6. Sometimes there's so much data that loading it all into memory at once causes a crash, so you may have to train in chunks. In that case, when training on the current chunk of the dataset, you need to load the weight parameters saved after the previous chunk. How? How do you implement this resume-from-checkpoint training?

With ModelCheckpoint.

First, a look at ModelCheckpoint:

keras.callbacks.ModelCheckpoint( filepath,
    monitor='val_loss',
    verbose=0,
    save_best_only=False,
    save_weights_only=False,
    mode='auto',
    period=1
)

Its parameters are as follows:

  1. filepath: string, the path where the model is saved
  2. monitor: the quantity to monitor, e.g. val_acc or val_loss
  3. verbose: verbosity mode, 0 = silent, 1 = print a message
  4. save_best_only: if True, only save the model that performs best on the validation set
  5. save_weights_only: if True, save only the model's weights; otherwise save the whole model (structure, configuration, and so on)
  6. mode: one of "auto", "min", "max"; with save_best_only=True it decides the criterion for the best model. When monitoring val_acc, mode should be max; when monitoring val_loss, mode should be min. In auto mode the direction is inferred automatically from the name of the monitored quantity.
  7. period: number of epochs between checkpoints

Next, the code to use it:

checkpoint = ModelCheckpoint('{}'.format(model_save_path) + '/{val_loss:.4f}.h5', verbose=1)

history = m.fit(train_img, train_gt, validation_data=(validation_img, validation_gt),
                    epochs=10, batch_size=1, verbose=1, callbacks=[checkpoint])

8. The Keras model here processes color images in BGR channel order (the order OpenCV reads them in).

Some bonus trivia about the various image libraries:

  • Except for OpenCV, which stores the color images it reads in BGR order, every other image library reads color images in and stores them in RGB order
  • Except for PIL, which reads a picture in as an Image-class object, the other libraries read pictures in as NumPy arrays
  • As for performance among the major image libraries, OpenCV is the big brother: in both speed and breadth of image operations it crushes the rest; it is, after all, a massive dedicated CV library
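Because of that first point, it's worth knowing that flipping between BGR and RGB is just a reversal of the channel axis (a NumPy one-liner; `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` does the same thing):

```python
import numpy as np

def bgr_to_rgb(img):
    """Reverse the last (channel) axis: OpenCV's BGR -> the usual RGB."""
    return img[..., ::-1]
```

The same function converts in both directions, since reversing twice restores the original order.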

9. Model.predict() in the test phase: predict one image at a time, with shape 1 * 3 * 224 * 224 (224 * 224 is my own setting). The whole test set can't be thrown at predict() in one go; write a loop and feed the images through one by one, in order. And remember to subtract the mean from the image data when loading it.
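A sketch of that per-image loop. Assumptions: channels-last layout (the 1 * 3 * 224 * 224 above is channels-first; adjust to your data format), a made-up `MEAN_PIXEL`, and the Keras calls shown only as comments:

```python
import numpy as np

# Assumed per-channel mean; substitute the mean computed on your training set.
MEAN_PIXEL = np.array([104.0, 117.0, 123.0], dtype="float32")

def prepare_one(img):
    """Turn one HxWx3 image into a 1xHxWx3 batch with the mean subtracted."""
    x = img.astype("float32") - MEAN_PIXEL
    return x[np.newaxis, ...]  # predict() expects a leading batch axis

# for img in test_imgs:                  # one image at a time,
#     pred = m.predict(prepare_one(img)) # not the whole test set at once
```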


10. Loading the GT: after the GT has been read in, note that you need to add:

f = cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)  # convert the color image to grayscale (cv2.imread returns BGR)

And once that's done, remember to normalize:

f = (f - np.min(f)) / (np.max(f) - np.min(f))
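One pit inside the pit: if a GT map happens to be entirely one value, np.max(f) - np.min(f) is zero and that line divides by zero. A guarded version (the function name is my own):

```python
import numpy as np

def normalize_gt(f):
    """Min-max normalize a GT map to [0, 1], guarding against flat maps."""
    f = f.astype("float32")
    rng = np.max(f) - np.min(f)
    if rng == 0:
        return np.zeros_like(f)  # uniform map: avoid divide-by-zero
    return (f - np.min(f)) / rng
```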
