Plotting accuracy curves with Caffe's bundled scripts

My environment: Win10 + VS2013 + Anaconda3 (Python 3.5) + Caffe.
I will not repeat how to build and configure Caffe here. The lazy way: download a precompiled Caffe build directly from GitHub.

When training with Caffe, plotting the training process makes it much easier to observe and tune.
However, writing your own code to record the training process is tedious and may even require modifying the Caffe source. In fact, Caffe already ships with small tools for exactly this, which is handy for lazy people like me.

1. Prepare the Caffe plotting scripts

From the downloaded caffe-windows zip package, extract the following 3 files:

caffe-windows\tools\extra\parse_log.py
caffe-windows\tools\extra\extract_seconds.py
caffe-windows\tools\extra\plot_training_log.py.example

For convenience, rename plot_training_log.py.example to plot_training_log.py.

Because these 3 files are written for Python 2.7 and I am using Python 3.5, running them directly raises errors (see the differences between Python 3.x and Python 2.x).

Solution 1: Open parse_log.py and plot_training_log.py and edit them:
(1) Find every print statement and change "print xxx" to "print(xxx)".
(2) Find every "xrange" and replace it with "range".
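
The two replacements in Solution 1 can also be applied mechanically. The following is a minimal sketch, not part of Caffe: py2_to_py3_line is a hypothetical helper that handles only the simple one-line cases described above (a bare "print xxx" statement and the name xrange), so the converted files should still be checked by hand.

```python
import re


def py2_to_py3_line(line):
    """Convert one line (without its trailing newline) from Python 2 to 3.

    Only two simple patterns are handled: the name xrange and a print
    statement whose argument is not already wrapped in parentheses.
    """
    # xrange -> range, matching the whole word only.
    line = re.sub(r'\bxrange\b', 'range', line)
    # print xxx -> print(xxx); lines that already use print(...) are skipped.
    match = re.match(r'^(\s*)print(?!\s*\()\s+(.+?)\s*$', line)
    if match:
        line = '%sprint(%s)' % (match.group(1), match.group(2))
    return line


def convert_source(text):
    """Apply py2_to_py3_line to every line of a source string."""
    return '\n'.join(py2_to_py3_line(l) for l in text.split('\n'))
```

To use it, read a script into a string, pass it through convert_source, and write the result back to the file. Statements such as "print >>sys.stderr, xxx" are outside what this sketch covers.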

Solution 2: Install a Python 2.7 environment:
(1) Enter in cmd: conda create -n py2 python=2.7.
(2) Install the required packages into that environment: conda install -n py2 matplotlib.
(3) Test: first activate the py2 environment by entering activate py2 in cmd, then enter plot_training_log.py to see whether an error is reported.

2. Output the Caffe training process to the log

To write the Caffe training output to a file, specify the log output directory in the Caffe arguments (-log_dir=./log, where . means the current folder). For example:

caffe.exe train -solver=./lenet_solver.prototxt -log_dir=./log

After training, a log file with a long name will be found in the F:\Caffe directory, for example: "caffe.exe.HUA.Administrator.log.INFO.20170601-154746.10244".
It needs to be renamed so that it ends in .log (required by plot_training_log.py): "caffe.exe.HUA.Administrator.log".

Note: If you open it with Notepad, you will find that it is simply the text that was printed in the cmd window.
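
The renaming can also be scripted. This is a minimal sketch that assumes the file name has the same shape as the example above, that is, the wanted name followed by ".INFO.", a timestamp and a process id; glog_to_log_name and rename_glog are hypothetical helper names, not part of Caffe.

```python
import os


def glog_to_log_name(glog_name):
    """Cut a name like 'xxx.log.INFO.<timestamp>.<pid>' down to 'xxx.log'."""
    if glog_name.endswith('.log'):
        return glog_name  # already in the form the plotting script expects
    marker = '.log.INFO.'
    idx = glog_name.find(marker)
    if idx == -1:
        raise ValueError('unexpected log file name: %s' % glog_name)
    return glog_name[:idx + len('.log')]


def rename_glog(path):
    """Rename the file at path in place and return the new path."""
    directory, name = os.path.split(path)
    new_path = os.path.join(directory, glog_to_log_name(name))
    os.rename(path, new_path)
    return new_path
```

Calling rename_glog with the full path of the generated file renames it in the same folder.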

3. Parse the log file

Copy the three extracted files to the directory that contains the log file, then run parse_log.py to parse it. For example:

python parse_log.py caffe.exe.HUA.Administrator.log ./

Parsing produces 2 files (written to the current folder; they can be opened and viewed with Notepad):
caffe.exe.HUA.Administrator.log.test and caffe.exe.HUA.Administrator.log.train.

Note: These two files are the data files that plot_training_log.py reads when plotting. The plotting script has to be modified for two reasons:
1. The first line of the generated xxx.train and xxx.test files is a header line of column names, so it must be skipped.
2. The delimiter is a comma, not a space, so line.split() must be changed to line.split(',').
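
To see both points at once, the parsed files can be read with Python's csv module. This is only a sketch: the column names and all numbers in the sample string are made up for illustration, and the real header depends on the outputs of your network.

```python
import csv
import io


def read_parsed_log(text):
    """Split comma-separated text into a header row and float data rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # first line: column names, not data
    rows = [[float(value) for value in row] for row in reader if row]
    return header, rows


# Illustrative content only; assumed layout of a .test file.
sample = (
    "NumIters,Seconds,LearningRate,accuracy,loss\n"
    "0,0.5,0.01,0.1,2.3\n"
    "500,12.0,0.01,0.9,0.3\n"
)
header, rows = read_parsed_log(sample)
print(header)
print(rows)
```

For a real file, pass the result of open('caffe.exe.HUA.Administrator.log.test').read() instead of the sample string.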

4. Draw the curve

4.1 Modify the Python source code:

# Original code:
def load_data(data_file, field_idx0, field_idx1):
    data = [[], []]
    with open(data_file, 'r') as f:
        for line in f:
            line = line.strip()
            if line[0] != '#':
                fields = line.split()
                data[0].append(float(fields[field_idx0].strip()))
                data[1].append(float(fields[field_idx1].strip()))
    return data

# Modification 1:
def load_data(data_file, field_idx0, field_idx1):
    data = [[], []]
    with open(data_file, 'r') as f:
        lines = [line.strip() for line in f] #changed
        for line in lines[1:]: #changed
            if len(line)>0 and line[0] != '#': #changed
                fields = line.split(',') #changed
                data[0].append(float(fields[field_idx0].strip()))
                data[1].append(float(fields[field_idx1].strip()))
    return data

# Modification 2:
def load_data(data_file, field_idx0, field_idx1):
    data = [[], []]
    f = open(data_file,'r') #changed
    lines = f.readlines() #changed
    for i in range(1, len(lines)): #changed
        line = lines[i].strip() #changed
        if len(line)>0 and line[0] != '#': #changed
            fields = line.split(',') #changed
            data[0].append(float(fields[field_idx0].strip()))
            data[1].append(float(fields[field_idx1].strip()))
    f.close() #changed
    return data

# Original code (for Python 2.7):
def random_marker():
    markers = mks.MarkerStyle.markers
    num = len(markers.keys())
    idx = random.randint(0, num - 1)
    return markers.keys()[idx]

# Modification (for Python 3.5):
def random_marker():
    markers = mks.MarkerStyle.markers
    num = len(markers.keys())
    idx = random.randint(0, num - 1)
    return list(markers.keys())[idx] #changed

Optional: On Windows, comment out the line os.system('%s %s' % (get_log_parsing_script(), path_to_log)), since the log has already been parsed by hand in step 3.

4.2 Start drawing the curve:

Enter in cmd:

python plot_training_log.py 0 acc.png caffe.exe.HUA.Administrator.log

This generates the "Test accuracy vs. Iters" curve for the training run. Here 0 is the chart type and acc.png is the name of the output image.
Several chart types are supported; the specific parameters are as follows:

Notes:  
    1. Supporting multiple logs.  
    2. Log file name must end with the lower-cased ".log".  
Supported chart types:  
    0: Test accuracy  vs. Iters  
    1: Test accuracy  vs. Seconds  
    2: Test loss  vs. Iters  
    3: Test loss  vs. Seconds  
    4: Train learning rate  vs. Iters  
    5: Train learning rate  vs. Seconds  
    6: Train loss  vs. Iters  
    7: Train loss  vs. Seconds  
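
To produce every chart type in one go, the command from 4.2 can be run in a loop. This is a sketch under a few assumptions: plot_training_log.py sits in the current folder, the output names chart_0.png to chart_7.png are invented here, and build_plot_command and plot_all are hypothetical helper names.

```python
import subprocess
import sys


def build_plot_command(chart_type, log_file, script='plot_training_log.py'):
    """Build the argument list for one run of the plotting script."""
    image = 'chart_%d.png' % chart_type  # hypothetical output file name
    return [sys.executable, script, str(chart_type), image, log_file]


def plot_all(log_file, chart_types=range(8)):
    """Run the plotting script once per chart type (0 to 7 by default)."""
    for chart_type in chart_types:
        subprocess.check_call(build_plot_command(chart_type, log_file))
```

For example, plot_all('caffe.exe.HUA.Administrator.log') would run the script eight times, once for each chart type in the list above.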

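[Figure: the incorrect plot]

If you look closely, you will find that the plot above is actually wrong: the accuracy decreases as the number of iterations grows. Accuracy should generally rise as training proceeds, so the script is reading the wrong column of the parsed file.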

4.3 Correct the wrong field indices:

To fix the error above, adjust the column numbers of the fields in the script so that they match the parsed files. Note: only change the numbers and add the learning rate entry; do not change the key names!

def create_field_index():
    train_key = 'Train'
    test_key = 'Test'
    field_index = {train_key:{'Iters':0, 'Seconds':1, train_key + ' learning rate':2, train_key + ' loss':3}, #changed
                   test_key:{'Iters':0, 'Seconds':1, 'learning rate':2, test_key + ' accuracy':3, test_key + ' loss':4}} #changed
    fields = set()
    for data_file_type in field_index.keys():
        fields = fields.union(set(field_index[data_file_type].keys()))
    fields = list(fields)
    fields.sort()
    return field_index, fields
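
The effect of these numbers can be checked with a small lookup. The sketch below is a simplified stand-in, not the script's actual code: it copies the corrected table and uses a hypothetical function columns_for to show which data file and which two columns a chart description selects.

```python
# Corrected column table, copied from create_field_index() above.
FIELD_INDEX = {
    'Train': {'Iters': 0, 'Seconds': 1,
              'Train learning rate': 2, 'Train loss': 3},
    'Test': {'Iters': 0, 'Seconds': 1, 'learning rate': 2,
             'Test accuracy': 3, 'Test loss': 4},
}


def columns_for(chart_description):
    """Return (file type, x column, y column) for 'Y field vs. X field'."""
    y_field, x_field = chart_description.split(' vs. ')
    file_type = y_field.split()[0]  # 'Train' or 'Test'
    index = FIELD_INDEX[file_type]
    return file_type, index[x_field], index[y_field]


print(columns_for('Test accuracy vs. Iters'))
print(columns_for('Train loss vs. Seconds'))
```

With the corrected table, "Test accuracy vs. Iters" reads column 0 for the x axis and column 3 for the y axis of the .test file.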

[Figure: accuracy curve]

[Figure: learning rate curve]
