Caffe Official Tutorial Translation (8): Brewing Logistic Regression then Going Deeper

Preface

Recently I decided to work through the official Caffe tutorials again, and to translate the official documentation as I go. I have added some annotations of my own, all marked in italics, along with problems I ran into and the results I got. Feedback and corrections are welcome; trolls are not!
Link to the original official tutorial:
http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/brewing-logreg.ipynb

This tutorial compares scikit-learn's logistic regression against a logistic regression network defined directly in Caffe. A few function calls differ from the original because the libraries have since been updated; I changed them in the code and noted the changes in comments. Overall it is easy to follow.

Brewing Logistic Regression then Going Deeper

While Caffe is made for deep networks, it can likewise represent "shallow" models such as logistic regression for classification. We'll do simple logistic regression on synthetic data, which we save to HDF5 files so the data can be fed to Caffe. Once that model is done, we'll add layers to it to improve accuracy. That's what Caffe is about: define a model, experiment, and then deploy.

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# specify the caffe path
caffe_root = '/home/xhb/caffe/caffe/'  # this file should be run from {caffe_root}/examples (otherwise change this line)

import os
# os.chdir('..')
os.chdir(caffe_root)

import sys
sys.path.insert(0, './python')
import caffe


import os
import h5py
import shutil
import tempfile

import sklearn
import sklearn.datasets
import sklearn.linear_model
import sklearn.model_selection  # needed for train_test_split below
import sklearn.metrics          # needed for accuracy_score below

import pandas as pd

Synthesize a dataset of 10,000 4-vectors for binary classification, with 2 informative features and 2 noise features.

X, y = sklearn.datasets.make_classification(
    n_samples=10000, n_features=4, n_redundant=0, n_informative=2,
    n_clusters_per_class=2, hypercube=False, random_state=0
)

# Split into train and test
X, Xt, y, yt = sklearn.model_selection.train_test_split(X, y)

# Visualize sample of the data
ind = np.random.permutation(X.shape[0])[:1000]
df = pd.DataFrame(X[ind])
# pd.tools.plotting was removed in newer pandas versions; use pd.plotting instead
# _ = pd.tools.plotting.scatter_matrix(df, figsize=(9, 9), diagonal='kde', marker='o', s=40, alpha=.4, c=y[ind])
_ = pd.plotting.scatter_matrix(df, figsize=(9, 9), diagonal='kde', marker='o', s=40, alpha=.4, c=y[ind])

[Figure: scatter matrix of 1,000 sampled points, KDE on the diagonal, colored by class]

Learn and evaluate scikit-learn's logistic regression with stochastic gradient descent (SGD) training. Time it and check the accuracy.

%%timeit
# Train and test the scikit-learn SGD logistic regression.
# clf = sklearn.linear_model.SGDClassifier(
#     loss='log', n_iter=1000, penalty='l2', alpha=5e-4, class_weight='balanced'
# )

# n_iter was replaced by max_iter in newer scikit-learn versions; n_iter will be removed in a future release
clf = sklearn.linear_model.SGDClassifier(
    loss='log', max_iter=1000, penalty='l2', alpha=5e-4, class_weight='balanced'
)

clf.fit(X, y)
yt_pred = clf.predict(Xt)
print('Accuracy: {:.3f}'.format(sklearn.metrics.accuracy_score(yt, yt_pred)))
Accuracy: 0.770
Accuracy: 0.769
Accuracy: 0.770
Accuracy: 0.770
1 loop, best of 3: 1.07 s per loop

Save the dataset to HDF5 for loading in Caffe.

# Write out the data to HDF5 files in a temp directory.
# This file is assumed to be caffe_root/examples/hdf5_classification.ipynb
dirname = os.path.abspath('./examples/hdf5_classification/data')
if not os.path.exists(dirname):
    os.makedirs(dirname)

train_filename = os.path.join(dirname, 'train.h5')
test_filename = os.path.join(dirname, 'test.h5')

# HDF5DataLayer source should be a file containing a list of HDF5 filenames.
# To show this off, we'll list the same data file twice.
with h5py.File(train_filename, 'w') as f:
    f['data'] = X
    f['label'] = y.astype(np.float32)
with open(os.path.join(dirname, 'train.txt'), 'w') as f:
    f.write(train_filename + '\n')
    f.write(train_filename + '\n')

# HDF5 is pretty efficient, but can be further compressed.
comp_kwargs = {'compression': 'gzip', 'compression_opts': 1}
with h5py.File(test_filename, 'w') as f:
    f.create_dataset('data', data=Xt, **comp_kwargs)
    f.create_dataset('label', data=yt.astype(np.float32), **comp_kwargs)
with open(os.path.join(dirname, 'test.txt'), 'w') as f:
    f.write(test_filename + '\n')
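
It's worth a quick sanity check on what we just wrote, since Caffe's HDF5Data layer looks the datasets up by name ('data' and 'label'). Below is a small read-back sketch, my addition rather than part of the original tutorial; the shapes in the comments assume the default 75/25 train/test split.

# Optional: verify the HDF5 layout that the HDF5Data layer will read.
with h5py.File(train_filename, 'r') as f:
    print(list(f.keys()))                      # ['data', 'label']
    print(f['data'].shape, f['data'].dtype)    # (7500, 4) float64
    print(f['label'].shape, f['label'].dtype)  # (7500,) float32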

Define logistic regression in Caffe through Python net specification. This is a quick and natural way to define nets that sidesteps manually editing the protobuf model.

from caffe import layers as L
from caffe import params as P

def logreg(hdf5, batch_size):
    # logistic regression: data, matrix multiplication, and accuracy (NB: no loss layer is defined here; see the note before solving below)
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    n.ip1 = L.InnerProduct(n.data, num_output=2, weight_filler=dict(type='xavier'))
    n.accuracy = L.Accuracy(n.ip1, n.label)
    return n.to_proto()

# training net
train_net_path = 'examples/hdf5_classification/logreg_auto_train.prototxt'
with open(train_net_path, 'w') as f:
    f.write(str(logreg('examples/hdf5_classification/data/train.txt', 10)))

# testing net
test_net_path = 'examples/hdf5_classification/logreg_auto_test.prototxt'
with open(test_net_path, 'w') as f:
    f.write(str(logreg('examples/hdf5_classification/data/test.txt', 10)))
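
Since NetSpec just emits protobuf text, you can print the generated file to double-check the definition before solving, for example:

# Inspect the generated protobuf text (the same definition shown in the training log below).
print(open(train_net_path).read())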

Now we'll define the "solver", which points to the train and test nets we defined above and sets the parameters that control training.

from caffe.proto import caffe_pb2

def solver(train_net_path, test_net_path):
    s = caffe_pb2.SolverParameter()

    # Specify locations of the train and test networks.
    s.train_net = train_net_path
    s.test_net.append(test_net_path)

    s.test_interval = 1000  # Test after every 1000 training iterations.
    s.test_iter.append(250) # Test 250 "batches" each time we test.

    s.max_iter = 10000      # # of times to update the net (training iterations)

    # Set the initial learning rate for stochastic gradient descent (SGD).
    s.base_lr = 0.01        

    # Set `lr_policy` to define how the learning rate changes during training.
    # Here, we 'step' the learning rate by multiplying it by a factor `gamma`
    # every `stepsize` iterations.
    s.lr_policy = 'step'
    s.gamma = 0.1
    s.stepsize = 5000

    # Set other optimization parameters. Setting a non-zero `momentum` takes a
    # weighted average of the current gradient and previous gradients to make
    # learning more stable. L2 weight decay regularizes learning, to help prevent
    # the model from overfitting.
    s.momentum = 0.9
    s.weight_decay = 5e-4

    # Display the current training loss and accuracy every 1000 iterations.
    s.display = 1000

    # Snapshots are files used to store networks we've trained.  Here, we'll
    # snapshot every 10K iterations -- just once at the end of training.
    # For larger networks that take longer to train, you may want to set
    # snapshot < max_iter to save the network and training state to disk during
    # optimization, preventing disaster in case of machine crashes, etc.
    s.snapshot = 10000
    s.snapshot_prefix = 'examples/hdf5_classification/data/train'

    # We'll train on the CPU for fair benchmarking against scikit-learn.
    # Changing to GPU should result in much faster training!
    s.solver_mode = caffe_pb2.SolverParameter.CPU

    return s

solver_path = 'examples/hdf5_classification/logreg_solver.prototxt'
with open(solver_path, 'w') as f:
    f.write(str(solver(train_net_path, test_net_path)))
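
Besides solver.solve(), which runs all iterations at once, pycaffe can also drive optimization step by step, which is handy for watching blobs and weights evolve during training. A minimal sketch of that interface (my addition; the variable names are illustrative):

caffe.set_mode_cpu()
s = caffe.get_solver(solver_path)
for _ in range(10):
    s.step(100)  # run 100 SGD iterations per call
    # a layer's params are [weights, biases]; print a cheap summary of ip1's weights
    print('ip1 weight mean: {:.4f}'.format(s.net.params['ip1'][0].data.mean()))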

Time to learn and evaluate our Caffeinated logistic regression in Python. One caveat: unlike the original notebook, the logreg net above defines no loss layer, so the solver has nothing to backpropagate; that is why the command-line log further below reports loss = 0 throughout and the accuracies here hover around chance. Adding n.loss = L.SoftmaxWithLoss(n.ip1, n.label) to logreg, as the original tutorial does, should bring the accuracy roughly in line with scikit-learn's 0.77.

%%timeit
caffe.set_mode_cpu()
solver = caffe.get_solver(solver_path)
solver.solve()

accuracy = 0
batch_size = solver.test_nets[0].blobs['data'].num
test_iters = int(len(Xt) / batch_size)
for i in range(test_iters):
    solver.test_nets[0].forward()
    accuracy += solver.test_nets[0].blobs['accuracy'].data
accuracy /= test_iters

print("Accuracy: {:.3f}".format(accuracy))
Accuracy: 0.538
Accuracy: 0.366
Accuracy: 0.472
Accuracy: 0.524
Accuracy: 0.434
Accuracy: 0.408
Accuracy: 0.654
Accuracy: 0.416
Accuracy: 0.569
Accuracy: 0.450
Accuracy: 0.505
Accuracy: 0.438
Accuracy: 0.596
Accuracy: 0.602
Accuracy: 0.528
Accuracy: 0.488
Accuracy: 0.551
Accuracy: 0.459
Accuracy: 0.585
Accuracy: 0.500
Accuracy: 0.416
Accuracy: 0.632
Accuracy: 0.528
Accuracy: 0.542
Accuracy: 0.601
Accuracy: 0.316
Accuracy: 0.592
Accuracy: 0.700
Accuracy: 0.530
Accuracy: 0.682
Accuracy: 0.603
Accuracy: 0.553
Accuracy: 0.406
Accuracy: 0.418
Accuracy: 0.546
Accuracy: 0.668
Accuracy: 0.660
Accuracy: 0.497
Accuracy: 0.610
Accuracy: 0.620
Accuracy: 0.570
10 loops, best of 3: 83.3 ms per loop

Do the same through the command line interface, which trains and evaluates the net with detailed output.

!./build/tools/caffe train -solver examples/hdf5_classification/logreg_solver.prototxt
I0314 20:15:20.180691  9415 caffe.cpp:197] Use CPU.
I0314 20:15:20.180891  9415 solver.cpp:45] Initializing solver from parameters: 
train_net: "examples/hdf5_classification/logreg_auto_train.prototxt"
test_net: "examples/hdf5_classification/logreg_auto_test.prototxt"
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "examples/hdf5_classification/data/train"
solver_mode: CPU
train_state {
  level: 0
  stage: ""
}
I0314 20:15:20.180980  9415 solver.cpp:92] Creating training net from train_net file: examples/hdf5_classification/logreg_auto_train.prototxt
I0314 20:15:20.181069  9415 net.cpp:51] Initializing net from parameters: 
state {
  phase: TRAIN
  level: 0
  stage: ""
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5_classification/data/train.txt"
    batch_size: 10
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip1"
  bottom: "label"
  top: "accuracy"
}
I0314 20:15:20.181141  9415 layer_factory.hpp:77] Creating layer data
I0314 20:15:20.181154  9415 net.cpp:84] Creating Layer data
I0314 20:15:20.181159  9415 net.cpp:380] data -> data
I0314 20:15:20.181175  9415 net.cpp:380] data -> label
I0314 20:15:20.181185  9415 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: examples/hdf5_classification/data/train.txt
I0314 20:15:20.181211  9415 hdf5_data_layer.cpp:94] Number of HDF5 files: 2
I0314 20:15:20.181999  9415 hdf5.cpp:32] Datatype class: H5T_FLOAT
I0314 20:15:20.182400  9415 net.cpp:122] Setting up data
I0314 20:15:20.182418  9415 net.cpp:129] Top shape: 10 4 (40)
I0314 20:15:20.182422  9415 net.cpp:129] Top shape: 10 (10)
I0314 20:15:20.182426  9415 net.cpp:137] Memory required for data: 200
I0314 20:15:20.182433  9415 layer_factory.hpp:77] Creating layer ip1
I0314 20:15:20.182446  9415 net.cpp:84] Creating Layer ip1
I0314 20:15:20.182451  9415 net.cpp:406] ip1 <- data
I0314 20:15:20.182462  9415 net.cpp:380] ip1 -> ip1
I0314 20:15:20.182797  9415 net.cpp:122] Setting up ip1
I0314 20:15:20.182804  9415 net.cpp:129] Top shape: 10 2 (20)
I0314 20:15:20.182807  9415 net.cpp:137] Memory required for data: 280
I0314 20:15:20.182821  9415 layer_factory.hpp:77] Creating layer accuracy
I0314 20:15:20.182826  9415 net.cpp:84] Creating Layer accuracy
I0314 20:15:20.182832  9415 net.cpp:406] accuracy <- ip1
I0314 20:15:20.182837  9415 net.cpp:406] accuracy <- label
I0314 20:15:20.182847  9415 net.cpp:380] accuracy -> accuracy
I0314 20:15:20.182855  9415 net.cpp:122] Setting up accuracy
I0314 20:15:20.182859  9415 net.cpp:129] Top shape: (1)
I0314 20:15:20.182863  9415 net.cpp:137] Memory required for data: 284
I0314 20:15:20.182868  9415 net.cpp:200] accuracy does not need backward computation.
I0314 20:15:20.182874  9415 net.cpp:200] ip1 does not need backward computation.
I0314 20:15:20.182879  9415 net.cpp:200] data does not need backward computation.
I0314 20:15:20.182883  9415 net.cpp:242] This network produces output accuracy
I0314 20:15:20.182889  9415 net.cpp:255] Network initialization done.
I0314 20:15:20.182951  9415 solver.cpp:190] Creating test net (#0) specified by test_net file: examples/hdf5_classification/logreg_auto_test.prototxt
I0314 20:15:20.182987  9415 net.cpp:51] Initializing net from parameters: 
state {
  phase: TEST
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5_classification/data/test.txt"
    batch_size: 10
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip1"
  bottom: "label"
  top: "accuracy"
}
I0314 20:15:20.183029  9415 layer_factory.hpp:77] Creating layer data
I0314 20:15:20.183037  9415 net.cpp:84] Creating Layer data
I0314 20:15:20.183060  9415 net.cpp:380] data -> data
I0314 20:15:20.183069  9415 net.cpp:380] data -> label
I0314 20:15:20.183076  9415 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: examples/hdf5_classification/data/test.txt
I0314 20:15:20.183092  9415 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
I0314 20:15:20.184141  9415 net.cpp:122] Setting up data
I0314 20:15:20.184157  9415 net.cpp:129] Top shape: 10 4 (40)
I0314 20:15:20.184165  9415 net.cpp:129] Top shape: 10 (10)
I0314 20:15:20.184168  9415 net.cpp:137] Memory required for data: 200
I0314 20:15:20.184175  9415 layer_factory.hpp:77] Creating layer ip1
I0314 20:15:20.184182  9415 net.cpp:84] Creating Layer ip1
I0314 20:15:20.184187  9415 net.cpp:406] ip1 <- data
I0314 20:15:20.184195  9415 net.cpp:380] ip1 -> ip1
I0314 20:15:20.184208  9415 net.cpp:122] Setting up ip1
I0314 20:15:20.184214  9415 net.cpp:129] Top shape: 10 2 (20)
I0314 20:15:20.184218  9415 net.cpp:137] Memory required for data: 280
I0314 20:15:20.184227  9415 layer_factory.hpp:77] Creating layer accuracy
I0314 20:15:20.184233  9415 net.cpp:84] Creating Layer accuracy
I0314 20:15:20.184237  9415 net.cpp:406] accuracy <- ip1
I0314 20:15:20.184242  9415 net.cpp:406] accuracy <- label
I0314 20:15:20.184247  9415 net.cpp:380] accuracy -> accuracy
I0314 20:15:20.184254  9415 net.cpp:122] Setting up accuracy
I0314 20:15:20.184259  9415 net.cpp:129] Top shape: (1)
I0314 20:15:20.184263  9415 net.cpp:137] Memory required for data: 284
I0314 20:15:20.184267  9415 net.cpp:200] accuracy does not need backward computation.
I0314 20:15:20.184273  9415 net.cpp:200] ip1 does not need backward computation.
I0314 20:15:20.184276  9415 net.cpp:200] data does not need backward computation.
I0314 20:15:20.184280  9415 net.cpp:242] This network produces output accuracy
I0314 20:15:20.184286  9415 net.cpp:255] Network initialization done.
I0314 20:15:20.184298  9415 solver.cpp:57] Solver scaffolding done.
I0314 20:15:20.184310  9415 caffe.cpp:239] Starting Optimization
I0314 20:15:20.184315  9415 solver.cpp:293] Solving 
I0314 20:15:20.184319  9415 solver.cpp:294] Learning Rate Policy: step
I0314 20:15:20.184334  9415 solver.cpp:351] Iteration 0, Testing net (#0)
I0314 20:15:20.184962  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.184989  9415 solver.cpp:239] Iteration 0 (-4.27671e-38 iter/s, 0s/1000 iters), loss = 0
I0314 20:15:20.184999  9415 solver.cpp:258]     Train net output #0: accuracy = 0.5
I0314 20:15:20.185014  9415 sgd_solver.cpp:112] Iteration 0, lr = 0.01
I0314 20:15:20.188537  9415 solver.cpp:351] Iteration 1000, Testing net (#0)
I0314 20:15:20.189131  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.189147  9415 solver.cpp:239] Iteration 1000 (250000 iter/s, 0.004s/1000 iters), loss = 0
I0314 20:15:20.189157  9415 solver.cpp:258]     Train net output #0: accuracy = 0.3
I0314 20:15:20.189163  9415 sgd_solver.cpp:112] Iteration 1000, lr = 0.01
I0314 20:15:20.192469  9415 solver.cpp:351] Iteration 2000, Testing net (#0)
I0314 20:15:20.193065  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.193083  9415 solver.cpp:239] Iteration 2000 (333333 iter/s, 0.003s/1000 iters), loss = 0
I0314 20:15:20.193092  9415 solver.cpp:258]     Train net output #0: accuracy = 0.4
I0314 20:15:20.193099  9415 sgd_solver.cpp:112] Iteration 2000, lr = 0.01
I0314 20:15:20.196584  9415 solver.cpp:351] Iteration 3000, Testing net (#0)
I0314 20:15:20.197191  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.197218  9415 solver.cpp:239] Iteration 3000 (250000 iter/s, 0.004s/1000 iters), loss = 0
I0314 20:15:20.197227  9415 solver.cpp:258]     Train net output #0: accuracy = 0.5
I0314 20:15:20.197234  9415 sgd_solver.cpp:112] Iteration 3000, lr = 0.01
I0314 20:15:20.200424  9415 solver.cpp:351] Iteration 4000, Testing net (#0)
I0314 20:15:20.201073  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.201105  9415 solver.cpp:239] Iteration 4000 (333333 iter/s, 0.003s/1000 iters), loss = 0
I0314 20:15:20.201134  9415 solver.cpp:258]     Train net output #0: accuracy = 0.3
I0314 20:15:20.201141  9415 sgd_solver.cpp:112] Iteration 4000, lr = 0.01
I0314 20:15:20.204327  9415 solver.cpp:351] Iteration 5000, Testing net (#0)
I0314 20:15:20.204933  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.204959  9415 solver.cpp:239] Iteration 5000 (333333 iter/s, 0.003s/1000 iters), loss = 0
I0314 20:15:20.204968  9415 solver.cpp:258]     Train net output #0: accuracy = 0.4
I0314 20:15:20.204974  9415 sgd_solver.cpp:112] Iteration 5000, lr = 0.001
I0314 20:15:20.208395  9415 solver.cpp:351] Iteration 6000, Testing net (#0)
I0314 20:15:20.209000  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.209023  9415 solver.cpp:239] Iteration 6000 (250000 iter/s, 0.004s/1000 iters), loss = 0
I0314 20:15:20.209036  9415 solver.cpp:258]     Train net output #0: accuracy = 0.5
I0314 20:15:20.209043  9415 sgd_solver.cpp:112] Iteration 6000, lr = 0.001
I0314 20:15:20.214056  9415 solver.cpp:351] Iteration 7000, Testing net (#0)
I0314 20:15:20.215221  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.215267  9415 solver.cpp:239] Iteration 7000 (166667 iter/s, 0.006s/1000 iters), loss = 0
I0314 20:15:20.215284  9415 solver.cpp:258]     Train net output #0: accuracy = 0.3
I0314 20:15:20.215293  9415 sgd_solver.cpp:112] Iteration 7000, lr = 0.001
I0314 20:15:20.221611  9415 solver.cpp:351] Iteration 8000, Testing net (#0)
I0314 20:15:20.222805  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.222841  9415 solver.cpp:239] Iteration 8000 (142857 iter/s, 0.007s/1000 iters), loss = 0
I0314 20:15:20.222854  9415 solver.cpp:258]     Train net output #0: accuracy = 0.4
I0314 20:15:20.222862  9415 sgd_solver.cpp:112] Iteration 8000, lr = 0.001
I0314 20:15:20.229424  9415 solver.cpp:351] Iteration 9000, Testing net (#0)
I0314 20:15:20.230543  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.230583  9415 solver.cpp:239] Iteration 9000 (142857 iter/s, 0.007s/1000 iters), loss = 0
I0314 20:15:20.230597  9415 solver.cpp:258]     Train net output #0: accuracy = 0.5
I0314 20:15:20.230604  9415 sgd_solver.cpp:112] Iteration 9000, lr = 0.001
I0314 20:15:20.236244  9415 solver.cpp:468] Snapshotting to binary proto file examples/hdf5_classification/data/train_iter_10000.caffemodel
I0314 20:15:20.236490  9415 sgd_solver.cpp:280] Snapshotting solver state to binary proto file examples/hdf5_classification/data/train_iter_10000.solverstate
I0314 20:15:20.236553  9415 solver.cpp:331] Iteration 10000, loss = 0
I0314 20:15:20.236580  9415 solver.cpp:351] Iteration 10000, Testing net (#0)
I0314 20:15:20.237339  9415 solver.cpp:418]     Test net output #0: accuracy = 0.4464
I0314 20:15:20.237366  9415 solver.cpp:336] Optimization Done.
I0314 20:15:20.237373  9415 caffe.cpp:250] Optimization Done.
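
The run above snapshotted the learned weights to examples/hdf5_classification/data/train_iter_10000.caffemodel. For deployment you can load that snapshot into a test-phase net without any solver; a minimal sketch, reusing the paths defined earlier in this notebook:

# Load the snapshotted weights into the test net and score it.
net = caffe.Net(test_net_path, 'examples/hdf5_classification/data/train_iter_10000.caffemodel', caffe.TEST)
accuracy = 0
test_iters = int(len(Xt) / net.blobs['data'].num)
for _ in range(test_iters):
    net.forward()
    accuracy += net.blobs['accuracy'].data
print('Accuracy: {:.3f}'.format(accuracy / test_iters))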

If you look at the output or at logreg_auto_train.prototxt, you'll see that the model is simple logistic regression. We can make it a little more advanced by introducing a non-linearity between taking the input and making the prediction, giving us a two-layer network. That network is given in nonlinear_auto_train.prototxt, and that's the only change compared to the model we've been using.
The final accuracy of the new network should be higher than logistic regression's.

from caffe import layers as L
from caffe import params as P

# same data and accuracy layers as the logistic regression model above, plus a 40-dim hidden layer, a ReLU non-linearity, and a softmax loss
def nonlinear_net(hdf5, batch_size):
    # one small nonlinearity, one leap for model kind
    n = caffe.NetSpec()
    n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
    # define a hidden layer of dimension 40
    n.ip1 = L.InnerProduct(n.data, num_output=40, weight_filler=dict(type='xavier'))
    # transform the output through the ReLU (rectified linear) non-linearity
    n.relu1 = L.ReLU(n.ip1, in_place=True)
    # score the (now non-linear) features
    n.ip2 = L.InnerProduct(n.ip1, num_output=2, weight_filler=dict(type='xavier'))
    # same accuracy and loss as before
    n.accuracy = L.Accuracy(n.ip2, n.label)
    n.loss = L.SoftmaxWithLoss(n.ip2, n.label)
    return n.to_proto()

train_net_path = 'examples/hdf5_classification/nonlinear_auto_train.prototxt'
with open(train_net_path, 'w') as f:
    f.write(str(nonlinear_net('examples/hdf5_classification/data/train.txt', 10)))

test_net_path = 'examples/hdf5_classification/nonlinear_auto_test.prototxt'
with open(test_net_path, 'w') as f:
    f.write(str(nonlinear_net('examples/hdf5_classification/data/test.txt', 10)))

solver_path = 'examples/hdf5_classification/nonlinear_logreg_solver.prototxt'
with open(solver_path, 'w') as f:
    f.write(str(solver(train_net_path, test_net_path)))

%%timeit
caffe.set_mode_cpu()
solver = caffe.get_solver(solver_path)
solver.solve()

accuracy = 0
batch_size = solver.test_nets[0].blobs['data'].num
test_iters = int(len(Xt) / batch_size)
for i in range(test_iters):
    solver.test_nets[0].forward()
    accuracy += solver.test_nets[0].blobs['accuracy'].data
accuracy /= test_iters

print("Accuracy: {:.3f}".format(accuracy))
Accuracy: 0.837
Accuracy: 0.839
Accuracy: 0.838
Accuracy: 0.838
1 loop, best of 3: 210 ms per loop
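
Under the hood the Accuracy layer just takes the argmax of the scores and compares it to the label, which you can verify by hand from the raw ip2 blob. A sketch of the equivalent computation (my addition; re-run the solving cell without %%timeit first so that `solver` persists in the namespace):

net = solver.test_nets[0]
correct, total = 0, 0
for _ in range(int(len(Xt) / net.blobs['data'].num)):
    net.forward()
    preds = net.blobs['ip2'].data.argmax(axis=1)  # predicted class per sample
    correct += (preds == net.blobs['label'].data.astype(int)).sum()
    total += len(preds)
print('Accuracy: {:.3f}'.format(float(correct) / total))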

Run the same thing through the command line.

!./build/tools/caffe train -solver examples/hdf5_classification/nonlinear_logreg_solver.prototxt
I0314 20:20:55.911592 11035 caffe.cpp:197] Use CPU.
I0314 20:20:55.911850 11035 solver.cpp:45] Initializing solver from parameters: 
train_net: "examples/hdf5_classification/nonlinear_auto_train.prototxt"
test_net: "examples/hdf5_classification/nonlinear_auto_test.prototxt"
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "examples/hdf5_classification/data/train"
solver_mode: CPU
train_state {
  level: 0
  stage: ""
}
I0314 20:20:55.911954 11035 solver.cpp:92] Creating training net from train_net file: examples/hdf5_classification/nonlinear_auto_train.prototxt
I0314 20:20:55.912081 11035 net.cpp:51] Initializing net from parameters: 
state {
  phase: TRAIN
  level: 0
  stage: ""
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5_classification/data/train.txt"
    batch_size: 10
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 40
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0314 20:20:55.912179 11035 layer_factory.hpp:77] Creating layer data
I0314 20:20:55.912197 11035 net.cpp:84] Creating Layer data
I0314 20:20:55.912206 11035 net.cpp:380] data -> data
I0314 20:20:55.912225 11035 net.cpp:380] data -> label
I0314 20:20:55.912235 11035 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: examples/hdf5_classification/data/train.txt
I0314 20:20:55.912263 11035 hdf5_data_layer.cpp:94] Number of HDF5 files: 2
I0314 20:20:55.913141 11035 hdf5.cpp:32] Datatype class: H5T_FLOAT
I0314 20:20:55.913615 11035 net.cpp:122] Setting up data
I0314 20:20:55.913637 11035 net.cpp:129] Top shape: 10 4 (40)
I0314 20:20:55.913645 11035 net.cpp:129] Top shape: 10 (10)
I0314 20:20:55.913650 11035 net.cpp:137] Memory required for data: 200
I0314 20:20:55.913660 11035 layer_factory.hpp:77] Creating layer label_data_1_split
I0314 20:20:55.913673 11035 net.cpp:84] Creating Layer label_data_1_split
I0314 20:20:55.913681 11035 net.cpp:406] label_data_1_split <- label
I0314 20:20:55.913693 11035 net.cpp:380] label_data_1_split -> label_data_1_split_0
I0314 20:20:55.913705 11035 net.cpp:380] label_data_1_split -> label_data_1_split_1
I0314 20:20:55.913714 11035 net.cpp:122] Setting up label_data_1_split
I0314 20:20:55.913720 11035 net.cpp:129] Top shape: 10 (10)
I0314 20:20:55.913724 11035 net.cpp:129] Top shape: 10 (10)
I0314 20:20:55.913728 11035 net.cpp:137] Memory required for data: 280
I0314 20:20:55.913733 11035 layer_factory.hpp:77] Creating layer ip1
I0314 20:20:55.913743 11035 net.cpp:84] Creating Layer ip1
I0314 20:20:55.913748 11035 net.cpp:406] ip1 <- data
I0314 20:20:55.913754 11035 net.cpp:380] ip1 -> ip1
I0314 20:20:55.914089 11035 net.cpp:122] Setting up ip1
I0314 20:20:55.914099 11035 net.cpp:129] Top shape: 10 40 (400)
I0314 20:20:55.914103 11035 net.cpp:137] Memory required for data: 1880
I0314 20:20:55.914119 11035 layer_factory.hpp:77] Creating layer relu1
I0314 20:20:55.914126 11035 net.cpp:84] Creating Layer relu1
I0314 20:20:55.914130 11035 net.cpp:406] relu1 <- ip1
I0314 20:20:55.914136 11035 net.cpp:367] relu1 -> ip1 (in-place)
I0314 20:20:55.914144 11035 net.cpp:122] Setting up relu1
I0314 20:20:55.914149 11035 net.cpp:129] Top shape: 10 40 (400)
I0314 20:20:55.914152 11035 net.cpp:137] Memory required for data: 3480
I0314 20:20:55.914156 11035 layer_factory.hpp:77] Creating layer ip2
I0314 20:20:55.914165 11035 net.cpp:84] Creating Layer ip2
I0314 20:20:55.914170 11035 net.cpp:406] ip2 <- ip1
I0314 20:20:55.914175 11035 net.cpp:380] ip2 -> ip2
I0314 20:20:55.914188 11035 net.cpp:122] Setting up ip2
I0314 20:20:55.914213 11035 net.cpp:129] Top shape: 10 2 (20)
I0314 20:20:55.914217 11035 net.cpp:137] Memory required for data: 3560
I0314 20:20:55.914227 11035 layer_factory.hpp:77] Creating layer ip2_ip2_0_split
I0314 20:20:55.914233 11035 net.cpp:84] Creating Layer ip2_ip2_0_split
I0314 20:20:55.914237 11035 net.cpp:406] ip2_ip2_0_split <- ip2
I0314 20:20:55.914244 11035 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0314 20:20:55.914252 11035 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0314 20:20:55.914260 11035 net.cpp:122] Setting up ip2_ip2_0_split
I0314 20:20:55.914265 11035 net.cpp:129] Top shape: 10 2 (20)
I0314 20:20:55.914270 11035 net.cpp:129] Top shape: 10 2 (20)
I0314 20:20:55.914273 11035 net.cpp:137] Memory required for data: 3720
I0314 20:20:55.914278 11035 layer_factory.hpp:77] Creating layer accuracy
I0314 20:20:55.914285 11035 net.cpp:84] Creating Layer accuracy
I0314 20:20:55.914289 11035 net.cpp:406] accuracy <- ip2_ip2_0_split_0
I0314 20:20:55.914294 11035 net.cpp:406] accuracy <- label_data_1_split_0
I0314 20:20:55.914300 11035 net.cpp:380] accuracy -> accuracy
I0314 20:20:55.914309 11035 net.cpp:122] Setting up accuracy
I0314 20:20:55.914315 11035 net.cpp:129] Top shape: (1)
I0314 20:20:55.914319 11035 net.cpp:137] Memory required for data: 3724
I0314 20:20:55.914324 11035 layer_factory.hpp:77] Creating layer loss
I0314 20:20:55.914330 11035 net.cpp:84] Creating Layer loss
I0314 20:20:55.914335 11035 net.cpp:406] loss <- ip2_ip2_0_split_1
I0314 20:20:55.914340 11035 net.cpp:406] loss <- label_data_1_split_1
I0314 20:20:55.914346 11035 net.cpp:380] loss -> loss
I0314 20:20:55.914355 11035 layer_factory.hpp:77] Creating layer loss
I0314 20:20:55.914369 11035 net.cpp:122] Setting up loss
I0314 20:20:55.914376 11035 net.cpp:129] Top shape: (1)
I0314 20:20:55.914378 11035 net.cpp:132]     with loss weight 1
I0314 20:20:55.914394 11035 net.cpp:137] Memory required for data: 3728
I0314 20:20:55.914399 11035 net.cpp:198] loss needs backward computation.
I0314 20:20:55.914407 11035 net.cpp:200] accuracy does not need backward computation.
I0314 20:20:55.914412 11035 net.cpp:198] ip2_ip2_0_split needs backward computation.
I0314 20:20:55.914417 11035 net.cpp:198] ip2 needs backward computation.
I0314 20:20:55.914422 11035 net.cpp:198] relu1 needs backward computation.
I0314 20:20:55.914425 11035 net.cpp:198] ip1 needs backward computation.
I0314 20:20:55.914430 11035 net.cpp:200] label_data_1_split does not need backward computation.
I0314 20:20:55.914435 11035 net.cpp:200] data does not need backward computation.
I0314 20:20:55.914439 11035 net.cpp:242] This network produces output accuracy
I0314 20:20:55.914444 11035 net.cpp:242] This network produces output loss
I0314 20:20:55.914454 11035 net.cpp:255] Network initialization done.
I0314 20:20:55.914539 11035 solver.cpp:190] Creating test net (#0) specified by test_net file: examples/hdf5_classification/nonlinear_auto_test.prototxt
I0314 20:20:55.914587 11035 net.cpp:51] Initializing net from parameters: 
state {
  phase: TEST
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5_classification/data/test.txt"
    batch_size: 10
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 40
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0314 20:20:55.914672 11035 layer_factory.hpp:77] Creating layer data
I0314 20:20:55.914700 11035 net.cpp:84] Creating Layer data
I0314 20:20:55.914706 11035 net.cpp:380] data -> data
I0314 20:20:55.914716 11035 net.cpp:380] data -> label
I0314 20:20:55.914741 11035 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: examples/hdf5_classification/data/test.txt
I0314 20:20:55.914759 11035 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
I0314 20:20:55.916326 11035 net.cpp:122] Setting up data
I0314 20:20:55.916354 11035 net.cpp:129] Top shape: 10 4 (40)
I0314 20:20:55.916359 11035 net.cpp:129] Top shape: 10 (10)
I0314 20:20:55.916363 11035 net.cpp:137] Memory required for data: 200
I0314 20:20:55.916368 11035 layer_factory.hpp:77] Creating layer label_data_1_split
I0314 20:20:55.916379 11035 net.cpp:84] Creating Layer label_data_1_split
I0314 20:20:55.916383 11035 net.cpp:406] label_data_1_split <- label
I0314 20:20:55.916389 11035 net.cpp:380] label_data_1_split -> label_data_1_split_0
I0314 20:20:55.916398 11035 net.cpp:380] label_data_1_split -> label_data_1_split_1
I0314 20:20:55.916404 11035 net.cpp:122] Setting up label_data_1_split
I0314 20:20:55.916410 11035 net.cpp:129] Top shape: 10 (10)
I0314 20:20:55.916417 11035 net.cpp:129] Top shape: 10 (10)
I0314 20:20:55.916421 11035 net.cpp:137] Memory required for data: 280
I0314 20:20:55.916427 11035 layer_factory.hpp:77] Creating layer ip1
I0314 20:20:55.916437 11035 net.cpp:84] Creating Layer ip1
I0314 20:20:55.916441 11035 net.cpp:406] ip1 <- data
I0314 20:20:55.916448 11035 net.cpp:380] ip1 -> ip1
I0314 20:20:55.916466 11035 net.cpp:122] Setting up ip1
I0314 20:20:55.916473 11035 net.cpp:129] Top shape: 10 40 (400)
I0314 20:20:55.916476 11035 net.cpp:137] Memory required for data: 1880
I0314 20:20:55.916487 11035 layer_factory.hpp:77] Creating layer relu1
I0314 20:20:55.916494 11035 net.cpp:84] Creating Layer relu1
I0314 20:20:55.916498 11035 net.cpp:406] relu1 <- ip1
I0314 20:20:55.916504 11035 net.cpp:367] relu1 -> ip1 (in-place)
I0314 20:20:55.916510 11035 net.cpp:122] Setting up relu1
I0314 20:20:55.916515 11035 net.cpp:129] Top shape: 10 40 (400)
I0314 20:20:55.916518 11035 net.cpp:137] Memory required for data: 3480
I0314 20:20:55.916523 11035 layer_factory.hpp:77] Creating layer ip2
I0314 20:20:55.916529 11035 net.cpp:84] Creating Layer ip2
I0314 20:20:55.916533 11035 net.cpp:406] ip2 <- ip1
I0314 20:20:55.916539 11035 net.cpp:380] ip2 -> ip2
I0314 20:20:55.916549 11035 net.cpp:122] Setting up ip2
I0314 20:20:55.916555 11035 net.cpp:129] Top shape: 10 2 (20)
I0314 20:20:55.916558 11035 net.cpp:137] Memory required for data: 3560
I0314 20:20:55.916566 11035 layer_factory.hpp:77] Creating layer ip2_ip2_0_split
I0314 20:20:55.916573 11035 net.cpp:84] Creating Layer ip2_ip2_0_split
I0314 20:20:55.916575 11035 net.cpp:406] ip2_ip2_0_split <- ip2
I0314 20:20:55.916581 11035 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0314 20:20:55.916589 11035 net.cpp:380] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0314 20:20:55.916595 11035 net.cpp:122] Setting up ip2_ip2_0_split
I0314 20:20:55.916600 11035 net.cpp:129] Top shape: 10 2 (20)
I0314 20:20:55.916605 11035 net.cpp:129] Top shape: 10 2 (20)
I0314 20:20:55.916609 11035 net.cpp:137] Memory required for data: 3720
I0314 20:20:55.916613 11035 layer_factory.hpp:77] Creating layer accuracy
I0314 20:20:55.916620 11035 net.cpp:84] Creating Layer accuracy
I0314 20:20:55.916623 11035 net.cpp:406] accuracy <- ip2_ip2_0_split_0
I0314 20:20:55.916628 11035 net.cpp:406] accuracy <- label_data_1_split_0
I0314 20:20:55.916635 11035 net.cpp:380] accuracy -> accuracy
I0314 20:20:55.916642 11035 net.cpp:122] Setting up accuracy
I0314 20:20:55.916647 11035 net.cpp:129] Top shape: (1)
I0314 20:20:55.916651 11035 net.cpp:137] Memory required for data: 3724
I0314 20:20:55.916656 11035 layer_factory.hpp:77] Creating layer loss
I0314 20:20:55.916661 11035 net.cpp:84] Creating Layer loss
I0314 20:20:55.916666 11035 net.cpp:406] loss <- ip2_ip2_0_split_1
I0314 20:20:55.916671 11035 net.cpp:406] loss <- label_data_1_split_1
I0314 20:20:55.916676 11035 net.cpp:380] loss -> loss
I0314 20:20:55.916683 11035 layer_factory.hpp:77] Creating layer loss
I0314 20:20:55.916694 11035 net.cpp:122] Setting up loss
I0314 20:20:55.916699 11035 net.cpp:129] Top shape: (1)
I0314 20:20:55.916721 11035 net.cpp:132]     with loss weight 1
I0314 20:20:55.916733 11035 net.cpp:137] Memory required for data: 3728
I0314 20:20:55.916738 11035 net.cpp:198] loss needs backward computation.
I0314 20:20:55.916743 11035 net.cpp:200] accuracy does not need backward computation.
I0314 20:20:55.916749 11035 net.cpp:198] ip2_ip2_0_split needs backward computation.
I0314 20:20:55.916752 11035 net.cpp:198] ip2 needs backward computation.
I0314 20:20:55.916756 11035 net.cpp:198] relu1 needs backward computation.
I0314 20:20:55.916760 11035 net.cpp:198] ip1 needs backward computation.
I0314 20:20:55.916765 11035 net.cpp:200] label_data_1_split does not need backward computation.
I0314 20:20:55.916770 11035 net.cpp:200] data does not need backward computation.
I0314 20:20:55.916774 11035 net.cpp:242] This network produces output accuracy
I0314 20:20:55.916779 11035 net.cpp:242] This network produces output loss
I0314 20:20:55.916790 11035 net.cpp:255] Network initialization done.
I0314 20:20:55.916816 11035 solver.cpp:57] Solver scaffolding done.
I0314 20:20:55.916831 11035 caffe.cpp:239] Starting Optimization
I0314 20:20:55.916836 11035 solver.cpp:293] Solving 
I0314 20:20:55.916839 11035 solver.cpp:294] Learning Rate Policy: step
I0314 20:20:55.916855 11035 solver.cpp:351] Iteration 0, Testing net (#0)
I0314 20:20:55.919618 11035 solver.cpp:418]     Test net output #0: accuracy = 0.4108
I0314 20:20:55.919648 11035 solver.cpp:418]     Test net output #1: loss = 0.855203 (* 1 = 0.855203 loss)
I0314 20:20:55.919705 11035 solver.cpp:239] Iteration 0 (-5.64912e-35 iter/s, 0.002s/1000 iters), loss = 0.722532
I0314 20:20:55.919716 11035 solver.cpp:258]     Train net output #0: accuracy = 0.5
I0314 20:20:55.919725 11035 solver.cpp:258]     Train net output #1: loss = 0.722532 (* 1 = 0.722532 loss)
I0314 20:20:55.919735 11035 sgd_solver.cpp:112] Iteration 0, lr = 0.01
I0314 20:20:55.936062 11035 solver.cpp:351] Iteration 1000, Testing net (#0)
I0314 20:20:55.939965 11035 solver.cpp:418]     Test net output #0: accuracy = 0.8032
I0314 20:20:55.940037 11035 solver.cpp:418]     Test net output #1: loss = 0.438563 (* 1 = 0.438563 loss)
I0314 20:20:55.940106 11035 solver.cpp:239] Iteration 1000 (50000 iter/s, 0.02s/1000 iters), loss = 0.264288
I0314 20:20:55.940132 11035 solver.cpp:258]     Train net output #0: accuracy = 0.9
I0314 20:20:55.940150 11035 solver.cpp:258]     Train net output #1: loss = 0.264287 (* 1 = 0.264287 loss)
I0314 20:20:55.940163 11035 sgd_solver.cpp:112] Iteration 1000, lr = 0.01
I0314 20:20:55.959853 11035 solver.cpp:351] Iteration 2000, Testing net (#0)
I0314 20:20:55.963670 11035 solver.cpp:418]     Test net output #0: accuracy = 0.8168
I0314 20:20:55.963881 11035 solver.cpp:418]     Test net output #1: loss = 0.424416 (* 1 = 0.424416 loss)
I0314 20:20:55.964099 11035 solver.cpp:239] Iteration 2000 (43478.3 iter/s, 0.023s/1000 iters), loss = 0.353603
I0314 20:20:55.964164 11035 solver.cpp:258]     Train net output #0: accuracy = 0.9
I0314 20:20:55.964211 11035 solver.cpp:258]     Train net output #1: loss = 0.353603 (* 1 = 0.353603 loss)
I0314 20:20:55.964269 11035 sgd_solver.cpp:112] Iteration 2000, lr = 0.01
I0314 20:20:55.980206 11035 solver.cpp:351] Iteration 3000, Testing net (#0)
I0314 20:20:55.982419 11035 solver.cpp:418]     Test net output #0: accuracy = 0.8248
I0314 20:20:55.982451 11035 solver.cpp:418]     Test net output #1: loss = 0.406504 (* 1 = 0.406504 loss)
I0314 20:20:55.982494 11035 solver.cpp:239] Iteration 3000 (55555.6 iter/s, 0.018s/1000 iters), loss = 0.443583
I0314 20:20:55.982506 11035 solver.cpp:258]     Train net output #0: accuracy = 0.8
I0314 20:20:55.982514 11035 solver.cpp:258]     Train net output #1: loss = 0.443583 (* 1 = 0.443583 loss)
I0314 20:20:55.982520 11035 sgd_solver.cpp:112] Iteration 3000, lr = 0.01
I0314 20:20:56.006434 11035 solver.cpp:351] Iteration 4000, Testing net (#0)
I0314 20:20:56.008671 11035 solver.cpp:418]     Test net output #0: accuracy = 0.824
I0314 20:20:56.008695 11035 solver.cpp:418]     Test net output #1: loss = 0.393197 (* 1 = 0.393197 loss)
I0314 20:20:56.008750 11035 solver.cpp:239] Iteration 4000 (38461.5 iter/s, 0.026s/1000 iters), loss = 0.188689
I0314 20:20:56.008761 11035 solver.cpp:258]     Train net output #0: accuracy = 1
I0314 20:20:56.008767 11035 solver.cpp:258]     Train net output #1: loss = 0.188689 (* 1 = 0.188689 loss)
I0314 20:20:56.008772 11035 sgd_solver.cpp:112] Iteration 4000, lr = 0.01
I0314 20:20:56.022996 11035 solver.cpp:351] Iteration 5000, Testing net (#0)
I0314 20:20:56.027575 11035 solver.cpp:418]     Test net output #0: accuracy = 0.818
I0314 20:20:56.027643 11035 solver.cpp:418]     Test net output #1: loss = 0.414865 (* 1 = 0.414865 loss)
I0314 20:20:56.027710 11035 solver.cpp:239] Iteration 5000 (55555.6 iter/s, 0.018s/1000 iters), loss = 0.329345
I0314 20:20:56.027729 11035 solver.cpp:258]     Train net output #0: accuracy = 0.9
I0314 20:20:56.027740 11035 solver.cpp:258]     Train net output #1: loss = 0.329345 (* 1 = 0.329345 loss)
I0314 20:20:56.027751 11035 sgd_solver.cpp:112] Iteration 5000, lr = 0.001
I0314 20:20:56.049803 11035 solver.cpp:351] Iteration 6000, Testing net (#0)
I0314 20:20:56.052284 11035 solver.cpp:418]     Test net output #0: accuracy = 0.8316
I0314 20:20:56.052378 11035 solver.cpp:418]     Test net output #1: loss = 0.387805 (* 1 = 0.387805 loss)
I0314 20:20:56.052443 11035 solver.cpp:239] Iteration 6000 (41666.7 iter/s, 0.024s/1000 iters), loss = 0.413111
I0314 20:20:56.052459 11035 solver.cpp:258]     Train net output #0: accuracy = 0.8
I0314 20:20:56.052469 11035 solver.cpp:258]     Train net output #1: loss = 0.413112 (* 1 = 0.413112 loss)
I0314 20:20:56.052479 11035 sgd_solver.cpp:112] Iteration 6000, lr = 0.001
I0314 20:20:56.069612 11035 solver.cpp:351] Iteration 7000, Testing net (#0)
I0314 20:20:56.072029 11035 solver.cpp:418]     Test net output #0: accuracy = 0.8328
I0314 20:20:56.072068 11035 solver.cpp:418]     Test net output #1: loss = 0.385674 (* 1 = 0.385674 loss)
I0314 20:20:56.072096 11035 solver.cpp:239] Iteration 7000 (52631.6 iter/s, 0.019s/1000 iters), loss = 0.200515
I0314 20:20:56.072106 11035 solver.cpp:258]     Train net output #0: accuracy = 0.9
I0314 20:20:56.072114 11035 solver.cpp:258]     Train net output #1: loss = 0.200516 (* 1 = 0.200516 loss)
I0314 20:20:56.072119 11035 sgd_solver.cpp:112] Iteration 7000, lr = 0.001
I0314 20:20:56.086441 11035 solver.cpp:351] Iteration 8000, Testing net (#0)
I0314 20:20:56.088604 11035 solver.cpp:418]     Test net output #0: accuracy = 0.828
I0314 20:20:56.088636 11035 solver.cpp:418]     Test net output #1: loss = 0.390957 (* 1 = 0.390957 loss)
I0314 20:20:56.088660 11035 solver.cpp:239] Iteration 8000 (62500 iter/s, 0.016s/1000 iters), loss = 0.283519
I0314 20:20:56.088670 11035 solver.cpp:258]     Train net output #0: accuracy = 0.9
I0314 20:20:56.088675 11035 solver.cpp:258]     Train net output #1: loss = 0.28352 (* 1 = 0.28352 loss)
I0314 20:20:56.088680 11035 sgd_solver.cpp:112] Iteration 8000, lr = 0.001
I0314 20:20:56.102646 11035 solver.cpp:351] Iteration 9000, Testing net (#0)
I0314 20:20:56.104756 11035 solver.cpp:418]     Test net output #0: accuracy = 0.8336
I0314 20:20:56.104782 11035 solver.cpp:418]     Test net output #1: loss = 0.385762 (* 1 = 0.385762 loss)
I0314 20:20:56.104802 11035 solver.cpp:239] Iteration 9000 (62500 iter/s, 0.016s/1000 iters), loss = 0.406498
I0314 20:20:56.104811 11035 solver.cpp:258]     Train net output #0: accuracy = 0.9
I0314 20:20:56.104816 11035 solver.cpp:258]     Train net output #1: loss = 0.406498 (* 1 = 0.406498 loss)
I0314 20:20:56.104820 11035 sgd_solver.cpp:112] Iteration 9000, lr = 0.001
I0314 20:20:56.122784 11035 solver.cpp:468] Snapshotting to binary proto file examples/hdf5_classification/data/train_iter_10000.caffemodel
I0314 20:20:56.123262 11035 sgd_solver.cpp:280] Snapshotting solver state to binary proto file examples/hdf5_classification/data/train_iter_10000.solverstate
I0314 20:20:56.123457 11035 solver.cpp:331] Iteration 10000, loss = 0.193391
I0314 20:20:56.123478 11035 solver.cpp:351] Iteration 10000, Testing net (#0)
I0314 20:20:56.126117 11035 solver.cpp:418]     Test net output #0: accuracy = 0.832
I0314 20:20:56.126152 11035 solver.cpp:418]     Test net output #1: loss = 0.383889 (* 1 = 0.383889 loss)
I0314 20:20:56.126158 11035 solver.cpp:336] Optimization Done.
I0314 20:20:56.126163 11035 caffe.cpp:250] Optimization Done.

# Clean up (comment this out if you want to examine the hdf5_classification/data directory).
shutil.rmtree(dirname)

Reposted from blog.csdn.net/hongbin_xu/article/details/79559952