Some notes and tips on the deep learning framework Lasagne

I have used a few deep learning frameworks, and I have always liked Lasagne. Its design philosophy does not hide the underlying layer (Theano), and its encapsulation is flexible. Keras, by contrast, comes with a complete set of its own logic and exposes too few of the underlying interfaces, so building your own models with Keras is laborious.

In a sense, I personally think Lasagne is not really a neural network framework but a very good toolbox for Theano: its whole design is aimed at making Theano easier to use.

Because my study time is irregular (sometimes I concentrate on writing code for a while, sometimes on reading papers), I often forget some of the logic and rules in Lasagne. So I will summarize a few of them here from time to time, as tips for myself. Many of these tricks are really Theano experience, so I write them down together.

  • When declaring network input variables, you need to decide their types (vector, matrix, 4D tensor, etc.) in advance, as follows:
    import theano.tensor as T

    c = T.imatrix()      # int32 matrix
    q = T.ivector()      # int32 vector
    y = T.imatrix()      # int32 matrix
    c_pe = T.tensor4()   # 4D float tensor
    q_pe = T.tensor4()   # 4D float tensor

    The choice of type must take the batch dimension into account: if a single sample is a vector, the input is declared as a matrix (see the sketch below).
    Likewise, the network output target y is a matrix of shape [batch_size, n_classes].
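
    A minimal sketch of the batch-dimension rule; the shapes and names here (a 32-sample batch, sequences of 20 word ids, 10 classes) are my own assumptions for illustration:

    import numpy as np
    import theano.tensor as T

    # A single sample is a vector of word ids, so a batch of them
    # is declared as an int32 matrix of shape (batch_size, seq_len).
    q = T.imatrix('q')
    y = T.imatrix('y')   # targets: (batch_size, n_classes)

    q_batch = np.zeros((32, 20), dtype=np.int32)  # 32 samples, 20 ids each
    y_batch = np.zeros((32, 10), dtype=np.int32)  # 32 one-hot rows, 10 classes
    assert q.ndim == q_batch.ndim and y.ndim == y_batch.ndim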

  • Label conversion: labels usually need to be turned into one-hot vectors,
    and there is a convenient tool for this:

    from sklearn.preprocessing import LabelBinarizer, label_binarize

    Example:

    >>> label_binarize([1, 6], classes=[1, 6, 4, 2])
    array([[1, 0, 0, 0],
           [0, 1, 0, 0]])

 
    Another simple one-liner counts the prediction errors:

    numpy.count_nonzero(y_true - y_predict)

    Dividing by the number of samples gives the error rate; see the sketch below.
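
    A small sketch combining both tips; the labels are made up for illustration:

    import numpy as np
    from sklearn.preprocessing import label_binarize

    y_true = np.array([1, 6, 4, 2, 1])
    y_pred = np.array([1, 6, 2, 2, 6])

    # One-hot target matrix of shape (n_samples, n_classes) for the network.
    Y = label_binarize(y_true, classes=[1, 6, 4, 2])

    # For integer labels, a nonzero difference means a wrong prediction.
    n_errors = np.count_nonzero(y_true - y_pred)   # 2 here
    error_rate = n_errors / float(len(y_true))     # 0.4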
  • Regarding input samples in the Theano modeling process:
    the usual way to model in Theano is to declare symbolic variables such as x = T.matrix() and y = T.imatrix(); the type of such a variable is a Theano TensorVariable.
    This means we never touch the actual data while modeling; we work with abstract symbolic variables instead.

    In fact, inputs can be passed to theano.function in several forms; typically:

    (1) Pass the symbolic inputs as a list and call the compiled function with the actual batches:

    train_model = theano.function([x, y], cost, updates=updates)
    cost = train_model(x_batch, y_batch)

    (2) The first form has no special tricks. In the second form we keep the symbolic x and y but substitute shared variables for them at compile time through givens:

    givens = {
        x: x_shared,
        y: y_shared,
    }
    train_model = theano.function([], cost, givens=givens, updates=updates)
    x_shared.set_value(x_batch)
    y_shared.set_value(y_batch)
    cost = train_model()

    Alternatively, we can skip declaring the symbolic variable altogether and build the graph directly on a shared variable. For example, y_shared = theano.shared(np.zeros((batch_size, 1), dtype=np.int32), borrow=True) initializes it, and y_shared is then used in place of y while modeling. The advantage is that it is more intuitive, and there is no givens={x: x_shared, y: y_shared} step. A runnable sketch of the givens form follows.
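
    Below is a minimal end-to-end sketch of the givens form. The toy softmax model, the sizes (n_in, n_classes, batch_size), and the plain SGD step are my own assumptions, only there to give the graph a cost:

    import numpy as np
    import theano
    import theano.tensor as T

    n_in, n_classes, batch_size = 100, 10, 32

    x = T.matrix('x')
    y = T.imatrix('y')

    # Shared buffers that will hold the current mini-batch.
    x_shared = theano.shared(np.zeros((batch_size, n_in),
                                      dtype=theano.config.floatX), borrow=True)
    y_shared = theano.shared(np.zeros((batch_size, n_classes),
                                      dtype=np.int32), borrow=True)

    # A toy softmax layer, just so there is a cost to differentiate.
    W = theano.shared(np.zeros((n_in, n_classes), dtype=theano.config.floatX))
    p = T.nnet.softmax(T.dot(x, W))
    cost = T.mean(T.nnet.categorical_crossentropy(p, y))
    updates = [(W, W - 0.1 * T.grad(cost, W))]

    # No explicit inputs: the symbolic x and y are replaced via givens.
    train_model = theano.function([], cost,
                                  givens={x: x_shared, y: y_shared},
                                  updates=updates)

    # Load one random batch into the shared buffers and take a step.
    x_shared.set_value(np.random.randn(batch_size, n_in)
                         .astype(theano.config.floatX))
    y_shared.set_value(np.eye(n_classes, dtype=np.int32)[
        np.random.randint(n_classes, size=batch_size)])
    print(train_model())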





  • Points to note about the use of theano.function

    When theano.function builds the model, the inputs argument must be a list, even if there is no input.
    For example: cc = theano.function(inputs=[aa, bb], outputs=[cost])
    When calling the compiled function, however, the arguments are passed positionally, not as a list:

    dd = cc(a1, b1)[0]

    You must pay attention to this point: getting it wrong is easy, and no error is raised when the function is compiled; the problem only surfaces when actual data is fed into the model, and the message then points at the symptom rather than the cause.
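
    A toy sketch of this calling convention; the scalar variables aa, bb and the expression are made up for illustration:

    import theano
    import theano.tensor as T

    aa = T.dscalar('aa')
    bb = T.dscalar('bb')
    cost = aa * bb + aa

    cc = theano.function(inputs=[aa, bb], outputs=[cost])  # inputs is a list

    dd = cc(2.0, 3.0)[0]   # called positionally, NOT cc([2.0, 3.0]);
                           # outputs is a list, so take element [0]
    print(dd)              # prints 8.0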
