Deep Learning with PyTorch

As a beginner in this field, I recently read "Deep Learning with PyTorch", the deep learning book open-sourced on the official PyTorch website in December 2019. Here is a brief summary to share with everyone; please point out anything inappropriate, thanks.

1 PyTorch introduction
1.1 PyTorch features: flexible, Python-based, and built around tensors.
1.2 Libraries PyTorch provides to support deep learning (a minimal import sketch follows this list):
torch.nn (building neural networks)
torch.utils.data (data loading and processing, including Dataset / DataLoader)
torch.nn.DataParallel and torch.distributed (parallel computation across GPUs or machines)
torch.optim (optimizers)
torch.Storage (tensor storage)
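A minimal import sketch (assuming a standard PyTorch install) showing where each of these pieces lives:

```python
import torch                      # core tensor library
import torch.nn as nn             # layers and loss functions for building networks
import torch.optim as optim       # optimizers such as SGD and Adam
from torch.utils.data import Dataset, DataLoader  # dataset abstraction and batching

# torch.nn.DataParallel / torch.distributed handle multi-GPU or multi-machine training;
# every tensor's values live in a torch.Storage (a contiguous 1-D array of numbers).
```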

2 Tensors
2.1 Tensors are PyTorch's most basic data structure.
2.2 The difference between tensor storage and Python lists or tuples is shown below:
Figure: Python object (boxed) numeric values versus tensor (unboxed array) numeric values
Python lists or tuples are collections of individually allocated Python objects in memory, whereas PyTorch tensors or NumPy arrays are views over contiguous blocks of memory.
2.3 Tensor slicing
(1) A slice of a tensor is still a tensor.
(2) Slicing does not allocate a new block of memory to store the result; the slice is a view over the original tensor's storage.
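A small sketch illustrating both points, using the points tensor from the book:

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])

second_point = points[1]      # slicing returns a tensor, not a copy
second_point[0] = 10.0        # writes through to the original storage

print(points[1, 0])           # tensor(10.) -- the original tensor changed
print(second_point.storage().data_ptr() == points.storage().data_ptr())  # True: same memory
```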
2.4 Tensor storage
A tensor's values live in a contiguous block of memory managed by a torch.Storage instance, which stores them as a one-dimensional array of numbers. Shown below:
Figure: Tensors are views over a Storage instance
A tensor's contiguity can be checked with points.is_contiguous(); calling points.contiguous() on a non-contiguous tensor produces a new tensor backed by a new contiguous storage, in which the original storage order of the elements is reorganized so that they are laid out row by row according to the new tensor.
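A short sketch of storage and contiguity, again with the book's points tensor:

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
print(points.storage())          # one flat, contiguous array: 4.0 1.0 5.0 3.0 2.0 1.0

points_t = points.t()            # transpose: same storage, different stride
print(points_t.is_contiguous())  # False

points_t_cont = points_t.contiguous()   # new storage, elements re-laid-out row by row
print(points_t_cont.is_contiguous())    # True
print(points_t_cont.storage())   # 4.0 5.0 2.0 1.0 3.0 1.0
```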
2.5 Size, storage offset, and stride
Size is a tuple giving the number of elements along each dimension of the tensor;
storage offset is the index in the storage corresponding to the tensor's first element;
stride is the number of elements in the storage that must be skipped to move to the next element along each dimension. As shown below:
Figure: Relationship among a tensor's offset, size, and stride
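A small example querying these three attributes (the expected outputs in the comments assume the points tensor above):

```python
import torch

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])

second_point = points[1]
print(second_point.size())            # torch.Size([2])
print(second_point.storage_offset())  # 2 -- its first element sits at index 2 of the storage
print(second_point.stride())          # (1,)

print(points.stride())                # (2, 1): skip 2 elements to move down a row,
                                      # 1 element to move across a column
```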
2.6 Tensor indexing
Range indexing follows the left-inclusive, right-exclusive convention.
For example, some_list[1:4] returns the elements at indices 1 through 3 (the element at index 4 is not returned).
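A brief illustration of the left-inclusive, right-exclusive rule, on a plain list and on a tensor:

```python
import torch

some_list = list(range(6))
print(some_list[1:4])     # [1, 2, 3] -- the element at index 4 is excluded

points = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
print(points[1:])         # rows from index 1 to the end
print(points[1:, 0])      # same rows, first column only: tensor([5., 2.])
```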
2.7 NumPy interoperability
Tensors can be converted to and from NumPy arrays. Note that converting a tensor to NumPy returns a NumPy array sharing the tensor's underlying buffer, so changing the NumPy array changes the original tensor.
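A minimal sketch of the shared-buffer behavior:

```python
import torch

points = torch.ones(3, 4)
points_np = points.numpy()        # no copy: shares the underlying buffer

points_np[0, 0] = 0.0             # mutating the NumPy array...
print(points[0, 0])               # tensor(0.) ...changes the tensor too

points_back = torch.from_numpy(points_np)  # also shares memory with points_np
```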

3 Representing sequence, text, and image data as tensors
(1) Neural networks require data as multi-dimensional numeric tensors, typically 32-bit floating point.
(2) Tabular data can be converted directly into tensor data.
(3) Text or categorical data can be one-hot encoded via a dictionary of keys; for text data, word embeddings can also be used (the step from a one-hot code to an embedding, which I understand as a kind of dimensionality reduction; see the sketch after this list).
(4) Image data can have one or more channels, most commonly red, green, and blue.
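A rough sketch of both ideas; the label values and image size below are made up for illustration:

```python
import torch

# One-hot encoding: map each category key to a vector with a single 1
# (a hypothetical 3-category example).
labels = torch.tensor([0, 2, 1])
one_hot = torch.zeros(labels.shape[0], 3)
one_hot.scatter_(1, labels.unsqueeze(1), 1.0)   # rows: [1,0,0], [0,0,1], [0,1,0]

# Images: PyTorch layers expect channels first (C, H, W), typically float32.
img = torch.randint(0, 256, (96, 96, 3), dtype=torch.uint8)  # H, W, C as loaded from disk
img_t = img.permute(2, 0, 1).float() / 255.0    # -> (3, 96, 96), scaled to [0, 1]
```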

4 The mechanics of learning
4.1 A few basic concepts:
Learning rate (learning_rate): controls how far each parameter value moves along the gradient-descent direction at each update.
Epoch: one training iteration that updates the parameters over all training samples.
4.2 General procedure:
Define the model -> initialize the parameters (requires_grad=True) -> call the model -> compute the loss function -> call backward(). Shown below:
Figure: the learning process
(1) Calling backward() accumulates the parameter gradients at the leaf nodes, so the gradients must be explicitly zeroed after each parameter update (or before each backward pass), i.e., by calling optimizer.zero_grad().
(2) Split the data into a training set and a validation set in order to detect overfitting.
4.3 Convex optimization techniques apply to linear models, but they cannot be generalized to neural networks. This chapter explores the learning mechanics on a linear model, i.e., the process of parameter estimation (a training-loop sketch follows).
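A minimal sketch of the whole procedure from 4.2 applied to the linear model of 4.3; the toy data and learning rate here are made up for illustration:

```python
import torch
import torch.optim as optim

# toy data: y is roughly 2*x + 1 plus noise
x = torch.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0 + 0.1 * torch.randn(x.shape)

def model(x, w, b):          # the assumed family of functions: a linear model
    return w * x + b

def loss_fn(y_pred, y):      # mean squared error
    return ((y_pred - y) ** 2).mean()

params = torch.tensor([1.0, 0.0], requires_grad=True)   # initialize w, b
optimizer = optim.SGD([params], lr=1e-1)                # learning rate

for epoch in range(500):     # one epoch = one updating pass over all samples
    optimizer.zero_grad()    # gradients accumulate, so zero them every step
    y_pred = model(x, *params)
    loss = loss_fn(y_pred, y)
    loss.backward()          # gradients land on the leaf tensor `params`
    optimizer.step()

print(params)                # should end up close to (2.0, 1.0)
```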

5 Fitting data with a neural network
Training a neural network looks somewhat like parameter estimation, but the underlying rationale is entirely different.
Parameter estimation assumes the model belongs to a specific family of functions with unknown parameters (e.g., a linear model with unknown w and b), and learning finds suitable values, i.e., estimates w and b. A neural network, by contrast, combines linear behavior with nonlinear behavior and can be configured to approximate arbitrary nonlinear functions; the neural network model is thus equivalent to assuming a family of functions. It has both hyperparameters and parameters: the hyperparameters determine the specific model type, and the parameters pin down the specifics within that type. (My personal understanding.)
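A tiny sketch of such a function family, in the spirit of the book's example (the 13 hidden units are an arbitrary hyperparameter choice):

```python
import torch
import torch.nn as nn

# Composing linear layers with a nonlinear activation (Tanh) lets the model
# approximate nonlinear functions, unlike the plain linear model above.
seq_model = nn.Sequential(
    nn.Linear(1, 13),   # hyperparameter: 13 hidden units
    nn.Tanh(),
    nn.Linear(13, 1),
)

x = torch.linspace(0.0, 1.0, 20).unsqueeze(1)   # shape (20, 1): a batch of samples
print(seq_model(x).shape)                        # torch.Size([20, 1])
```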


Source: blog.csdn.net/danshuitian/article/details/104491446