TensorFlow: deep learning fundamentals and tf.keras

tf.keras: the core high-level API

Mean squared error: $\frac{1}{n}\sum_{i=1}^{n}\left(f(x_i)-y_i\right)^2$
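
As a minimal sketch of the formula above, the mean squared error can be computed in plain Python. The function name and the sample values are assumptions made for illustration only; they are not part of the original notebook.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions f(x) and targets y."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    n = len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n


# Hypothetical values, chosen only to illustrate the calculation.
preds = [2.0, 4.0, 6.0]
targets = [1.0, 4.0, 8.0]
print(mean_squared_error(preds, targets))  # (1 + 0 + 4) / 3
```

This is the quantity that loss='mse' asks Keras to minimize in the examples of this post.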

Sequential model (tf.keras.Sequential): a linear stack of layers with one input and one output
Layer used here: Dense

CSV (Comma Separated Value File Format)

  • Sometimes called character-separated values, because the delimiter is not always a comma;
  • Stores tabular data (numbers and text) as plain text;
  • A CSV file consists of any number of records, separated by a newline sequence;
  • Each record is made up of fields separated by some other character or string, most commonly a comma or a tab;
  • There are no leading blank lines, and each line holds one record;
  • A header with the column names is optional; when present, it is the first line of the file.
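
The points above can be sketched with Python's standard csv module. The file contents below are hypothetical; only the column names Education and Income are taken from the example that follows.

```python
import csv
import io

# Hypothetical file contents; the first line is the optional header.
text = "Education,Income\n10,26.7\n16,48.2\n"

# DictReader treats the first line as column names and yields one dict per record.
records = list(csv.DictReader(io.StringIO(text)))

# The same records with a tab as the delimiter instead of a comma.
tab_text = text.replace(",", "\t")
tab_records = list(csv.DictReader(io.StringIO(tab_text), delimiter="\t"))

for record in records:
    print(record["Education"], record["Income"])
```

Every field is read as a string; pandas.read_csv, used below, additionally infers numeric column types.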

Example: predict wages based on years of education

Jupyter Notebook

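import tensorflow as tf  # import TensorFlow under the alias tf
print('TensorFlow Version:{}'.format(tf.__version__))  # print the installed TensorFlow version
import pandas as pd  # import pandas under the alias pd
data = pd.read_csv('Income1.csv')  # read the file into a DataFrame
data
import matplotlib.pyplot as plt  # import matplotlib's plotting interface under the alias plt
# render plots inline in the notebook
%matplotlib inline
plt.scatter(data.Education, data.Income)  # scatter plot of the x, y data
x = data.Education
y = data.Income
model = tf.keras.Sequential()  # Sequential model: a linear stack of layers
model.add(tf.keras.layers.Dense(1, input_shape=(1,)))  # add a Dense layer with one unit and one input feature
model.summary()  # show the model structure
model.compile(optimizer='adam', loss='mse')  # optimizer: optimization method; loss: loss function
history = model.fit(x, y, epochs=5000)  # epochs: number of passes over the full training data
model.predict(x)  # predict on the training inputs
model.predict(pd.Series([20]))  # predict income for 20 years of education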

Multilayer perceptron

A perceptron (meaning a single-layer perceptron) has a limitation: it cannot solve the XOR problem, that is, a problem that is not linearly separable.
Combining multiple single-layer perceptrons gives a multilayer perceptron (MLP). An MLP consists of an input layer, one or more hidden layers, and an output layer. The neurons in each layer are fully connected to the next layer.
If the network contains more than one hidden layer, it is called a deep artificial neural network.
Notes:

  • The layers of a neural network usually means the layers that perform computation. The input layer performs none, so it is usually not counted.
  • Multilayer perceptrons (deep neural networks) can solve problems that are not linearly separable.
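
To make the XOR claim concrete, here is a small sketch of a two-layer perceptron with hand-picked weights. The weights and function names are assumptions chosen for illustration, not values learned by training: one hidden unit acts as OR, the other as AND, and the output unit fires when OR is true but AND is not.

```python
def step(z):
    """Step activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    # Hidden layer: two perceptrons with hand-picked weights and biases.
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    # Output layer: OR and not AND, which is XOR.
    return step(h_or - h_and - 0.5)


truth_table = {(a, b): xor_mlp(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```

No single perceptron can produce this truth table, because no straight line separates the inputs (0, 1) and (1, 0) from (0, 0) and (1, 1).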

Activation function

In a neural network, the activation function defines the output of each node (neuron), which then serves as input to the nodes in the next layer.
Role: the activation function gives the network its nonlinear modeling capability. Without an activation function, even a multilayer network reduces to a linear model and cannot solve problems that are not linearly separable.
Common activation functions:

  • Step function
  • sigmoid function
  • tanh function
  • relu function
  • Leaky relu
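
The functions listed above can be sketched in plain Python as follows. The slope alpha=0.01 for the leaky variant is an assumed, commonly used default rather than a value given in this post.

```python
import math


def step(z):
    """1 for positive input, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def sigmoid(z):
    """Squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squashes any real number into the interval (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Passes positive input through unchanged, outputs 0 otherwise."""
    return max(0.0, z)


def leaky_relu(z, alpha=0.01):
    """Like relu, but keeps a small slope alpha for negative input."""
    return z if z > 0 else alpha * z


for f in (step, sigmoid, tanh, relu, leaky_relu):
    print(f.__name__, [round(f(z), 4) for z in (-2.0, 0.0, 2.0)])
```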

Example: Forecast sales based on promotional channels

Jupyter Notebook

import tensorflow as tf
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

data = pd.read_csv("Advertising.csv")
data.head()  # show the first 5 rows
plt.scatter(data.TV, data.sales)  # scatter plot of each channel against sales
plt.scatter(data.radio, data.sales)
plt.scatter(data.newspaper, data.sales)
x = data.iloc[:, 1:-1]  # all rows, every column except the first and the last
y = data.iloc[:, -1]  # all rows, last column only
# hidden layer with 10 units, 3 input features and relu activation, followed by a 1-unit output layer
model = tf.keras.Sequential([tf.keras.layers.Dense(10, input_shape=(3,), activation='relu'),
                             tf.keras.layers.Dense(1)])
model.summary()  # show the network structure
model.compile(optimizer='adam', loss='mse')  # configure the model for training
model.fit(x, y, epochs=100)  # train the model
test = data.iloc[:10, 1:-1]  # first 10 rows, every column except the first and the last
model.predict(test)  # use the model to predict on this data

Logistic regression

Logistic regression answers "yes" or "no" questions.
It applies the sigmoid function to the model output. The sigmoid maps any real number to a value between 0 and 1, which is interpreted as a probability; the loss is then computed with cross entropy.
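
A minimal sketch of how such a model turns an input into a yes/no answer. The weight, the bias, and the 0.5 threshold are hypothetical values chosen for illustration; in practice the weight and bias are learned during training.

```python
import math


def predict_probability(x, weight, bias):
    """Linear score passed through the sigmoid, giving a value in (0, 1)."""
    z = weight * x + bias
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(x, weight, bias, threshold=0.5):
    """Answer 1 ("yes") when the probability reaches the threshold, else 0 ("no")."""
    return 1 if predict_probability(x, weight, bias) >= threshold else 0


# Hypothetical parameters, not learned from any data.
weight, bias = 1.5, -3.0
for x in (0.0, 2.0, 4.0):
    print(x, round(predict_probability(x, weight, bias), 3), predict_label(x, weight, bias))
```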

Cross entropy

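Cross entropy characterizes the distance between the actual output (a probability) and the expected output (a probability). The smaller the cross entropy, the closer the two probability distributions are.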
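
For the two-class case, the idea can be sketched as binary cross entropy in plain Python. The function name, the clipping constant eps, and the sample probabilities are assumptions for illustration.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1]
confident_and_right = [0.9, 0.1, 0.8]   # hypothetical predicted probabilities
confident_and_wrong = [0.1, 0.9, 0.2]

print(binary_cross_entropy(labels, confident_and_right))
print(binary_cross_entropy(labels, confident_and_wrong))
```

The second loss is much larger than the first, because cross entropy penalizes confident wrong answers heavily.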

Softmax multi-class classification

  • Softmax requires that every sample belong to exactly one category and that the categories cover all possible samples
  • The components of the softmax output for a sample sum to 1
  • With only two categories, the result is the same as logistic regression
  • For multi-class problems, the cross entropy loss is computed with categorical_crossentropy (labels are one-hot encoded) or sparse_categorical_crossentropy (labels are integer encoded)
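
The properties above can be checked with a short plain-Python sketch. The two loss functions below are simplified single-sample stand-ins that only mirror the names of the Keras losses; they are not the Keras implementations, and the logits are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs):
    """Label given as a one-hot vector, e.g. [0, 0, 1]."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


def sparse_categorical_crossentropy(class_index, probs):
    """Label given as an integer class index, e.g. 2."""
    return -math.log(probs[class_index])


probs = softmax([1.0, 2.0, 3.0])  # hypothetical scores for 3 classes
print(probs, sum(probs))
print(categorical_crossentropy([0, 0, 1], probs))
print(sparse_categorical_crossentropy(2, probs))
```

Both losses give the same value for the same label; they differ only in how the label is encoded. With two classes, softmax([z, 0])[0] equals the sigmoid of z, which is the logistic regression case.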

Fashion MNIST dataset

Fashion MNIST is a drop-in replacement for the MNIST handwritten digit dataset. It contains 70,000 grayscale images of clothing items, covering 10 categories.

Learning rate (hyperparameter)

A hyperparameter is a parameter that is configured manually rather than learned during training.
With an appropriate learning rate, the loss decreases over time; with an inappropriate one (for example, one that is too large), the loss may oscillate.
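
The effect can be sketched with gradient descent on the toy loss w**2, whose minimum is at w = 0. The toy loss, the starting point, and the two learning rates are assumptions chosen to make the behavior visible.

```python
def gradient_descent(learning_rate, steps=5, w=1.0):
    """Minimize loss(w) = w**2 and return (w, loss) after every step."""
    trajectory = []
    for _ in range(steps):
        grad = 2.0 * w                 # derivative of w**2
        w = w - learning_rate * grad   # gradient descent update
        trajectory.append((w, w ** 2))
    return trajectory


good = gradient_descent(0.1)   # small steps: loss shrinks on every step
bad = gradient_descent(1.1)    # too large: w overshoots the minimum each step

print([round(loss, 4) for _, loss in good])
print([round(w, 4) for w, _ in bad])
```

With the larger rate, w flips sign on every step and moves further from the minimum, so the loss never settles.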

Backpropagation algorithm

Backpropagation is a technique for efficiently computing gradients in a data flow graph. It applies the chain rule from the output backwards: the gradient for each layer's weights is the product of the error signal passed back from the next layer and the output of the previous layer.
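
A minimal sketch of this rule on a hypothetical two-weight network with no activation function: h = w1 * x, y_hat = w2 * h, loss = (y_hat - y)**2. All names and values are assumptions for illustration, and the result is checked against a numerical gradient.

```python
def forward(x, w1, w2):
    h = w1 * x        # output of the first layer
    y_hat = w2 * h    # output of the second layer
    return h, y_hat


def loss_fn(x, y, w1, w2):
    _, y_hat = forward(x, w1, w2)
    return (y_hat - y) ** 2


def backward(x, y, w1, w2):
    h, y_hat = forward(x, w1, w2)
    d_y_hat = 2.0 * (y_hat - y)   # error signal at the output
    d_w2 = d_y_hat * h            # signal from above times previous layer's output
    d_h = d_y_hat * w2            # signal passed back to the first layer
    d_w1 = d_h * x                # signal from above times the layer's input
    return d_w1, d_w2


x, y, w1, w2 = 2.0, 1.0, 0.5, -1.5
d_w1, d_w2 = backward(x, y, w1, w2)

# Numerical check with central differences.
eps = 1e-6
num_d_w1 = (loss_fn(x, y, w1 + eps, w2) - loss_fn(x, y, w1 - eps, w2)) / (2 * eps)
num_d_w2 = (loss_fn(x, y, w1, w2 + eps) - loss_fn(x, y, w1, w2 - eps)) / (2 * eps)
print(d_w1, num_d_w1)
print(d_w2, num_d_w2)
```

The backward pass reuses d_y_hat for both weights instead of recomputing it, which is where the efficiency comes from in larger networks.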

Network capacity

  • Network capacity can be regarded as proportional to the number of trainable parameters in the network.
  • The more neurons and layers a network has, the stronger its fitting ability. However, training also becomes slower and harder, and the network is more likely to overfit.
  • To increase network capacity, add hidden layers or add neurons to the hidden layers.
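
As a rough illustration of capacity, the trainable parameters of a stack of Dense layers can be counted by hand: each layer has inputs * units weights plus one bias per unit. The helper below is a hypothetical sketch, not a Keras function.

```python
def dense_param_count(layer_sizes):
    """layer_sizes = [n_inputs, units_layer_1, units_layer_2, ...]."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out   # weights plus biases
    return total


print(dense_param_count([1, 1]))          # the income model: Dense(1) on 1 input
print(dense_param_count([3, 10, 1]))      # the advertising model: 3 -> 10 -> 1, 51 parameters
print(dense_param_count([3, 10, 10, 1]))  # one extra hidden layer of 10 units
```

The counts for the first two models should agree with the totals reported by model.summary() in the examples above.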

Improve network fitting ability

  • Simply increasing the number of neurons in a layer gives no marked improvement in network performance
  • Adding layers greatly improves the fitting ability of the network
    Note:
    The number of neurons in a single layer must not be too small. Too few neurons create an information bottleneck and make the model underfit.

Overfitting

On the test data, the loss stops falling and starts to rise. The score is high on the training data but low on the test data.
Suppressing overfitting: increase the training data or reduce the network capacity (the best way). Further techniques:
1. Dropout
2. Regularization
3. Image augmentation
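
The first two techniques can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the Keras implementation: dropout here is the inverted variant, which zeroes each activation with probability rate during training and scales the survivors by 1 / (1 - rate); the regularization shown is an L2 penalty with a hypothetical strength lam that would be added to the loss.

```python
import random


def dropout(activations, rate, training=True, rng=random):
    """Randomly zero activations during training; pass them through otherwise."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    # Survivors are scaled up so the expected sum stays the same.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def l2_penalty(weights, lam):
    """Extra loss term that grows with the squared size of the weights."""
    return lam * sum(w * w for w in weights)


rng = random.Random(0)  # fixed seed so the run is repeatable
print(dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=rng))
print(dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, training=False))
print(l2_penalty([1.0, -2.0], lam=0.01))
```

Dropout is only active during training; at prediction time all units are used, which is why the second call returns the input unchanged.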

Underfitting

The score is low on both the training data and the test data.

Cross-validation

Functional API

The functional API can build models with multiple inputs and multiple outputs.

Example

import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
%matplotlib inline
(train_image, train_label), (test_image, test_label) = tf.keras.datasets.fashion_mnist.load_data()  # load the fashion_mnist data
train_image = train_image / 255  # scale pixel values to [0, 1]
test_image = test_image / 255
# define the input: each image is 28 x 28 pixels
inputs = keras.Input(shape=(28, 28))
# call the Flatten layer on the input, treating the layer as a function
x = keras.layers.Flatten()(inputs)
x = keras.layers.Dense(32, activation='relu')(x)
x = keras.layers.Dropout(0.5)(x)
x = keras.layers.Dense(64, activation='relu')(x)
outputs = keras.layers.Dense(10, activation='softmax')(x)  # output layer
model = keras.Model(inputs=inputs, outputs=outputs)  # build the model from the given inputs and outputs
model.summary()
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(train_image,
                    train_label,
                    epochs=30,
                    validation_data=(test_image, test_label))

Origin blog.csdn.net/qq_35134206/article/details/109130789