R Language Black-Box Methods: Neural Networks

Modeling the strength of concrete with artificial neural networks ----

Step 1: Exploring and preparing the data ----

The concrete mixture described below has eight components, which serve as the model's eight features.
[Figure: the eight mixture components used as features, plus the outcome, strength]

read in the data and examine its structure

concrete <- read.csv("F:\\rwork\\Machine Learning with R (2nd Ed.)\\Chapter 07\\concrete.csv")
str(concrete)

custom normalization function: write a function that rescales values to the 0-1 range

normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
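
As a quick sanity check (the test vectors here are my own, not from the original post), the smallest value should map to 0 and the largest to 1:

normalize(c(1, 2, 3, 4, 5))       # 0.00 0.25 0.50 0.75 1.00
normalize(c(10, 20, 30, 40, 50))  # identical output: scaling is relative to each vector's min and max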

apply normalization to the entire data frame
apply the function above to every column of the data

concrete_norm <- as.data.frame(lapply(concrete, normalize))

confirm that the conversion worked and that the range is now between zero and one

summary(concrete_norm$strength)

compared to the original minimum and maximum

summary(concrete$strength)
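
To go a step further and confirm that every column, not just strength, now lies between 0 and 1, one quick check (my own addition) is:

sapply(concrete_norm, range)  # each column should show a minimum of 0 and a maximum of 1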

create training and test datasets: the first 773 rows (about 75%) for training, the remaining 257 rows for testing

concrete_train <- concrete_norm[1:773, ]
concrete_test <- concrete_norm[774:1030, ]
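
This simple split is fine as long as the rows are not sorted in any systematic order. If they were, a random split would be safer; a minimal sketch (the seed and the 75% proportion are my own choices, and the rest of this post keeps the fixed split above):

set.seed(123)  # arbitrary seed for reproducibility
train_idx <- sample(nrow(concrete_norm), round(0.75 * nrow(concrete_norm)))
concrete_train_rnd <- concrete_norm[train_idx, ]
concrete_test_rnd  <- concrete_norm[-train_idx, ]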

Step 2: Training a model on the data ----

train the neuralnet model

# install.packages('neuralnet')  # install the package first if you have not already
library(neuralnet)

simple ANN with only a single hidden node: the simplest multilayer feedforward network

set.seed(12345) # to guarantee repeatable results
concrete_model <- neuralnet(formula = strength ~ cement + slag +
                              ash + water + superplastic + 
                              coarseagg + fineagg + age,
                              data = concrete_train)

visualize the network topology

plot(concrete_model)

[Figure: network topology produced by plot(concrete_model)]

The plot shows one input node for each of the eight features, followed by a single hidden node and a single output node that predicts the concrete strength.
The bias terms are also drawn (the nodes labelled with the digit 1).
At the bottom of the figure, R reports the number of training steps and the sum of squared errors (SSE); the lower the SSE, the better the predictive performance.
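
If you prefer to read these values from the fitted object rather than off the plot, the neuralnet result stores them in result.matrix; a minimal sketch, assuming the model was fit as above:

concrete_model$result.matrix[c("error", "steps"), ]  # training SSE and number of steps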

Step 3: Evaluating model performance ----

The network topology plot gives us a peek into the artificial neural network's black box, but it does not tell us much about how well the model will fit future data.
To generate predictions on the test dataset, proceed as follows:
obtain model results

model_results <- compute(concrete_model, concrete_test[1:8])

compute() works differently from predict(): it returns a list with two components:
$neurons, which stores the neurons for each layer of the network;
$net.result, which stores the predicted values
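
A quick way to see both components for yourself (my own inspection step, not in the original post):

str(model_results, max.level = 1)  # shows the $neurons list and the $net.result matrix
head(model_results$net.result)     # first few predicted (normalized) strength values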

obtain predicted strength values

predicted_strength <- model_results$net.result  # obtain the predicted values
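
Note that because the network was trained on normalized data, these predictions are on the 0-1 scale. The correlation computed next is unaffected by this, but if you want predictions in the original strength units, you can reverse the min-max scaling; a short sketch (the unnormalize helper is my own addition):

unnormalize <- function(x) {
  # reverse the min-max scaling using the original strength column
  x * (max(concrete$strength) - min(concrete$strength)) + min(concrete$strength)
}
strength_pred <- unnormalize(predicted_strength)
head(strength_pred)  # predictions in the original units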

examine the correlation between predicted and actual values

cor(predicted_strength, concrete_test$strength)

A correlation close to 1 indicates a strong linear relationship between two variables. The correlation of about 0.806 here therefore indicates a fairly strong linear relationship, meaning the model does a reasonably good job even with only a single hidden node.
Given that we used just one hidden node, the model's performance can very likely be improved, so let's try to build a better model.

Step 4: Improving model performance ----

a more complex neural network topology with 5 hidden nodes: here we increase the number of hidden nodes to 5

set.seed(12345) # to guarantee repeatable results
concrete_model2 <- neuralnet(strength ~ cement + slag +
                               ash + water + superplastic + 
                               coarseagg + fineagg + age,
                               data = concrete_train, hidden = 5)

plot the network

plot(concrete_model2)

Here Insert Picture Description

From the plot we can see that the SSE has dropped from 5.08 to 1.63. In addition, the number of training steps has risen from 4882 to 86849, which is not surprising given how much more complex the model has become: more complex networks need more iterations to find the optimal weights.

evaluate the results as we did before

model_results2 <- compute(concrete_model2, concrete_test[1:8])
predicted_strength2 <- model_results2$net.result
cor(predicted_strength2, concrete_test$strength)

Applying the same procedure to compare the predicted values with the true values, we now obtain a correlation of about 0.92, a considerable improvement over the 0.80 obtained earlier with a single hidden node.
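
If you want to experiment further, neuralnet also accepts a vector for the hidden argument, one entry per hidden layer; a sketch of one such experiment (the c(5, 3) architecture is an illustrative choice, not from the original post, and results are not guaranteed to improve):

set.seed(12345)
concrete_model3 <- neuralnet(strength ~ cement + slag + ash + water +
                               superplastic + coarseagg + fineagg + age,
                             data = concrete_train, hidden = c(5, 3))
model_results3 <- compute(concrete_model3, concrete_test[1:8])
cor(model_results3$net.result, concrete_test$strength)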

Corrections are welcome ~~~ (if you need the data, message me your email)
