Keras Deep Learning - House Price Prediction Using Neural Networks


House price prediction

In this section, we will study a continuous-output problem by trying to predict house prices. We use the Boston house price dataset, taking as input 13 variables that may affect house prices, and our aim is to minimize the error in the predicted prices. This lets us explore a practical application of neural networks.

Since the goal is to minimize the error, we first define the error to be minimized: the absolute error (or, alternatively, the squared error). Now that we have an objective to optimize, let's define a strategy for solving this problem:

  • Normalize the input dataset, scaling all variables to the range 0-1.
  • Split the given data into training and testing datasets.
  • Build a model with a single hidden layer.
  • Compile the model using the Adam optimizer and define the loss function as the mean absolute error value.
  • Fit the model.
  • Make predictions on the test dataset.
  • Calculate the error in predictions on the test dataset.
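
Before moving on to the implementation, the two error measures mentioned above can be written directly with NumPy. This is only a minimal illustration; the arrays y_true and y_pred are hypothetical actual and predicted prices:

import numpy as np

# hypothetical actual and predicted house prices, for illustration only
y_true = np.array([24.0, 21.6, 34.7])
y_pred = np.array([25.1, 20.0, 33.0])

mae = np.mean(np.abs(y_true - y_pred))   # mean absolute error
mse = np.mean((y_true - y_pred) ** 2)    # mean squared error
print(mae, mse)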

Having defined the approach, we next implement it in code.

  1. Import the relevant datasets:
from keras.datasets import boston_housing
import numpy as np

# load the Boston housing data, already split into training and test sets
(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()
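
To confirm what was loaded, we can inspect the shapes of the arrays; with the default split, the Keras Boston housing dataset contains 404 training samples and 102 test samples, each described by 13 features:

print(train_data.shape, test_data.shape)        # (404, 13) (102, 13)
print(train_targets.shape, test_targets.shape)  # (404,) (102,)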
  2. Normalize the input and output datasets so that all variables range from zero to one:
# maximum target value in the training set (used to scale the targets)
max_target = np.max(train_targets)
# scale each input feature by its maximum value in the training data
train_data2 = train_data / np.max(train_data, axis=0)
test_data2 = test_data / np.max(train_data, axis=0)
train_targets = train_targets / max_target
test_targets = test_targets / max_target

We normalize the test dataset with the maximum values from the training dataset, as no values from the test dataset should be used during model building.
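
The same "fit the scaling on the training data only" idea can be expressed with a scaler object. Below is a sketch using scikit-learn's MinMaxScaler, assuming scikit-learn is installed; note that, unlike the plain division above, it also subtracts each feature's minimum:

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
train_data_scaled = scaler.fit_transform(train_data)  # statistics computed on training data only
test_data_scaled = scaler.transform(test_data)        # the same statistics reused for the test data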

  3. With the input and output datasets ready, define the model:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.regularizers import l1

# a single hidden layer with 64 units; L1 regularization on both layers
model = Sequential()
model.add(Dense(64, input_dim=13, activation='relu', kernel_regularizer=l1(0.1)))
model.add(Dense(1, activation='relu', kernel_regularizer=l1(0.1)))
model.summary()

The model summary is output as follows:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #  
=================================================================
dense (Dense)                (None, 64)                896      
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 65       
=================================================================
Total params: 961
Trainable params: 961
Non-trainable params: 0
_________________________________________________________________

We use L1 regularization so that the model does not overfit (the number of data points in the training data is small).
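
Dropout (imported above) is another common way to combat overfitting and could be used instead of, or together with, L1 regularization. The following is only a sketch of such a variant, not the configuration used in this section:

# alternative model: dropout on the hidden layer instead of L1 regularization
model_dropout = Sequential()
model_dropout.add(Dense(64, input_dim=13, activation='relu'))
model_dropout.add(Dropout(0.2))  # randomly drop 20% of the hidden activations during training
model_dropout.add(Dense(1, activation='relu'))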

  4. Compile the model to minimize the mean absolute error value:
model.compile(loss='mean_absolute_error', optimizer='adam')
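
If you want to control the learning rate explicitly or track an additional metric during training, the compile call can be written more verbosely. This is only a sketch; the learning rate shown is an example value (in older standalone Keras versions the argument is named lr instead of learning_rate):

from keras.optimizers import Adam

model.compile(loss='mean_absolute_error',
              optimizer=Adam(learning_rate=0.001),
              metrics=['mse'])  # also report the mean squared error each epoch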
  5. Finally, fit the model:
history = model.fit(train_data2,
                    train_targets,
                    validation_data=(test_data2, test_targets),
                    epochs=100,
                    batch_size=64,
                    verbose=1)
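
The history object returned by fit records the training and validation loss per epoch. A minimal sketch for plotting the two curves, assuming matplotlib is available:

import matplotlib.pyplot as plt

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('mean absolute error (normalized scale)')
plt.legend()
plt.show()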
  6. Calculate and print the mean absolute error on the test dataset:
# flatten the (N, 1) predictions to match the shape of test_targets
print(np.mean(np.abs(model.predict(test_data2).flatten() - test_targets)) * 50)

It can be seen that the mean absolute error is approximately 7.7:

7.670271360777928
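
The factor of 50 above rescales the error from the normalized 0-1 range back to the original price units; it corresponds to max_target, the maximum training target computed earlier. The same scaling can be used to obtain predictions in the original units, as in the following sketch:

# convert normalized predictions and targets back to the original price scale
pred_prices = model.predict(test_data2).flatten() * max_target
actual_prices = test_targets * max_target
print(pred_prices[:5])
print(actual_prices[:5])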

In the next section, you will learn to use a custom loss function to further reduce the mean absolute error value.

Use a custom loss function

In the previous section, we used the predefined mean absolute error loss function to perform the optimization. In this section, we will learn to define and use a custom loss function.

The custom loss function we will use is a modified mean squared error value, where the error is the difference between the square root of the actual value and the square root of the predicted value. The custom loss function is defined as follows:

from keras import backend as K

def loss_function(y_true, y_pred):
    return K.square(K.sqrt(y_pred) - K.sqrt(y_true))
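
Since K.sqrt is undefined for negative inputs, this loss relies on the predictions being non-negative, which the relu activation on the output layer guarantees here. A slightly more defensive variant, shown only as a sketch, clamps the values to a small positive constant before taking the square root:

from keras import backend as K

def safe_loss_function(y_true, y_pred):
    # clamp to a small positive constant to avoid the square root of negative values
    y_pred = K.maximum(y_pred, K.epsilon())
    y_true = K.maximum(y_true, K.epsilon())
    return K.square(K.sqrt(y_pred) - K.sqrt(y_true))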

Next, compile the model with the same input and output datasets and the same neural network architecture as in the previous section, but using the custom loss function:

model.compile(loss=loss_function, optimizer='adam')
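
As an optional variant of the compile call above, the built-in mean absolute error can be tracked as a metric while the custom loss is being optimized, which makes the two sections easier to compare (a sketch; this does not change the optimization objective):

model.compile(loss=loss_function,
              optimizer='adam',
              metrics=['mean_absolute_error'])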

In the preceding code, we set the loss to the custom loss_function. Finally, we fit the model and calculate the mean absolute error of its predictions on the test dataset.

history = model.fit(train_data2, train_targets,
                    validation_data=(test_data2, test_targets),
                    epochs=100,
                    batch_size=64,
                    verbose=1)
print(np.mean(np.abs(model.predict(test_data2).flatten() - test_targets)) * 50)
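
Note that the code above re-compiles and continues training the model object from the previous section, so it starts from already-trained weights. For a fully independent comparison of the two loss functions, one could rebuild the model before compiling, as in this sketch of the earlier architecture:

# rebuild the same architecture so training starts from fresh weights
model = Sequential()
model.add(Dense(64, input_dim=13, activation='relu', kernel_regularizer=l1(0.1)))
model.add(Dense(1, activation='relu', kernel_regularizer=l1(0.1)))
model.compile(loss=loss_function, optimizer='adam')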

After fitting the model, you can see that the mean absolute error is approximately 6.6, which is somewhat smaller than with the mean_absolute_error loss function:

6.566271535383652


Origin juejin.im/post/7088693496931958797