Python predicts Tesla stock based on LSTM

Foreword

Stock forecasting refers to the practice in which securities analysts who have a deep understanding of the stock market predict, based on how the market has developed so far, the future direction of the market and the magnitude of its rises and falls. Such predictions hold only under the assumed factors taken as preconditions.

LSTM stands for Long Short-Term Memory. As the name suggests, it is a neural network capable of remembering both long-term and short-term information. LSTM was first proposed by Hochreiter & Schmidhuber [1] in 1997. With the rise of deep learning around 2012, LSTM was refined over several generations by leading researchers (Felix Gers, Fred Cummins, Santiago Fernandez, Justin Bayer, Daan Wierstra, Julian Togelius, Faustino Gomez, Matteo Gagliolo, and Alex Graves), forming a relatively systematic and complete LSTM framework that has been widely applied in many fields. This article focuses on LSTMs in the deep learning era.
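Conceptually, an LSTM cell maintains a long-term cell state alongside a short-term hidden state, and three gates (forget, input, output) decide what to erase, write, and read at each step. The snippet below is a minimal NumPy sketch of a single LSTM time step using the standard gate equations; the lstm_step helper, weight shapes, and toy numbers are purely illustrative and are not part of the code in this article.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W, U, b stack the weights of the forget (f), input (i),
    # candidate (g) and output (o) gates, in that order.
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0*n:1*n])        # forget gate: how much old memory to keep
    i = sigmoid(z[1*n:2*n])        # input gate: how much new info to write
    g = np.tanh(z[2*n:3*n])        # candidate values for the new memory
    o = sigmoid(z[3*n:4*n])        # output gate: how much memory to expose
    c_t = f * c_prev + i * g       # long-term memory (cell state)
    h_t = o * np.tanh(c_t)         # short-term memory (hidden state / output)
    return h_t, c_t

# Toy usage: 3 input features, 5 hidden units, one time step
rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)
print(h.shape, c.shape)   # (5,) (5,)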



1. Import packages

import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')
plt.style.use("fivethirtyeight")
%matplotlib inline
              
from datetime import datetime

2. Download data

Load the Tesla stock price history.
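The CSV used below sits in a Kaggle-style input directory. If you need to fetch the data yourself, one possible route, assuming the yfinance package is installed (it is not used anywhere in the original code), is:

# Optional: download the TSLA price history yourself (assumes: pip install yfinance).
# auto_adjust=False keeps the 'Adj Close' column that the code below relies on.
import yfinance as yf

tsla = yf.download("TSLA", start="2015-01-01", end="2023-01-01", auto_adjust=False)
tsla.to_csv("TSLA.csv")   # columns: Open, High, Low, Close, Adj Close, Volume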

# Read the TSLA price history into a DataFrame
df = pd.read_csv("/tesla-inc-tsla-stock-price/TSLA.csv")
TESLA = df

company_list = [TESLA]
company_name = ["TESLA"]

for company, com_name in zip(company_list, company_name):
    company["company_name"] = com_name
    
df = pd.concat(company_list, axis=0)
df.tail(10)


Descriptive statistics

df.describe()


df.info()


df.columns

Stock price chart

plt.figure(figsize=(15, 10))
plt.subplots_adjust(top=1.25, bottom=1.2)

for i, company in enumerate(company_list, 1):
    plt.subplot(2, 2, i)
    company['Adj Close'].plot()
    plt.ylabel('Adj Close')
    plt.xlabel(None)
    plt.title("Closing Price of TESLA")
  
    
plt.tight_layout()

Volume chart

# Now let's plot the total volume of stock being traded each day
plt.figure(figsize=(15, 10))
plt.subplots_adjust(top=1.25, bottom=1.2)

for i, company in enumerate(company_list, 1):
    plt.subplot(2, 2, i)
    company['Volume'].plot()
    plt.ylabel('Volume')
    plt.xlabel(None)
    plt.title("Sales Volume for TESLA")
    
plt.tight_layout()


3. Construct technical indicators

ma_day = [10, 20, 50]

for ma in ma_day:
    for company in company_list:
        column_name = f"MA for {ma} days"
        company[column_name] = company['Adj Close'].rolling(ma).mean()
        

fig, axes = plt.subplots(nrows=2, ncols=2)
fig.set_figheight(10)
fig.set_figwidth(15)

TESLA[['Adj Close', 'MA for 10 days', 'MA for 20 days', 'MA for 50 days']].plot(ax=axes[0,0])
axes[0,0].set_title('TESLA STOCK PRICE')
fig.tight_layout()

Daily return chart

# We'll use pct_change to find the percent change for each day
for company in company_list:
    company['Daily Return'] = company['Adj Close'].pct_change()

# Then we'll plot the daily return percentage
fig, axes = plt.subplots(nrows=2, ncols=2)
fig.set_figheight(10)
fig.set_figwidth(15)

TESLA['Daily Return'].plot(ax=axes[0,0], legend=True, linestyle='--', marker='o')
axes[0,0].set_title('TESLA DAILY RETURN')
fig.tight_layout()


plt.figure(figsize=(12, 9))

for i, company in enumerate(company_list, 1):
    plt.subplot(2, 2, i)
    company['Daily Return'].hist(bins=50)
    plt.xlabel('Daily Return')
    plt.ylabel('Counts')
    plt.title("TESLA STOCK PRICE")
    
plt.tight_layout()


# We can simply call pairplot on our DataFrame for an automatic visual analysis 
# of all the comparisons

sns.pairplot(df, kind='reg')


plt.figure(figsize=(16,6))
plt.title('Close Price History')
plt.plot(df['Close'])
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price USD ($)', fontsize=18)
plt.show()


# Create a new dataframe with only the 'Close' column
data = df.filter(['Close'])
# Convert the dataframe to a numpy array
dataset = data.values
# Get the number of rows to train the model on
training_data_len = int(np.ceil( len(dataset) * .95 ))

training_data_len


4. Data standardization

We use sklearn to standardize/normalize the data and, later, to invert the scaling.
When training a model, a common preprocessing step is to rescale the data so that the model converges as quickly as possible; this is handled with the sklearn.preprocessing module.

The difference between standardization and normalization: normalization is really just one form of rescaling, and it maps the data into the interval [0, 1]. Standardization instead scales the data so that it has mean 0 and standard deviation 1, which means standardized values can be both positive and negative. A small comparison of the two is sketched below; the stock pipeline itself uses min-max normalization.
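As a quick illustration of that difference, here is a standalone sketch on toy data (not part of the stock pipeline; both scalers come from sklearn.preprocessing):

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

toy = np.array([[1.0], [2.0], [3.0], [10.0]])

# Normalization: values are mapped into [0, 1]
normalized = MinMaxScaler(feature_range=(0, 1)).fit_transform(toy)
print(normalized.ravel())        # approximately [0. 0.111 0.222 1.]

# Standardization: mean 0, standard deviation 1, values can be negative
standardized = StandardScaler().fit_transform(toy)
print(standardized.ravel(), standardized.mean(), standardized.std())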

# Scale the data
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0,1))
scaled_data = scaler.fit_transform(dataset)

scaled_data


5. Split into training and test sets

Training set

# Create the training data set 
# Create the scaled training data set
train_data = scaled_data[0:int(training_data_len), :]
# Split the data into x_train and y_train data sets
x_train = []
y_train = []

# Each sample is the previous 60 scaled closes; the label is the next day's close
for i in range(60, len(train_data)):
    x_train.append(train_data[i-60:i, 0])
    y_train.append(train_data[i, 0])
    if i <= 61:
        print(x_train)
        print(y_train)
        print()
        
# Convert the x_train and y_train to numpy arrays 
x_train, y_train = np.array(x_train), np.array(y_train)

# Reshape the data
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
# x_train.shape

Test set

# Create the testing data set
# Create a new array containing scaled values from index 1543 to 2002 
test_data = scaled_data[training_data_len - 60: , :]
# Create the data sets x_test and y_test
x_test = []
y_test = dataset[training_data_len:, :]
for i in range(60, len(test_data)):
    x_test.append(test_data[i-60:i, 0])
    
# Convert the data to a numpy array
x_test = np.array(x_test)

# Reshape the data
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1 ))

6. Build and train the LSTM model

from keras.models import Sequential
from keras.layers import Dense, LSTM

# Build the LSTM model
model = Sequential()
model.add(LSTM(128, return_sequences=True, input_shape= (x_train.shape[1], 1)))
model.add(LSTM(64, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(x_train, y_train, batch_size=1, epochs=1)


7. Forecast

# Get the models predicted price values 
predictions = model.predict(x_test)
predictions = scaler.inverse_transform(predictions)

# Get the root mean squared error (RMSE)
rmse = np.sqrt(np.mean(((predictions - y_test) ** 2)))
rmse

Plot the prediction results

# Plot the data
train = data[:training_data_len]
valid = data[training_data_len:]
valid['Predictions'] = predictions
# Visualize the data
plt.figure(figsize=(16,6))
plt.title('Model')
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price USD ($)', fontsize=18)
plt.plot(train['Close'])
plt.plot(valid[['Close', 'Predictions']])
plt.legend(['Train', 'Val', 'Predictions'], loc='lower right')
plt.show()


Origin: blog.csdn.net/weixin_39559994/article/details/128781163