Python Stock Analysis Series: Basic Stock Data Operations (1)
This video series is also available on bilibili.
Welcome to Part 3 of the Python for Finance tutorial series. In this tutorial, we will use our stock data to work through some basic data manipulation and visualization. The starting code (introduced in the previous tutorial) is:
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
import pandas_datareader.data as web

style.use('ggplot')

df = pd.read_csv('tsla.csv', parse_dates=True, index_col=0)
The pandas module comes with a range of built-in functions, as well as ways to create custom pandas functions. We will cover custom functions later on, but for now let's perform a very common operation on this data: moving averages.
The idea of a simple moving average is to take a window of time and calculate the average price within that window. Then we shift the window over one period and do it again. In our case, we will compute a 100-day moving average (100ma). It takes the current price together with the prices from the previous 99 days, adds them up, and divides by 100 to give the current 100-day moving average. Then we move the window over one day and do the same thing again. Doing this in pandas is very simple:
df['100ma'] = df['Adj Close'].rolling(window=100).mean()
Writing df['100ma'] lets us either redefine the contents of an existing column (if we already had one named '100ma') or create a new column, which is what we are doing here. We are saying that the df['100ma'] column equals the df['Adj Close'] column with a rolling method applied, using a window of 100, where the operation performed on each window is mean() (the average).
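To make the window idea concrete, here is a minimal sketch that runs the same rolling(...).mean() call on a tiny series and compares it with averages computed by hand. The prices and the window of 3 are hypothetical values chosen for illustration; they do not come from tsla.csv.

```python
import pandas as pd

# Hypothetical toy prices, chosen only to keep the arithmetic easy to follow
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# Same call as in the tutorial, with a window of 3 rows instead of 100
rolling = prices.rolling(window=3).mean()

# The same averages by hand: each row plus the two rows before it, divided by 3
manual = [prices.iloc[i - 2:i + 1].sum() / 3 for i in range(2, len(prices))]

print(rolling.tolist())
print(manual)
```

The first two entries of the rolling result are NaN because a full window of 3 rows is not yet available at those positions; from the third row onward the values should match the hand-computed list.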
Now, we can do this:
print(df.head())
                  Date       Open   High        Low      Close    Volume  \
Date
2010-06-29  2010-06-29  19.000000  25.00  17.540001  23.889999  18766300
2010-06-30  2010-06-30  25.790001  30.42  23.299999  23.830000  17187100
2010-07-01  2010-07-01  25.000000  25.92  20.270000  21.959999   8218800
2010-07-02  2010-07-02  23.000000  23.10  18.709999  19.200001   5139800
2010-07-06  2010-07-06  20.000000  20.00  15.830000  16.110001   6866900

            Adj Close  100ma
Date
2010-06-29  23.889999    NaN
2010-06-30  23.830000    NaN
2010-07-01  21.959999    NaN
2010-07-02  19.200001    NaN
2010-07-06  16.110001    NaN
What happened? In the 100ma column we see only NaN. We chose a 100-day moving average, which in theory needs 100 data points before it can be calculated, so the first 99 rows have no value. NaN means "Not a Number". pandas lets you handle missing data in many ways, but for now let's simply change the min_periods (minimum periods) parameter:
df['100ma'] = df['Adj Close'].rolling(window=100, min_periods=0).mean()
print(df.head())
                  Date       Open   High        Low      Close    Volume  \
Date
2010-06-29  2010-06-29  19.000000  25.00  17.540001  23.889999  18766300
2010-06-30  2010-06-30  25.790001  30.42  23.299999  23.830000  17187100
2010-07-01  2010-07-01  25.000000  25.92  20.270000  21.959999   8218800
2010-07-02  2010-07-02  23.000000  23.10  18.709999  19.200001   5139800
2010-07-06  2010-07-06  20.000000  20.00  15.830000  16.110001   6866900

            Adj Close      100ma
Date
2010-06-29  23.889999  23.889999
2010-06-30  23.830000  23.860000
2010-07-01  21.959999  23.226666
2010-07-02  19.200001  22.220000
2010-07-06  16.110001  20.998000
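The effect of min_periods can be checked in isolation. The sketch below is only an illustration: it copies the five Adj Close values from the output above into a standalone series (so it runs without tsla.csv) and compares the default behavior with min_periods=0.

```python
import pandas as pd

# The five Adj Close values shown in the output above
adj_close = pd.Series([23.889999, 23.830000, 21.959999, 19.200001, 16.110001])

# Default: min_periods equals the window size, so 100 rows are required
strict = adj_close.rolling(window=100).mean()

# Relaxed: average over however many rows are available so far
relaxed = adj_close.rolling(window=100, min_periods=0).mean()

print(strict.isna().sum())   # number of NaN rows with the default setting
print(relaxed.round(6).tolist())
```

With the default setting every row is NaN, since none of them has 100 observations behind it. With min_periods=0 each row is the average of all rows up to that point, which is why the early values of the 100ma column are averages over far fewer than 100 days and should be read with that in mind.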
There, it works, and now we want to see it! But we have already seen simple charts, so how about something slightly more complex?
ax1 = plt.subplot2grid((6,1), (0,0), rowspan=5, colspan=1)
ax2 = plt.subplot2grid((6,1), (5,0), rowspan=1, colspan=1, sharex=ax1)
Basically, we are saying we want to create two subplots, and both subplots act as if they sit on a 6x1 grid, with 6 rows and 1 column. The first subplot starts at (0,0) on that grid, spans 5 rows, and spans 1 column. The next axis is also on a 6x1 grid, but it starts at (5,0) and spans 1 row and 1 column. The second axis also has sharex=ax1, which means ax2 will always keep its x axis aligned with the x axis of ax1, and vice versa. Now we just make our plots:
ax1.plot(df.index, df['Adj Close'])
ax1.plot(df.index, df['100ma'])
ax2.bar(df.index, df['Volume'])

plt.show()
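In summary, we plot the adjusted close and the 100ma on the first axis, and the volume on the second axis. The result is a price chart with its moving average on top and a volume bar chart underneath.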
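The grid layout and the shared x axis can also be verified without the stock data. This is a minimal sketch under a few assumptions: it uses the non-interactive Agg backend so it runs without a display, and the plotted numbers are hypothetical stand-ins for the price and volume columns.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

# Same layout as in the tutorial: a 6x1 grid, 5 rows on top and 1 row below
ax1 = plt.subplot2grid((6, 1), (0, 0), rowspan=5, colspan=1)
ax2 = plt.subplot2grid((6, 1), (5, 0), rowspan=1, colspan=1, sharex=ax1)

# Hypothetical stand-in data for price (line) and volume (bars)
ax1.plot([0, 1, 2, 3], [10, 12, 11, 13])
ax2.bar([0, 1, 2, 3], [5, 3, 4, 6])

# Because of sharex, changing the x-limits on ax1 also changes them on ax2
ax1.set_xlim(1, 2)
shared_limits = tuple(ax2.get_xlim())
print(shared_limits)

# Heights of the two axes as fractions of the figure height
top_height = ax1.get_position().height
bottom_height = ax2.get_position().height
print(top_height > bottom_height)
```

The top axis comes out several times taller than the bottom one, which is the point of giving it 5 of the 6 rows: the price chart gets most of the space while the volume bars stay visible underneath.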
The complete code so far:
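import datetime as dt
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
import pandas_datareader.data as web

style.use('ggplot')

# Load the saved price data, using the first column (dates) as the index
df = pd.read_csv('tsla.csv', parse_dates=True, index_col=0)

# 100-day moving average of the adjusted close; early rows average what is available
df['100ma'] = df['Adj Close'].rolling(window=100, min_periods=0).mean()
print(df.head())

# Two stacked subplots on a 6x1 grid that share the same x axis
ax1 = plt.subplot2grid((6,1), (0,0), rowspan=5, colspan=1)
ax2 = plt.subplot2grid((6,1), (5,0), rowspan=1, colspan=1, sharex=ax1)

# Price and moving average on top, volume bars below
ax1.plot(df.index, df['Adj Close'])
ax1.plot(df.index, df['100ma'])
ax2.bar(df.index, df['Volume'])

plt.show()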
In the next few tutorials, we will learn how to make candlestick charts by resampling data with pandas, and learn more about working with Matplotlib.