Python Stock Analysis Series - Basic Stock Data Manipulation (b)

 

This video series has been reposted to bilibili.

Welcome to Part 4 of the Python for Finance tutorial series. In this tutorial, we'll create a candlestick / OHLC chart based on the Adj Close column, which will let me introduce resampling and a few other data visualization concepts.

An OHLC chart, also called a candlestick chart, condenses the open, high, low, and close prices of a dataset into one neat graphical format. It also makes pretty colors, and remember what I told you about good-looking charts?

In the previous tutorials, we got as far as this:

 

 
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib import style
import pandas as pd
import pandas_datareader.data as web
style.use('ggplot')

df = pd.read_csv('tsla.csv', parse_dates=True, index_col=0)
 

Unfortunately, pandas has no built-in way to draw candlestick charts, even though it can produce OHLC data directly. I am sure this chart type will be available eventually, but it is not there yet. No matter, we can build it ourselves! First, we need two new imports:

 

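# matplotlib.finance was removed in matplotlib 2.2; the same function now
# lives in the separate mplfinance package (pip install mplfinance).
from mplfinance.original_flavor import candlestick_ohlc
import matplotlib.dates as mdates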

The first import brings in the candlestick_ohlc charting function, and the second is the special mdates module, which is mostly just a pain, but it is the date type that matplotlib graphs use. pandas handles this for you automatically, but, like I said, we don't have that luxury with candlesticks yet.

First, we need proper OHLC data. Our current data does have OHLC values and, unless I'm mistaken, Tesla has never had a split, but you won't always be this lucky. So we will create our own OHLC data, which also lets us show another data transformation that pandas offers:

 

df_ohlc = df['Adj Close'].resample('10D').ohlc()

What we've done here is create a new dataframe based on the df['Adj Close'] column, resampled into 10-day windows, where the resampling produces OHLC (open, high, low, close) values. We could also use .mean() or .sum() to get 10-day averages or 10-day sums. Keep in mind that this 10-day average is one average per 10-day block, not a rolling average. Since our data is daily, resampling it to 10-day data shrinks the dataset significantly.

This is also how you can normalize multiple datasets. Sometimes one dataset is recorded on the first of every month, another at the end of every month, and yet another weekly. You can resample all of them to month-end and effectively normalize them! This is a more advanced pandas feature, and you can learn more about it in the pandas series if you like.

We want to graph both the candlestick data and the volume data. We don't have to resample the volume data, but we should, since it would be too granular compared to our 10D pricing data.
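
To make the difference between resampling and a rolling window concrete, here is a minimal sketch on made-up data. The prices series below is synthetic (30 days with values 1.0 to 30.0), not the Tesla data, and the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily "prices": 30 consecutive calendar days, values 1.0 .. 30.0.
dates = pd.date_range("2020-01-01", periods=30, freq="D")
prices = pd.Series(np.arange(1.0, 31.0), index=dates, name="Adj Close")

# Resampling collapses each 10-day bin into a single row.
ohlc_10d = prices.resample("10D").ohlc()
mean_10d = prices.resample("10D").mean()

# A rolling mean keeps one row per day; each value averages the last 10 days.
rolling_10d = prices.rolling(window=10).mean()

print(ohlc_10d)
print(len(mean_10d), len(rolling_10d))
```

The resampled results have one row per 10-day bin, while the rolling result is still daily and starts with nine empty values, because a full 10-day window is not available until the tenth day.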

 

df_volume = df['Volume'].resample('10D').sum()

We use sum here, since we really want to know the total volume traded over those 10 days, but you could also use the mean instead. Now if we do this:
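
As a quick illustration of that choice, the sketch below resamples a made-up volume series both ways. The flat 100-shares-per-day series is an assumption chosen so the numbers are easy to check by eye.

```python
import pandas as pd

# Synthetic daily volume: 20 days, 100 shares traded each day.
dates = pd.date_range("2020-01-01", periods=20, freq="D")
volume = pd.Series([100] * 20, index=dates, name="Volume")

# sum() gives the total traded in each 10-day bin.
total_10d = volume.resample("10D").sum()
# mean() gives the typical daily volume within each bin.
average_10d = volume.resample("10D").mean()

print(total_10d)
print(average_10d)
```

Summing preserves the overall traded volume across the bins, which is why it pairs naturally with a 10-day candlestick.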

 

print(df_ohlc.head())

we get:

 

 
                 open       high        low      close
Date                                                  
2010-06-29  23.889999  23.889999  15.800000  17.459999
2010-07-09  17.400000  20.639999  17.049999  20.639999
2010-07-19  21.910000  21.910000  20.219999  20.719999
2010-07-29  20.350000  21.950001  19.590000  19.590000
2010-08-08  19.600000  19.600000  17.600000  19.150000
 

This is what we expect. However, we now want to move this information over to matplotlib and convert the dates to the mdates version. Since we are just going to graph the columns in matplotlib, we don't actually want the date to be an index anymore, so we can do this:

 

df_ohlc = df_ohlc.reset_index()

Now the date is just a regular column. Next, we want to convert it:

 

df_ohlc['Date'] = df_ohlc['Date'].map(mdates.date2num)

Now we're going to set up the figure:
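
If you want to see what these two steps do, here is a small sketch on a hypothetical two-row frame shaped like df_ohlc. The exact float that date2num returns depends on the matplotlib version's date epoch, so the sketch only looks at the difference between two converted dates and at the round trip through num2date.

```python
import matplotlib.dates as mdates
import pandas as pd

# Hypothetical two-row OHLC frame indexed by date, shaped like df_ohlc.
frame = pd.DataFrame(
    {"open": [23.89, 17.40], "high": [23.89, 20.64],
     "low": [15.80, 17.05], "close": [17.46, 20.64]},
    index=pd.DatetimeIndex(["2010-06-29", "2010-07-09"], name="Date"),
)

# Move the index into an ordinary "Date" column, then convert it to floats.
frame = frame.reset_index()
frame["Date"] = frame["Date"].map(mdates.date2num)

print(frame.dtypes)
# The two dates are 10 days apart, so the converted numbers differ by 10 days.
print(frame["Date"].iloc[1] - frame["Date"].iloc[0])
# num2date reverses the conversion.
print(mdates.num2date(frame["Date"].iloc[0]).date())
```

The converted column is a plain float column counting days, which is the form the candlestick function reads for its x values.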

 

fig = plt.figure()
ax1 = plt.subplot2grid((6,1), (0,0), rowspan=5, colspan=1)
ax2 = plt.subplot2grid((6,1), (5,0), rowspan=1, colspan=1, sharex=ax1)
ax1.xaxis_date()

You've seen everything here before except ax1.xaxis_date(). What this does is convert the axis from the raw mdates numbers to dates.

Now we can draw the candlestick chart:
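
Here is a minimal sketch of that layout on its own, with no data, to show what sharex buys us. The Agg backend and the names ax_price and ax_volume are assumptions for illustration, so that the snippet runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt

fig = plt.figure()
# A 6-row, 1-column grid: the price axis takes rows 0-4, the volume axis row 5.
ax_price = plt.subplot2grid((6, 1), (0, 0), rowspan=5, colspan=1)
ax_volume = plt.subplot2grid((6, 1), (5, 0), rowspan=1, colspan=1, sharex=ax_price)
# Tell matplotlib to format the x ticks on the price axis as dates.
ax_price.xaxis_date()

# Because the x-axis is shared, changing the limits on one axis changes both.
ax_price.set_xlim(10, 20)
print(ax_volume.get_xlim())
```

This is why zooming or panning the price chart keeps the volume chart lined up underneath it.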

 

candlestick_ohlc(ax1, df_ohlc.values, width=2, colorup='g')

Then plot the volume:
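
Each row handed to candlestick_ohlc is read as (date number, open, high, low, close). If that function is not available in your setup, a rough stand-in can be drawn with plain matplotlib. The draw_candles helper below is hypothetical and is not the library's implementation; it is just a sketch of the idea, with made-up rows.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt


def draw_candles(ax, rows, width=2, colorup="g", colordown="r"):
    """Draw one candle per (x, open, high, low, close) row. Hypothetical helper."""
    for x, open_, high, low, close in rows:
        color = colorup if close >= open_ else colordown
        # Wick: a thin vertical line from the low to the high.
        ax.vlines(x, low, high, color=color, linewidth=1)
        # Body: a bar spanning the open and close prices.
        ax.bar(x, abs(close - open_), width=width,
               bottom=min(open_, close), color=color)


rows = [
    (0, 23.89, 23.89, 15.80, 17.46),   # closes below its open: drawn in red
    (10, 17.40, 20.64, 17.05, 20.64),  # closes above its open: drawn in green
]
fig, ax = plt.subplots()
draw_candles(ax, rows)
print(len(ax.patches), len(ax.collections))
```

The x values here are plain numbers. With real data they would be the date2num floats from the Date column.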

 

ax2.fill_between(df_volume.index.map(mdates.date2num), df_volume.values, 0)

The fill_between function plots x and y and then fills the area between y and a second value. In our case, we choose 0.
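
A tiny standalone sketch of fill_between, using made-up numbers in place of the volume data and assuming the Agg backend so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt

x = [0, 10, 20, 30]
volume = [5, 8, 3, 6]

fig, ax = plt.subplots()
# Shade the region between the volume curve (y1) and the constant 0 (y2).
filled = ax.fill_between(x, volume, 0)
print(type(filled).__name__)
```

Passing 0 as the third argument shades down to the x-axis, which gives the solid volume area under the candlesticks.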

 

plt.show()

 

Complete code:

 

 
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib import style
# matplotlib.finance was removed in matplotlib 2.2; mplfinance provides the same function
from mplfinance.original_flavor import candlestick_ohlc
import matplotlib.dates as mdates
import pandas as pd
import pandas_datareader.data as web
style.use('ggplot')

df = pd.read_csv('tsla.csv', parse_dates=True, index_col=0)

df_ohlc = df['Adj Close'].resample('10D').ohlc()
df_volume = df['Volume'].resample('10D').sum()

df_ohlc.reset_index(inplace=True)
df_ohlc['Date'] = df_ohlc['Date'].map(mdates.date2num)

ax1 = plt.subplot2grid((6,1), (0,0), rowspan=5, colspan=1)
ax2 = plt.subplot2grid((6,1), (5,0), rowspan=1, colspan=1, sharex=ax1)
ax1.xaxis_date()

candlestick_ohlc(ax1, df_ohlc.values, width=5, colorup='g')
ax2.fill_between(df_volume.index.map(mdates.date2num), df_volume.values, 0)
plt.show()


Origin www.cnblogs.com/medik/p/10989784.html