Python stock prediction (tushare): original Bayes-based code

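1. Introduction to tushare

Tushare is a free, open-source Python package of financial data interfaces. It mainly covers the pipeline for stocks and other financial data, from collection through cleaning and processing to storage, and gives financial analysts fast, tidy, and varied data that is easy to analyze. This greatly reduces their data acquisition workload and lets them concentrate on researching and implementing strategies and models. Given the strengths of Python's pandas package in quantitative finance, most of the data Tushare returns comes as pandas DataFrame objects, which makes it convenient to analyze and visualize with pandas/NumPy/Matplotlib.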

2. Importing stock data

Use Tushare to import the data; here we import the data for hs300.

import tushare as ts
import pandas as pd
import numpy as np
import talib
df = ts.get_hist_data('hs300')  # historical quotes for hs300, returned as a DataFrame

Part of the data is shown here:
[image: sample rows of the returned data]
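
The code further down compares each row with its neighbouring rows, so the row order of the data matters. It may be worth checking the order before going on. Below is a minimal sketch, assuming the index holds date strings; `ensure_chronological` is a hypothetical helper of my own, not part of tushare, and the sample frame is made up.

```python
import pandas as pd

def ensure_chronological(df):
    """Return a copy of df sorted so that the oldest row comes first."""
    out = df.copy()
    out.index = pd.to_datetime(out.index)  # assumption: the index holds date strings
    return out.sort_index()

# Tiny synthetic frame standing in for the downloaded data (values are made up).
sample = pd.DataFrame(
    {"close": [3.0, 2.0, 1.0]},
    index=["2020-01-03", "2020-01-02", "2020-01-01"],
)
ordered = ensure_chronological(sample)
```

With the rows sorted oldest first, "row i+1" really is the trading day after "row i".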

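3. Computing indicators with talib

talib is a package for computing stock technical indicators. This blog post explains how to install talib quite clearly and is worth consulting; remember to pick the installation package that matches your Python version: https://blog.csdn.net/u010671948/article/details/79714647
Here we compute the RSI and CCI indicators.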

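data = pd.DataFrame()
close = np.array(df.close)  # plain array, so close[i] is always positional
data['a2'] = talib.RSI(close, 7)  # RSI indicator, period 7
data['a4'] = talib.CCI(np.array(df.high), np.array(df.low), close, 14)  # CCI indicator, period 14

# a1: change in close relative to the previous row (0 for the first row)
a1 = [0]
for i in range(1, len(close)):
    a1.append(close[i]-close[i-1])
data['a1'] = a1

# y: change in close from this row to the next row (0 for the last row)
y = []
for i in range(0, len(close)-1):
    y.append(close[i+1]-close[i])
y.append(0)
data['y'] = y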
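
The two loops above can also be written without explicit indexing. This is only a sketch of an equivalent vectorized form using pandas `diff` and `shift`; `make_change_columns` is a hypothetical helper name, and the input values are made up.

```python
import pandas as pd

def make_change_columns(close):
    """Build a1 (change from the previous row) and y (change to the next row)."""
    close = pd.Series(close, dtype="float64").reset_index(drop=True)
    step = close.diff()               # step[i] = close[i] - close[i-1], NaN for row 0
    a1 = step.fillna(0.0)             # first row has no previous row -> 0
    y = step.shift(-1).fillna(0.0)    # y[i] = close[i+1] - close[i], last row -> 0
    return pd.DataFrame({"a1": a1, "y": y})

# Made-up closing prices for illustration only.
changes = make_change_columns([10.0, 12.0, 11.0, 11.0])
```

Note that `y` at row i equals `a1` at row i+1, which is why a single `diff` is enough for both columns.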

4. Data processing

Map some of the columns to the discrete values 0, 1, and -1.

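# Replace each change with its sign: 1 = up, 0 = flat, -1 = down.
# .loc avoids chained assignment, which may silently write to a copy.
data.loc[data['y']>0, 'y'] = 1
data.loc[data['y']==0, 'y'] = 0
data.loc[data['y']<0, 'y'] = -1

data.loc[data['a1']>0, 'a1'] = 1
data.loc[data['a1']==0, 'a1'] = 0
data.loc[data['a1']<0, 'a1'] = -1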
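
Mapping a change to 1, 0, or -1 is just taking its sign. As a hedged alternative to the three assignments per column, here is a small sketch with `np.sign`; `to_direction` is a hypothetical helper, and it assumes the input contains no NaN values.

```python
import numpy as np
import pandas as pd

def to_direction(changes):
    """Map each price change to 1 (up), 0 (flat), or -1 (down)."""
    # Assumption: no NaN in the input, otherwise the int cast would fail.
    return np.sign(pd.Series(changes, dtype="float64")).astype(int)

directions = to_direction([0.5, 0.0, -2.0])
```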

# RSI buckets: a2<30 -> 1, 30<=a2<=60 -> 2, a2>60 -> 3 (NaN warm-up rows stay NaN)
rsi = data['a2'].copy()  # bucket from the raw values so a label is never re-bucketed
data.loc[rsi<30, 'a2'] = 1
data.loc[(rsi>=30) & (rsi<=60), 'a2'] = 2
data.loc[rsi>60, 'a2'] = 3

# CCI buckets: a4<-50 -> 1, -50<=a4<=50 -> 2, a4>50 -> 3 (NaN warm-up rows stay NaN)
cci = data['a4'].copy()  # bucket from the raw values so a label is never re-bucketed
data.loc[cci<-50, 'a4'] = 1
data.loc[(cci>=-50) & (cci<=50), 'a4'] = 2
data.loc[cci>50, 'a4'] = 3
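
Both indicators are bucketed the same way, only the thresholds differ (30/60 for RSI, -50/50 for CCI). A minimal sketch of one shared function follows; `bucket` is a hypothetical name, and leaving NaN values as NaN is my own choice here, not something the code above guarantees.

```python
import numpy as np

def bucket(values, low, high):
    """Return 1 below low, 2 within [low, high], 3 above high, NaN for NaN input."""
    v = np.asarray(values, dtype="float64")
    conditions = [v < low, v > high, (v >= low) & (v <= high)]
    labels = [1.0, 3.0, 2.0]
    # Comparisons with NaN are all False, so NaN falls through to the default.
    return np.select(conditions, labels, default=np.nan)

# Made-up indicator values for illustration only.
rsi_labels = bucket([10.0, 30.0, 45.0, 60.0, 61.0, float("nan")], 30, 60)
cci_labels = bucket([-120.0, -50.0, 0.0, 50.0, 75.0], -50, 50)
```

Because all conditions are evaluated on the raw values before anything is written, a label such as 1 or 3 can never be mistaken for an indicator value and bucketed a second time.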

5. Bayesian modeling

Build the model using Bayes' formula.
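
For reference, this is the formula the code in this section appears to follow, written under the usual naive assumption that the three features are independent given the class. The symbols $a_1$, $a_2$, $a_4$ and $y$ stand for the columns of the same names.

```latex
P(y \mid a_1, a_2, a_4)
  = \frac{P(y)\, P(a_1 \mid y)\, P(a_2 \mid y)\, P(a_4 \mid y)}{P(a_1, a_2, a_4)}
  \propto P(y)\, P(a_1 \mid y)\, P(a_2 \mid y)\, P(a_4 \mid y)
```

The denominator is the same for every class, so comparing the numerators is enough to pick the most likely class.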

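y_ = [-1,0,1]
a1_ = [-1,0,1]
dic = {}  # dic['y<i>_a1<j>'] estimates P(a1=j | y=i); same key pattern for a2 and a4

# P(a1=j | y=i): share of the rows with y==i whose a1 equals j
for i in y_:
    for j in a1_:
        dic['y'+str(i)+'_a1'+str(j)] = sum(
                data['a1'][data['y']==i]==j)/sum(
                        data['y']==i)

a2_ = [1,2,3]
a4_ = [1,2,3]
# P(a2=j | y=i)
for i in y_:
    for j in a2_:
        dic['y'+str(i)+'_a2'+str(j)] = sum(
                data['a2'][data['y']==i]==j)/sum(
                        data['y']==i)
# P(a4=j | y=i)
for i in y_:
    for j in a4_:
        dic['y'+str(i)+'_a4'+str(j)] = sum(
                data['a4'][data['y']==i]==j)/sum(
                        data['y']==i)
print(data.iloc[19,:])  # inspect one sample row (position 19)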
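
The three nested loops all build the same kind of table, so they can be sketched with `pd.crosstab`. `conditional_table` is a hypothetical helper and the toy data is made up. One difference to be aware of: crosstab does not count rows where the feature is NaN, so its denominators can differ slightly from the loops above, which divide by all rows with y == i.

```python
import pandas as pd

def conditional_table(data, feature, target="y"):
    """Estimate P(feature = j | target = i); rows are i, columns are j."""
    # normalize="index" divides each row by its row total.
    return pd.crosstab(data[target], data[feature], normalize="index")

# Made-up labels for illustration only.
toy = pd.DataFrame({"y": [1, 1, 1, -1], "a1": [1, 1, -1, -1]})
table = conditional_table(toy, "a1")
p_up_given_up = table.loc[1, 1]  # estimate of P(a1=1 | y=1)
```

A lookup such as `table.loc[i, j]` then plays the role of `dic['y'+str(i)+'_a1'+str(j)]`.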

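# Class priors P(y): share of rows in each class
# (y_ is reused here and now holds the prior for y=-1, not the label list)
y_ = sum(data['y']==-1)/len(data['y'])
y0 = sum(data['y']==0)/len(data['y'])
y1 = sum(data['y']==1)/len(data['y'])

# Unnormalized posterior scores for a sample with a2=2, a4=1, a1=-1:
# P(y) * P(a2=2|y) * P(a4=1|y) * P(a1=-1|y); the class with the largest score wins
py_ = y_*dic['y-1_a22']*dic['y-1_a41']*dic['y-1_a1-1']
py0 = y0*dic['y0_a22']*dic['y0_a41']*dic['y0_a1-1']
py1 = y1*dic['y1_a22']*dic['y1_a41']*dic['y1_a1-1']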
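
The three scores are unnormalized, so they do not sum to 1. A minimal sketch of the last step, turning them into probabilities and picking a class, is given below. `normalize_scores` and `predict` are hypothetical helpers, and the numbers are made up rather than taken from the hs300 data.

```python
def normalize_scores(scores):
    """Scale unnormalized class scores so that they sum to 1."""
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all class scores are zero; no class can be chosen")
    return {label: score / total for label, score in scores.items()}

def predict(scores):
    """Return the class label with the largest score."""
    return max(scores, key=scores.get)

# Made-up scores standing in for py_, py0 and py1.
scores = {-1: 0.125, 0: 0.0, 1: 0.375}
posterior = normalize_scores(scores)
label = predict(scores)
```

A score of exactly zero arises whenever one conditional probability is zero, for instance when a feature value was never observed together with that class.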

Reposted from blog.csdn.net/qq_42871249/article/details/104274239