Learning Quantitative Finance

1. Fetch Moutai (600519) stock data and save it to a local CSV file

Signature:
ts.get_k_data(
    code=None,
    start='',
    end='',
    ktype='D',
    autype='qfq',
    index=False,
    retry_count=3,
    pause=0.001,
)
Docstring:
Get K-line (candlestick) data
---------
Parameters:
  code:string
              stock code, e.g. 600848
  start:string
              start date, format: YYYY-MM-DD; defaults to the listing date when empty
  end:string
              end date, format: YYYY-MM-DD; defaults to the most recent trading day when empty
  autype:string
              price adjustment type: qfq = forward-adjusted, hfq = backward-adjusted, None = unadjusted; default qfq
  ktype:string
              data frequency: D = daily, W = weekly, M = monthly, 5 = 5 min, 15 = 15 min, 30 = 30 min, 60 = 60 min; default D
  retry_count : int, default 3
             number of retries on network or similar errors
  pause : int, default 0
            seconds to pause between repeated requests, to avoid problems caused by requests spaced too closely
return
-------
  DataFrame
      date           trade date (index)
      open           opening price
      high           highest price
      close          closing price
      low            lowest price
      volume         trading volume
      amount         turnover value
      turnoverratio  turnover rate
      code           stock code
File:      /usr/local/python3.6/lib/python3.6/site-packages/tushare/stock/trading.py
Type:      function
The get_k_data interface of tushare
import tushare as ts

token = 'Your token'
ts.set_token(token)

pro = ts.pro_api()
df = pro.daily(ts_code='xxx')
The Pro interface of tushare
In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: import tushare as ts

In [4]: import matplotlib.pyplot as plt

In [5]: df = ts.get_k_data("600519",start='1988-01-01')

In [6]: df.to_csv('600519.csv')

In [7]: df = pd.read_csv('600519.csv',index_col='date',parse_dates=['date'])[['open','close','high','low']]

In [8]: df.head()
Out[8]: 
             open  close   high    low
date                                  
2001-08-27  5.392  5.554  5.902  5.132
2001-08-28  5.467  5.759  5.781  5.407
2001-08-29  5.777  5.684  5.781  5.640
2001-08-30  5.668  5.796  5.860  5.624
2001-08-31  5.804  5.782  5.877  5.749

In [9]: df.tail()                                                                                                                                                                                                                         
Out[9]: 
              open   close    high     low
date                                      
2019-04-15  931.00  907.00  939.00  907.00
2019-04-16  904.90  939.90  939.90  901.22
2019-04-17  938.00  952.00  955.51  925.00
2019-04-18  945.41  945.50  954.68  936.22
2019-04-19  943.96  952.56  960.95  931.31

In [10]: 
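
The save-and-reload step above can be tried without network access. The following is a minimal sketch, assuming a small stand-in frame built from the first three rows shown above; the in-memory buffer replaces the real 600519.csv file and is only an illustration.

```python
import io

import pandas as pd

# Stand-in for the downloaded data: three rows copied from the output above.
# In the real session this frame comes from ts.get_k_data("600519", ...).
raw = pd.DataFrame({
    "date": ["2001-08-27", "2001-08-28", "2001-08-29"],
    "open": [5.392, 5.467, 5.777],
    "close": [5.554, 5.759, 5.684],
    "high": [5.902, 5.781, 5.781],
    "low": [5.132, 5.407, 5.640],
    "code": ["600519"] * 3,
})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
raw.to_csv(buffer)
buffer.seek(0)

# index_col + parse_dates turn the 'date' column into a DatetimeIndex;
# the trailing column list keeps only the four price columns.
df = pd.read_csv(buffer, index_col="date", parse_dates=["date"])[
    ["open", "close", "high", "low"]
]
print(df)
```

Parsing the dates into the index at load time is what makes the later date-string slicing and resampling possible.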

2. List every date on which the stock closed at least 8% above its open

In [13]: df[ (df['close']-df['open'])/df['open']>=0.08 ]                                                                                                                                                                                  
Out[13]: 
               open    close     high      low
date                                          
2004-03-02    5.463    6.031    6.079    5.463
2005-06-08   11.383   12.555   12.639   11.383
2006-02-10   14.894   16.165   16.310   14.796
2006-05-29   25.024   27.520   27.520   25.024
2006-12-18   49.409   54.051   54.214   49.409
2007-06-11   67.313   73.569   73.569   66.913
2007-10-09   92.221   99.938  101.348   92.221
2007-12-14  125.269  135.970  137.154  124.029
2008-11-14   57.017   62.417   62.417   57.017
2009-03-04   73.123   79.024   79.961   72.756
2015-04-16  177.135  192.185  192.185  176.523
2015-07-09  201.180  219.085  221.182  197.901

In [14]: df[ (df['close']-df['open'])/df['open']>=0.1 ]                                                                                                                                                                                   
Out[14]: 
              open   close    high     low
date                                      
2004-03-02   5.463   6.031   6.079   5.463
2005-06-08  11.383  12.555  12.639  11.383


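# Mainland A-shares are limited to a 10% daily rise (relative to the previous close); no day in this data gains 11% or more from open to close
In [15]: df[ (df['close']-df['open'])/df['open']>=0.11 ]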
Out[15]: 
Empty DataFrame
Columns: [open, close, high, low]
Index: []

In [16]: 
In [17]: df[ (df['close']-df['open'])/df['open']>=0.08 ].index                                                                                                                                                                            
Out[17]: 
DatetimeIndex(['2004-03-02', '2005-06-08', '2006-02-10', '2006-05-29',
               '2006-12-18', '2007-06-11', '2007-10-09', '2007-12-14',
               '2008-11-14', '2009-03-04', '2015-04-16', '2015-07-09'],
              dtype='datetime64[ns]', name='date', freq=None)

In [18]: 
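
The filter above is a boolean mask: the comparison yields one True/False per row, and indexing with it keeps only the True rows. Here is a small sketch of the same idea, assuming a three-row frame whose prices are copied from the outputs above; the helper name `days_with_intraday_gain` is made up for illustration.

```python
import pandas as pd

# Three rows taken from the outputs above (only open/close are needed).
df = pd.DataFrame(
    {"open": [5.467, 5.463, 11.383], "close": [5.759, 6.031, 12.555]},
    index=pd.to_datetime(["2001-08-28", "2004-03-02", "2005-06-08"]),
)


def days_with_intraday_gain(frame, threshold):
    """Return the dates whose open-to-close gain is at least `threshold`."""
    gain = (frame["close"] - frame["open"]) / frame["open"]
    mask = gain >= threshold          # boolean Series, one value per row
    return frame[mask].index


result = days_with_intraday_gain(df, 0.08)
print(result)
```

Only 2004-03-02 and 2005-06-08 pass the 8% threshold; 2001-08-28 gained roughly 5% and is dropped.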

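3. List every date on which the stock opened 5% or more below the previous day's close

This requires comparing each row with the previous day's data. The clumsy way is to write a for loop by hand.

Fortunately, pandas provides the shift function, which makes it easy to shift the data in df by a given number of rows.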

In [30]: df.head()                                                                                                                                                                                                                        
Out[30]: 
             open  close   high    low
date                                  
2001-08-27  5.392  5.554  5.902  5.132
2001-08-28  5.467  5.759  5.781  5.407
2001-08-29  5.777  5.684  5.781  5.640
2001-08-30  5.668  5.796  5.860  5.624
2001-08-31  5.804  5.782  5.877  5.749

In [31]: df['close'].shift(1).head()                                                                                                                                                                                                      
Out[31]: 
date
2001-08-27      NaN
2001-08-28    5.554
2001-08-29    5.759
2001-08-30    5.684
2001-08-31    5.796
Name: close, dtype: float64

In [32]: df[ (df['open']-df['close'].shift(1))/df['close'].shift(1)<=-0.05 ]                                                                                                                                                              
Out[32]: 
               open    close     high      low
date                                          
2008-03-13  124.709  133.893  135.341  120.258
2012-11-22  150.981  158.104  158.228  150.471
2015-07-08  194.504  201.180  208.085  186.656
2018-10-11  635.010  644.990  668.940  635.010
2018-10-29  549.090  549.090  549.090  549.090
2018-10-30  510.000  524.000  543.000  509.020

In [33]: df[ (df['open']-df['close'].shift(1))/df['close'].shift(1)<=-0.05 ].index                                                                                                                                                        
Out[33]: 
DatetimeIndex(['2008-03-13', '2012-11-22', '2015-07-08', '2018-10-11',
               '2018-10-29', '2018-10-30'],
              dtype='datetime64[ns]', name='date', freq=None)

In [34]: 
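
To make the comparison between the hand-written loop and shift concrete, here is a minimal sketch of both, assuming a two-row frame built from the 2018-10-29 and 2018-10-30 rows above. The function names are made up for illustration.

```python
import pandas as pd

# Two consecutive trading days copied from the output above.
df = pd.DataFrame(
    {"open": [549.09, 510.00], "close": [549.09, 524.00]},
    index=pd.to_datetime(["2018-10-29", "2018-10-30"]),
)


def gap_down_days(frame, threshold=-0.05):
    """Vectorized version: shift(1) lines up each row with the previous close."""
    prev_close = frame["close"].shift(1)      # NaN on the first row
    gap = (frame["open"] - prev_close) / prev_close
    return frame[gap <= threshold].index      # NaN compares as False


def gap_down_days_loop(frame, threshold=-0.05):
    """The hand-written loop that shift replaces."""
    dates = []
    for i in range(1, len(frame)):
        prev_close = frame["close"].iloc[i - 1]
        if (frame["open"].iloc[i] - prev_close) / prev_close <= threshold:
            dates.append(frame.index[i])
    return pd.DatetimeIndex(dates)


print(gap_down_days(df))
print(gap_down_days_loop(df))
```

Both versions flag 2018-10-30, which opened about 7% below the previous close of 549.09. The first row never matches because it has no previous close to compare against.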

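4. Suppose that, starting from January 1, 2001, I buy 1 lot (1 lot = 100 shares) on the first trading day of every month and sell all my shares on the last trading day of every year.

What is my profit to date?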

In [72]: df = pd.read_csv('600519.csv',index_col='date',parse_dates=['date'])[['open','close','high','low']]                                                                                                                              

In [73]: df.head()                                                                                                                                                                                                                        
Out[73]: 
             open  close   high    low
date                                  
2001-08-27  5.392  5.554  5.902  5.132
2001-08-28  5.467  5.759  5.781  5.407
2001-08-29  5.777  5.684  5.781  5.640
2001-08-30  5.668  5.796  5.860  5.624
2001-08-31  5.804  5.782  5.877  5.749

In [74]: df.tail()                                                                                                                                                                                                                        
Out[74]: 
              open   close    high     low
date                                      
2019-04-15  931.00  907.00  939.00  907.00
2019-04-16  904.90  939.90  939.90  901.22
2019-04-17  938.00  952.00  955.51  925.00
2019-04-18  945.41  945.50  954.68  936.22
2019-04-19  943.96  952.56  960.95  931.31

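In [75]: price_last = df['open'].iloc[-1]     # remember the opening price of the most recent day

In [76]: df = df['2001-09':'2019-03']    # drop the partial months at the start (Aug 2001) and end (Apr 2019)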

In [77]: df.head()                                                                                                                                                                                                                        
Out[77]: 
             open  close   high    low
date                                  
2001-09-03  5.812  5.779  5.870  5.757
2001-09-04  5.782  5.852  5.949  5.762
2001-09-05  5.876  5.849  5.924  5.813
2001-09-06  5.835  5.734  5.854  5.704
2001-09-07  5.702  5.574  5.773  5.570

In [78]: df.tail()                                                                                                                                                                                                                        
Out[78]: 
              open   close    high     low
date                                      
2019-03-25  786.00  775.60  788.00  773.30
2019-03-26  780.00  773.00  785.94  764.10
2019-03-27  781.00  788.50  793.88  775.00
2019-03-28  793.43  806.80  814.48  785.68
2019-03-29  835.00  853.99  866.68  830.17

In [79]: 


# Resample by month: take the first trading day of each month
In [83]: df_monthly = df.resample("M").first()

In [84]: df.resample('A').last()                                                                                                                                                                                                          
Out[84]: 
               open    close     high      low
date                                          
2001-12-31    5.885    6.023    6.140    5.852
2002-12-31    4.473    4.448    4.504    4.447
2003-12-31    4.940    4.921    4.940    4.888
2004-12-31    9.325    9.310    9.579    9.168
2005-12-31   14.309   14.039   14.316   13.817
2006-12-31   53.482   54.946   57.617   52.900
2007-12-31  139.495  144.783  144.846  137.085
2008-12-31   68.502   68.818   69.318   68.058
2009-12-31  107.993  108.369  108.516  107.718
2010-12-31  117.103  118.469  118.701  116.620
2011-12-31  138.039  138.468  139.600  136.105
2012-12-31  155.208  152.087  156.292  150.144
2013-12-31   93.188   96.480   97.179   92.061
2014-12-31  157.642  161.056  161.379  157.132
2015-12-31  207.487  207.458  208.704  207.106
2016-12-31  317.239  324.563  325.670  317.239
2017-12-31  707.948  687.725  716.329  681.918
2018-12-31  563.300  590.010  596.400  560.000
2019-12-31  835.000  853.990  866.680  830.170

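# Resample by year: take the last trading day of each year
# Note that resample labels each row with the period end (e.g. 2019-12-31), not the actual trading date
# Today is 2019-04-21 and the April data was dropped above, so the calculation runs through March 2019
# 2019 is not over yet, so at the end I still hold the shares bought at the start of its first three months
In [85]: df.resample('A').last()[:-1]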
Out[85]: 
               open    close     high      low
date                                          
2001-12-31    5.885    6.023    6.140    5.852
2002-12-31    4.473    4.448    4.504    4.447
2003-12-31    4.940    4.921    4.940    4.888
2004-12-31    9.325    9.310    9.579    9.168
2005-12-31   14.309   14.039   14.316   13.817
2006-12-31   53.482   54.946   57.617   52.900
2007-12-31  139.495  144.783  144.846  137.085
2008-12-31   68.502   68.818   69.318   68.058
2009-12-31  107.993  108.369  108.516  107.718
2010-12-31  117.103  118.469  118.701  116.620
2011-12-31  138.039  138.468  139.600  136.105
2012-12-31  155.208  152.087  156.292  150.144
2013-12-31   93.188   96.480   97.179   92.061
2014-12-31  157.642  161.056  161.379  157.132
2015-12-31  207.487  207.458  208.704  207.106
2016-12-31  317.239  324.563  325.670  317.239
2017-12-31  707.948  687.725  716.329  681.918
2018-12-31  563.300  590.010  596.400  560.000

In [86]: df_yearly = df.resample('A').last()[:-1]
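
The "first trading day of each month" and "last trading day of each year" selections can be checked on a tiny frame. The sketch below swaps resample for an equivalent groupby on the index, which avoids depending on frequency aliases such as "M" and "A" that have changed between pandas versions. The dates and prices are hypothetical and chosen only to make the result easy to verify by eye.

```python
import pandas as pd

# Hypothetical trading days and opening prices (not real Moutai data).
idx = pd.to_datetime([
    "2018-11-01", "2018-11-02", "2018-12-03",
    "2018-12-28", "2019-01-02", "2019-01-03",
])
df = pd.DataFrame({"open": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]}, index=idx)

# First row of every calendar month (what resample(...).first() selects).
monthly_first = df.groupby(df.index.to_period("M")).first()

# Last row of every calendar year (what resample(...).last() selects).
yearly_last = df.groupby(df.index.year).last()

# Drop the final, unfinished year, mirroring the [:-1] used above.
completed_years = yearly_last.iloc[:-1]

print(monthly_first)
print(completed_years)
```

Here the monthly opens are 10, 12 and 14, and the only completed year, 2018, ends with an open of 13.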


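# Compute the profit to date from buying Moutai shares from September 2001 through March 2019
# It has two parts: the profit realized by selling out at each year end, and the market value of the shares still held in 2019
In [93]: cost = 0

In [94]: hold = 0

In [95]: for year in range(2001, 2020):
    ...:     cost += df_monthly.loc[str(year)]['open'].sum()*100   # buy 1 lot at each month's first open
    ...:     hold += len(df_monthly.loc[str(year)])*100
    ...:     if year != 2019:
    ...:         cost -= df_yearly.loc[str(year)]['open'].iloc[0]*hold   # sell everything at the year's last open
    ...:         hold = 0
    ...:

In [96]: cost -= hold*price_last   # value the remaining 2019 shares at the latest opening price

In [97]: print(-cost)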
454879.89999999985

In [98]: 
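
The bookkeeping in the loop above can be verified by hand on a small example. This is a sketch under stated assumptions, not the original session: the function name `simulate` and the prices are hypothetical, the last year present in the data is treated as unfinished (as 2019 is above), and every trade uses the day's opening price, as the loop above does.

```python
import pandas as pd


def simulate(frame, last_price, lot=100):
    """Buy one lot at each month's first open, sell all at each completed
    year's last open, and value any remaining shares at `last_price`."""
    cash = 0.0
    hold = 0
    years = sorted(set(frame.index.year))
    for year in years:
        year_df = frame[frame.index.year == year]
        # Opening price on the first trading day of each month in this year.
        first_opens = year_df.groupby(year_df.index.month)["open"].first()
        cash -= first_opens.sum() * lot
        hold += len(first_opens) * lot
        if year != years[-1]:                 # completed year: sell everything
            cash += year_df["open"].iloc[-1] * hold
            hold = 0
    return cash + hold * last_price


# Hypothetical trading days and opening prices (not real Moutai data).
idx = pd.to_datetime([
    "2018-11-01", "2018-11-02", "2018-12-03",
    "2018-12-28", "2019-01-02", "2019-01-03",
])
df = pd.DataFrame({"open": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]}, index=idx)

profit = simulate(df, last_price=16.0)
print(profit)
```

Worked by hand: in 2018 the purchases cost (10 + 12) × 100 = 2200 and the 200 shares sell for 13 × 200 = 2600, a gain of 400. In 2019 one lot costs 1400 and is valued at 16 × 100 = 1600, a gain of 200, so the total is 600.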


Reposted from www.cnblogs.com/standby/p/10745043.html