pandas_datareader
https://pandas-datareader.readthedocs.io/en/latest/
Data sources supported by pandas_datareader:
- Yahoo Finance
- Google Finance
- Enigma: the largest repository of public data. www.enigma.com
- Quandl: the world's most powerful data. www.quandl.com (unstable)
- St. Louis FED (FRED): Federal Reserve Bank of St. Louis. research.stlouisfed.org/docs/api/fred/
- Kenneth French's data library
- World Bank: open data. https://data.worldbank.org/
- OECD: OECD statistics database. http://stats.oecd.org/
- Eurostat: statistical office of the European Union. http://ec.europa.eu/eurostat/data/database
- Thrift Savings Plan
- Nasdaq Trader symbol definitions
Data from Google Finance
aapl = DataReader("AAPL", "google")
Price and volume data from IEX
tops = DataReader(["GS", "AAPL"], "iex-tops")
Top of book executions from IEX
gs = DataReader("GS", "iex-last")
Real-time depth of book data from IEX
gs = DataReader("GS", "iex-book")
Data from FRED
vix = DataReader("VIXCLS", "fred")
Data from Fama/French
- ff = DataReader("F-F_Research_Data_Factors", "famafrench")
- ff = DataReader("F-F_Research_Data_Factors_weekly", "famafrench")
- ff = DataReader("6_Portfolios_2x3", "famafrench")
- ff = DataReader("F-F_ST_Reversal_Factor", "famafrench")
Parameters
web.DataReader(name, data_source=None, start=None, end=None, retry_count=3, pause=0.1, session=None, access_key=None)
name : str or list of strs
the name of the dataset. Some data sources (google, fred) will
accept a list of names.
data_source: {str, None}
the data source ("google", "fred", "ff")
start : {datetime, None}
left boundary for range (defaults to 1/1/2010)
end : {datetime, None}
right boundary for range (defaults to today)
retry_count : {int, 3}
Number of times to retry query request.
pause : {numeric, 0.1}
Time, in seconds, to pause between consecutive queries of chunks. If
single value given for symbol, represents the pause between retries.
session : Session, default None
requests.sessions.Session instance to be used
access_key : (str, None)
Optional parameter to specify an API key for certain data sources.
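To make the retry_count and pause parameters more concrete, here is a minimal sketch of a retry loop with a fixed pause between attempts. It is not pandas_datareader's actual implementation: `fetch_with_retry` and `flaky_fetch` are hypothetical names, and the failing function only stands in for a network request.

```python
import time


def fetch_with_retry(fetch, retry_count=3, pause=0.1):
    """Call fetch(); on failure wait `pause` seconds and try again.

    Hypothetical helper: makes one initial attempt plus up to
    `retry_count` retries, then re-raises the last error.
    """
    last_error = None
    for attempt in range(retry_count + 1):
        try:
            return fetch()
        except IOError as exc:  # stand-in for a failed request
            last_error = exc
            if attempt < retry_count:
                time.sleep(pause)
    raise last_error


# Simulated data source that fails twice before succeeding.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("temporary failure")
    return "ok"


result = fetch_with_retry(flaky_fetch, retry_count=3, pause=0.01)
print(result, calls["n"])
```

With a larger pause the loop simply waits longer between attempts; the number of attempts is governed only by retry_count.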
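Ticker symbols of some large companies
{"Google": "GOOG", "Amazon": "AMZN", "Facebook": "FB", "Apple": "AAPL", "Alibaba": "BABA", "Tencent": "0700.hk"}
Of these interfaces, Yahoo Finance and Google Finance are the most commonly used.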
# Simple example
import datetime
from pandas_datareader import data as web

start = datetime.datetime(2019, 1, 2)  # or start = '2019-01-02'
end = datetime.date.today()
prices = web.DataReader('AAPL', 'yahoo', start, end)  # fetch AAPL price data
prices.head()  # show the first rows of the price data
Date | High | Low | Open | Close | Volume | Adj Close |
---|---|---|---|---|---|---|
2019-01-02 | 158.850006 | 154.229996 | 154.889999 | 157.919998 | 37039700.0 | 157.245605 |
2019-01-03 | 145.720001 | 142.000000 | 143.979996 | 142.190002 | 91244100.0 | 141.582779 |
2019-01-04 | 148.550003 | 143.800003 | 144.529999 | 148.259995 | 58607100.0 | 147.626846 |
2019-01-07 | 148.830002 | 145.899994 | 148.699997 | 147.929993 | 54777800.0 | 147.298264 |
2019-01-08 | 151.820007 | 148.520004 | 149.559998 | 150.750000 | 41025300.0 | 150.106216 |
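A slash-separated date string such as '2/1/2019' can be read month-first or day-first, so it helps to parse dates explicitly or to use ISO format. The sketch below uses only the standard library; the two format strings are the assumption being illustrated.

```python
from datetime import datetime

text = "2/1/2019"

# Month-first reading (the usual US convention): 1 February 2019.
month_first = datetime.strptime(text, "%m/%d/%Y")

# Day-first reading: 2 January 2019.
day_first = datetime.strptime(text, "%d/%m/%Y")

# ISO 8601 strings are unambiguous.
iso = datetime.strptime("2019-01-02", "%Y-%m-%d")

print(month_first.date(), day_first.date(), iso.date())
```

Only the day-first reading matches datetime.datetime(2019, 1, 2), which is why an ISO string is the safer way to write a start date.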
Stock price data
# Company name -> Yahoo Finance ticker symbol
gafataDict = {"Google": "GOOG", "Amazon": "AMZN", "Facebook": "FB", "Apple": "AAPL", "Alibaba": "BABA", "Tencent": "0700.hk"}
googDF = web.get_data_yahoo(gafataDict["Google"], start, end)
amznDF = web.get_data_yahoo(gafataDict["Amazon"], start, end)
fbDF = web.get_data_yahoo(gafataDict["Facebook"], start, end)
aaplDF = web.get_data_yahoo(gafataDict["Apple"], start, end)
babaDF = web.get_data_yahoo(gafataDict["Alibaba"], start, end)
txDF = web.get_data_yahoo(gafataDict["Tencent"], start, end)
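The six near-identical calls above could also be written as a loop over the dictionary. This is a sketch rather than the author's code: `fetch_all` is a hypothetical helper that takes the download function as an argument, and `fake_fetch` is a stand-in so the example runs without network access.

```python
def fetch_all(symbols, fetch, start, end):
    """Return {name: fetch(ticker, start, end)} for every entry in symbols."""
    return {name: fetch(ticker, start, end) for name, ticker in symbols.items()}


# Stand-in for web.get_data_yahoo so the sketch runs offline.
def fake_fetch(ticker, start, end):
    return f"{ticker} from {start} to {end}"


tickers = {"Google": "GOOG", "Amazon": "AMZN", "Apple": "AAPL"}
frames = fetch_all(tickers, fake_fetch, "2019-01-02", "2019-04-25")
print(frames["Google"])

# With pandas_datareader installed and a working data source, one would
# pass the real function instead:
#   frames = fetch_all(tickers, web.get_data_yahoo, start, end)
```

Keeping the results in a dictionary keyed by company name avoids one variable per ticker and makes it easy to add or remove companies.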
# Google data
googDF.describe()
Statistic | High | Low | Open | Close | Volume | Adj Close |
---|---|---|---|---|---|---|
count | 79.000000 | 79.000000 | 79.000000 | 79.000000 | 7.900000e+01 | 79.000000 |
mean | 1152.496645 | 1133.781999 | 1142.677534 | 1144.855383 | 1.372918e+06 | 1144.855383 |
std | 59.993213 | 62.881096 | 61.846521 | 61.231889 | 4.788185e+05 | 61.231889 |
min | 1051.530029 | 1014.070007 | 1016.570007 | 1016.059998 | 3.625480e+05 | 1016.059998 |
25% | 1101.875000 | 1086.275024 | 1092.474976 | 1095.035034 | 1.049650e+06 | 1095.035034 |
50% | 1146.849976 | 1118.500000 | 1126.729980 | 1140.989990 | 1.292600e+06 | 1140.989990 |
75% | 1202.559998 | 1189.489990 | 1197.140015 | 1198.049988 | 1.526700e+06 | 1198.049988 |
max | 1269.000000 | 1255.000000 | 1264.770020 | 1264.550049 | 3.552200e+06 | 1264.550049 |
# Amazon data
amznDF.describe()
Statistic | High | Low | Open | Close | Volume | Adj Close |
---|---|---|---|---|---|---|
count | 79.000000 | 79.000000 | 79.000000 | 79.000000 | 7.900000e+01 | 79.000000 |
mean | 1721.099626 | 1689.175254 | 1704.058855 | 1707.603102 | 4.761566e+06 | 1707.603102 |
std | 96.013861 | 102.261977 | 100.714869 | 98.461742 | 1.848340e+06 | 98.461742 |
min | 1538.000000 | 1460.930054 | 1465.199951 | 1500.280029 | 1.226319e+06 | 1500.280029 |
25% | 1648.914978 | 1614.594971 | 1629.090027 | 1633.155029 | 3.517700e+06 | 1633.155029 |
50% | 1683.479980 | 1660.979980 | 1670.750000 | 1671.729980 | 4.324800e+06 | 1671.729980 |
75% | 1810.720032 | 1774.994995 | 1794.630005 | 1790.515015 | 5.775800e+06 | 1790.515015 |
max | 1929.689941 | 1902.324951 | 1925.000000 | 1923.770020 | 1.150620e+07 | 1923.770020 |
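For readers unsure what each row of describe() means, the following sketch recomputes the same statistics with the standard library for the five AAPL closing prices shown earlier. It assumes pandas' defaults (sample standard deviation and linearly interpolated quartiles); `summarize` is a hypothetical helper, not part of pandas.

```python
import statistics


def summarize(values):
    """Mimic the rows of DataFrame.describe() for one numeric column."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": max(values),
    }


# Close column for 2019-01-02 through 2019-01-08.
closes = [157.919998, 142.190002, 148.259995, 147.929993, 150.750000]
for name, value in summarize(closes).items():
    print(f"{name:>5}  {value:.6f}")
```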
Dividend and split data
from pandas_datareader import data as web
start = "2010-01-01"
end = "2019-04-25"
actions = web.DataReader('AAPL', 'yahoo-actions', start, end)  # switch the data source to yahoo-actions
actions
Date | action | value |
---|---|---|
2019-02-08 | DIVIDEND | 0.730000 |
2018-11-08 | DIVIDEND | 0.730000 |
2018-08-10 | DIVIDEND | 0.730000 |
2018-05-11 | DIVIDEND | 0.730000 |
2018-02-09 | DIVIDEND | 0.630000 |
2017-11-10 | DIVIDEND | 0.630000 |
2017-08-10 | DIVIDEND | 0.630000 |
2017-05-11 | DIVIDEND | 0.630000 |
2017-02-09 | DIVIDEND | 0.570000 |
2016-11-03 | DIVIDEND | 0.570000 |
2016-08-04 | DIVIDEND | 0.570000 |
2016-05-05 | DIVIDEND | 0.570000 |
2016-02-04 | DIVIDEND | 0.520000 |
2015-11-05 | DIVIDEND | 0.520000 |
2015-08-06 | DIVIDEND | 0.520000 |
2015-05-07 | DIVIDEND | 0.520000 |
2015-02-05 | DIVIDEND | 0.470000 |
2014-11-06 | DIVIDEND | 0.470000 |
2014-08-07 | DIVIDEND | 0.470000 |
2014-06-09 | SPLIT | 7.000000 |
2014-05-08 | DIVIDEND | 0.470000 |
2014-02-06 | DIVIDEND | 0.435714 |
2013-11-06 | DIVIDEND | 0.435714 |
2013-08-08 | DIVIDEND | 0.435714 |
2013-05-09 | DIVIDEND | 0.435714 |
2013-02-07 | DIVIDEND | 0.378571 |
2012-11-07 | DIVIDEND | 0.378571 |
2012-08-09 | DIVIDEND | 0.378571 |
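Because the output mixes DIVIDEND and SPLIT rows, a common next step is to filter by action type before aggregating. The sketch below totals dividends per calendar year for a few rows copied from the table above. It uses plain tuples instead of a DataFrame so that it runs without any download, and `dividends_by_year` is a hypothetical helper.

```python
from collections import defaultdict

# (date, action, value) rows copied from the AAPL output above.
rows = [
    ("2018-11-08", "DIVIDEND", 0.73),
    ("2018-08-10", "DIVIDEND", 0.73),
    ("2018-05-11", "DIVIDEND", 0.73),
    ("2018-02-09", "DIVIDEND", 0.63),
    ("2014-11-06", "DIVIDEND", 0.47),
    ("2014-08-07", "DIVIDEND", 0.47),
    ("2014-06-09", "SPLIT", 7.0),
    ("2014-05-08", "DIVIDEND", 0.47),
    ("2014-02-06", "DIVIDEND", 0.435714),
]


def dividends_by_year(records):
    """Sum DIVIDEND values per calendar year, ignoring SPLIT rows."""
    totals = defaultdict(float)
    for date, action, value in records:
        if action == "DIVIDEND":
            totals[int(date[:4])] += value
    return dict(totals)


for year, total in sorted(dividends_by_year(rows).items()):
    print(year, round(total, 6))
```

Without the filter, the split ratio of 7 on 2014-06-09 would be added to the 2014 dividend total as if it were a payment.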
from pandas_datareader import data as web
actions = web.DataReader('GOOG', 'yahoo-actions', start, end)  # same yahoo-actions source, now for GOOG
actions
action | value |
---|---|
(Empty result: no dividend or split rows were returned for GOOG in this date range.)
Dividend data
start = "2010-01-01"
dividends = web.DataReader("GOOG", 'yahoo-dividends', start, end)  # dividends only, via the yahoo-dividends source
dividends
action | value |
---|---|
(Empty result: no dividend rows were returned for GOOG in this date range.)
from pandas_datareader import data
import matplotlib.pyplot as plt
import pandas as pd
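stock_code = input("For US stocks, enter the ticker directly, e.g. GOOG \nFor Hong Kong stocks, enter the code plus the market suffix, e.g. Tencent: 0700.hk \nFor mainland China stocks, append .ss (Shanghai) or .sz (Shenzhen) to the code\nEnter the ticker to look up: ")
start_date = "2000-11-01"
end_date = "2018-11-01"
stock_info = data.get_data_yahoo(stock_code, start_date, end_date)
# Show the first 5 rows
print(stock_info.head())
# print(stock_info.info())
# Save as an Excel file and a CSV file
stock_info.to_excel('%s.xlsx'%stock_code)
stock_info.to_csv('%s.csv'%stock_code)
# Plot the closing price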
plt.plot(stock_info['Close'], 'g')
plt.show()
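For US stocks, enter the ticker directly, e.g. GOOG
For Hong Kong stocks, enter the code plus the market suffix, e.g. Tencent: 0700.hk
For mainland China stocks, append .ss (Shanghai) or .sz (Shenzhen) to the code
Enter the ticker to look up: GOOG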
High Low Open Close Volume Adj Close
Date
2004-08-19 51.693783 47.669952 49.676899 49.845802 44994500.0 49.845802
2004-08-20 54.187561 49.925285 50.178635 53.805050 23005800.0 53.805050
2004-08-23 56.373344 54.172661 55.017166 54.346527 18393200.0 54.346527
2004-08-24 55.439419 51.450363 55.260582 52.096165 15361800.0 52.096165
2004-08-25 53.651051 51.604362 52.140873 52.657513 9257400.0 52.657513
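The suffix convention described in the prompt can be captured in a small helper. This is an illustrative sketch: `yahoo_symbol` and the market labels are hypothetical names, only the suffixes mentioned in this article are covered, and 600519 is just a sample Shanghai-listed code.

```python
# Market label -> Yahoo Finance ticker suffix, as described in the prompt.
SUFFIXES = {
    "US": "",      # US stocks: ticker only, e.g. GOOG
    "HK": ".hk",   # Hong Kong, e.g. 0700.hk
    "SH": ".ss",   # Shanghai Stock Exchange
    "SZ": ".sz",   # Shenzhen Stock Exchange
}


def yahoo_symbol(code, market="US"):
    """Build a Yahoo-style symbol from a stock code and a market label."""
    try:
        return code.strip() + SUFFIXES[market.upper()]
    except KeyError:
        raise ValueError(f"unknown market: {market!r}") from None


print(yahoo_symbol("GOOG"))
print(yahoo_symbol("0700", "hk"))
print(yahoo_symbol("600519", "SH"))
```

The resulting string can be passed as stock_code to the script above in place of typing the suffix by hand.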