爬虫实现股票分析(三)

现在是一个吃快餐的时代,先给出源码,还有执行结果吧,定位过程在下面

源码:

#导入需要使用到的模块
import urllib
import urllib.request
import re
import os
import pandas as pd
import pymysql

##########################将股票数据存入数据库###########################
filepath = 'D:\\data\\python\\test\\stock\\'#定义数据文件保存路径

#数据库名称和密码
name = 'root'
password = 'Test_123'  #替换为自己的账户名和密码
#建立本地数据库连接(需要先开启数据库服务)
db = pymysql.connect('localhost', name, password, charset='utf8')
cursor = db.cursor()
#创建数据库stockDataBase
sqlSentence1 = "create database if not exists stockDataBase"
cursor.execute(sqlSentence1)#选择使用当前数据库
sqlSentence2 = "use stockDataBase;"
cursor.execute(sqlSentence2)

#获取本地文件列表
fileList = os.listdir(filepath)
#依次对每个数据文件进行存储
for fileName in fileList:
    data = pd.read_csv(filepath+fileName, encoding="gbk")
   #创建数据表,如果数据表已经存在,会跳过继续执行下面的步骤print('创建数据表stock_%s'% fileName[0:6])
    sqlSentence3 = "create table if not exists stock_%s" % fileName[0:6] + "(日期 VARCHAR(20), 股票代码 VARCHAR(20),     name VARCHAR(40),\
                       收盘价 float,    最高价    float, 最低价 float, 开盘价 float, 前收盘 float, 涨跌额    float, \
                       涨跌幅 float, 换手率 float, 成交量 bigint, 成交金额 bigint, 总市值 bigint, 流通市值 bigint)"
    cursor.execute(sqlSentence3)
    #sqlSentence4 = "alter table stock_%S" % fileName[0:6] +  "default character set utf8;"
    sqlSentence4 = "alter table stock_%s" % fileName[0:6] + " change name name varchar(60) character set utf8;"
    cursor.execute(sqlSentence4)
    sqlSentence5 = "delete from stock_%s" % fileName[0:6] + ";"
    cursor.execute(sqlSentence5)

    #迭代读取表中每行数据,依次存储(整表存储还没尝试过)
    print('正在存储stock_%s'% fileName[0:6])
    length = len(data)
    print(length)
    for i in range(0, length):
        record = tuple(data.loc[i])
        #插入数据语句
        try:
            sqlSentence6 = "insert into stock_%s" % fileName[0:6] + "(日期, 股票代码, name, 收盘价, 最高价, 最低价, 开盘价, 前收盘, 涨跌额, 涨跌幅, 换手率, \
            成交量, 成交金额, 总市值, 流通市值) values ('%s',%s','%s',%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)" % record
            #获取的表中数据很乱,包含缺失值、Nnone、none等,插入数据库需要处理成空值
            sqlSentence6 = sqlSentence6.replace('nan','null').replace('None','null').replace('none','null')		
            cursor.execute(sqlSentence6)
        except:
            #如果以上插入过程出错,跳过这条数据记录,继续往下进行
            continue

#关闭游标,提交,关闭数据库连接
cursor.close()
db.commit()
db.close()

 

定位思路也很简单:
1.加入日志,打印出关键信息,这边就是最终sql语句
2.调试代码,加断点,这边没用到
期间要尝试修改,查手册,百度等,不要怕,比core问题好定位多了,试试就好了,实际上我还复习了tuple的用法。。。虽然问题不出在这儿,但是也加深一些知识

这边几张图只是给了点思路,在关键的地方输出日志,43 line 是否能按行读取出数据,50 line 查看最终的sql语句等,53 line看有没有进异常等,上面的几张图,说明日志也是慢慢添加的,慢慢合理的,多尝试即可

后面又加了日志:发现打印出来的最终sql语句是:
insert into stock_600000(日期, 股票代码, 名称, 收盘价, 最高价, 最低价, 开盘价, 前收盘, 涨跌额, 涨跌幅, 换手率,             成交量, 成交金额, 总市值, 流通市值) values ('2019-2-28',600000,'浦发银行',11.74,11.92,11.68,11.83,11.83,-0.09,-0.7608,0.1402,39393414,464139463,345000000000.0,330000000000.0)

然后查询表结构
+----------+-------------+------+-----+---------+-------+
| Field    | Type        | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| 日期     | varchar(20) | YES  |     | NULL    |       |
| 股票代码 | float       | YES  |     | NULL    |       |
| 名称     | varchar(60) | YES  |     | NULL    |       |
| 收盘价   | float       | YES  |     | NULL    |       |
| 最高价   | float       | YES  |     | NULL    |       |
| 最低价   | float       | YES  |     | NULL    |       |
| 开盘价   | float       | YES  |     | NULL    |       |
| 前收盘   | float       | YES  |     | NULL    |       |
| 涨跌额   | float       | YES  |     | NULL    |       |
| 涨跌幅   | float       | YES  |     | NULL    |       |
| 换手率   | float       | YES  |     | NULL    |       |
| 成交量   | bigint(20)  | YES  |     | NULL    |       |
| 成交金额 | bigint(20)  | YES  |     | NULL    |       |
| 总市值   | bigint(20)  | YES  |     | NULL    |       |
| 流通市值 | bigint(20)  | YES  |     | NULL    |       |
+----------+-------------+------+-----+---------+-------+
15 rows in set (0.01 sec)

发现是对的,单独执行sql语句,就有错误提示了

ERROR 1366 (HY000): Incorrect string value: '\xC6\xD6\xB7\xA2\xD2\xF8...' for column 'name' at row 1


查出来编码格式不对:
mysql> show create table stock_600000;
+--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table        | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| stock_600000 | CREATE TABLE `stock_600000` (
  `日期` varchar(20) DEFAULT NULL,
  `股票代码` float DEFAULT NULL,
  `名称` varchar(60) DEFAULT NULL,
  `收盘价` float DEFAULT NULL,
  `最高价` float DEFAULT NULL,
  `最低价` float DEFAULT NULL,
  `开盘价` float DEFAULT NULL,
  `前收盘` float DEFAULT NULL,
  `涨跌额` float DEFAULT NULL,
  `涨跌幅` float DEFAULT NULL,
  `换手率` float DEFAULT NULL,
  `成交量` bigint(20) DEFAULT NULL,
  `成交金额` bigint(20) DEFAULT NULL,
  `总市值` bigint(20) DEFAULT NULL,
  `流通市值` bigint(20) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)


mysql> alter table stock_600000 default character set utf8;
改了表的编码格式:


mysql> show create table stock_600000;
+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table        | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| stock_600000 | CREATE TABLE `stock_600000` (
  `日期` varchar(20) CHARACTER SET latin1 DEFAULT NULL,
  `股票代码` float DEFAULT NULL,
  `名称` varchar(60) CHARACTER SET latin1 DEFAULT NULL,
  `收盘价` float DEFAULT NULL,
  `最高价` float DEFAULT NULL,
  `最低价` float DEFAULT NULL,
  `开盘价` float DEFAULT NULL,
  `前收盘` float DEFAULT NULL,
  `涨跌额` float DEFAULT NULL,
  `涨跌幅` float DEFAULT NULL,
  `换手率` float DEFAULT NULL,
  `成交量` bigint(20) DEFAULT NULL,
  `成交金额` bigint(20) DEFAULT NULL,
  `总市值` bigint(20) DEFAULT NULL,
  `流通市值` bigint(20) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

执行还是报错,再查看具体的字段的编码格式

mysql> SHOW FULL COLUMNS FROM stock_600000;
+-----------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
| Field     | Type        | Collation         | Null | Key | Default | Extra | Privileges                      | Comment |
+-----------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
| dateday   | varchar(20) | latin1_swedish_ci | YES  |     | NULL    |       | select,insert,update,references |         |
| stockcode | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| name      | varchar(60) | latin1_swedish_ci | YES  |     | NULL    |       | select,insert,update,references |         |
| price1    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price2    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price3    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price4    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price5    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price6    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price7    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price8    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price9    | bigint(20)  | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price10   | bigint(20)  | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price11   | bigint(20)  | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price12   | bigint(20)  | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
+-----------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
15 rows in set (0.01 sec)
表列名字被我换成中文了,为了解决上面的问题,然而并没用,主要 是name的格式:latin1_swedish_ci,插入的字段值有中文编码格式与其不一致,因此改了

mysql> alter table stock_600000 change name name varchar(60) character set utf8;


mysql> show full columns from stock_600000
    -> ;
+-----------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
| Field     | Type        | Collation         | Null | Key | Default | Extra | Privileges                      | Comment |
+-----------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
| dateday   | varchar(20) | latin1_swedish_ci | YES  |     | NULL    |       | select,insert,update,references |         |
| stockcode | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| name      | varchar(60) | utf8_general_ci   | YES  |     | NULL    |       | select,insert,update,references |         |
| price1    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price2    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price3    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price4    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price5    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price6    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price7    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price8    | float       | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price9    | bigint(20)  | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price10   | bigint(20)  | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price11   | bigint(20)  | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
| price12   | bigint(20)  | NULL              | YES  |     | NULL    |       | select,insert,update,references |         |
+-----------+-------------+-------------------+------+-----+---------+-------+---------------------------------+---------+
15 rows in set (0.02 sec)

再插入就成功了,最终效果见上面

写代码的小熊猫~ :)

猜你喜欢

转载自www.cnblogs.com/effortscodepanda/p/10463476.html
今日推荐