pandas: Saving a DataFrame to a File (df.to_csv, df.to_json, df.to_html, df.to_excel)

Copyright notice: this is an original article by the blog author. Discussion is welcome so we can improve together. https://blog.csdn.net/tz_zs/article/details/81137998

____tz_zs

Saving and reading DataFrame data

df.to_csv writes to a CSV file

pd.read_csv reads a CSV file

df.to_json writes to a JSON file

pd.read_json reads a JSON file

df.to_html writes to an HTML file

pd.read_html reads an HTML file

df.to_excel writes to an Excel file

pandas.DataFrame.to_csv

Writes a DataFrame to a CSV file

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html

DataFrame.to_csv(path_or_buf=None, sep=',', na_rep='', float_format=None, columns=None, header=True, index=True,
                 index_label=None, mode='w', encoding=None, compression=None, quoting=None, quotechar='"',
                 line_terminator='\n', chunksize=None, tupleize_cols=None, date_format=None, doublequote=True,
                 escapechar=None, decimal='.')

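Parameters:

  • path_or_buf : file path or file object; if not specified, the result is returned as a CSV string
  • sep : field separator for the output file, default ","
  • na_rep : string used in place of missing data, default ''
  • float_format : format string for floating-point numbers (for example, the number of decimal places)
  • columns : the columns to write
  • header : whether to write the column names, default True (written)
  • index : whether to write the index, default True (written)
  • index_label : column label for the index column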
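The parameters above can be tried out without touching the disk, because to_csv returns the CSV text when no path is given. The following is a minimal sketch, not part of the original article: the frame, its values, and the names csv_text and bare_text are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (values are made up for this sketch).
df = pd.DataFrame(
    {"a": [1.0, np.nan, 3.14159], "b": [10, 20, 30], "c": ["x", "y", "z"]},
    index=pd.date_range(start="20180701", periods=3),
)

# No path is given, so to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(
    sep=";",              # field separator
    na_rep="NULL",        # placeholder for missing values
    float_format="%.2f",  # two decimal places for floats
    columns=["a", "b"],   # write only these columns
    index_label="date",   # header for the index column
)
print(csv_text)

# header=False and index=False drop the column names and the index.
bare_text = df.to_csv(header=False, index=False)
print(bare_text)
```

Passing a file path as the first argument writes the same text to that file instead of returning it.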

# -*- coding:utf-8 -*-

"""
@author:    tz_zs
"""

import numpy as np
import pandas as pd

list_l = [[11, 12, 13, 14, 15], [21, 22, 23, 24, 25], [31, 32, 33, 34, 35]]
date_range = pd.date_range(start="20180701", periods=3)
df = pd.DataFrame(list_l, index=date_range,
                  columns=['a', 'b', 'c', 'd', 'e'])
print(df)
"""
             a   b   c   d   e
2018-07-01  11  12  13  14  15
2018-07-02  21  22  23  24  25
2018-07-03  31  32  33  34  35
"""

df.to_csv("tzzs_data.csv")
"""
Contents of the CSV file:
,a,b,c,d,e
2018-07-01,11,12,13,14,15
2018-07-02,21,22,23,24,25
2018-07-03,31,32,33,34,35
"""
read_csv = pd.read_csv("tzzs_data.csv")
print(read_csv)
"""
   Unnamed: 0   a   b   c   d   e
0  2018-07-01  11  12  13  14  15
1  2018-07-02  21  22  23  24  25
2  2018-07-03  31  32  33  34  35
"""

df.to_csv("tzzs_data2.csv", index_label="index_label")
"""
Contents of the CSV file:
index_label,a,b,c,d,e
2018-07-01,11,12,13,14,15
2018-07-02,21,22,23,24,25
2018-07-03,31,32,33,34,35
"""

read_csv2 = pd.read_csv("tzzs_data2.csv")
print(read_csv2)
"""
  index_label   a   b   c   d   e
0  2018-07-01  11  12  13  14  15
1  2018-07-02  21  22  23  24  25
2  2018-07-03  31  32  33  34  35
"""
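In both read-backs above, the saved index comes back as an ordinary column ("Unnamed: 0" or "index_label") and a fresh 0..n-1 index is created. One way to restore the original date index is to pass index_col and parse_dates to pd.read_csv. This is a sketch under the assumption that the first column holds the index; it uses an in-memory buffer instead of a file, and the names buffer and restored are illustrative.

```python
import io

import pandas as pd

df = pd.DataFrame(
    [[11, 12, 13], [21, 22, 23], [31, 32, 33]],
    index=pd.date_range(start="20180701", periods=3),
    columns=["a", "b", "c"],
)

# Write to an in-memory text buffer; a file path works the same way.
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)

# index_col=0 uses the first column as the index,
# parse_dates=True converts that index back to datetimes.
restored = pd.read_csv(buffer, index_col=0, parse_dates=True)
print(restored)
print(type(restored.index))
```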

pandas.DataFrame.to_json

Writes a DataFrame to a JSON file

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html

DataFrame.to_json(path_or_buf=None, orient=None, date_format=None, double_precision=10, force_ascii=True,
                  date_unit='ms', default_handler=None, lines=False, compression=None, index=True)

Parameters:

  • path_or_buf : file path; if not specified, the result is returned as a JSON string.
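As a rough illustration of path_or_buf and two of the other parameters in the signature (orient and date_format), the sketch below calls to_json without a path so the JSON comes back as a string. The frame and the variable names are assumptions made for this example.

```python
import json

import pandas as pd

df = pd.DataFrame(
    [[11, 12], [21, 22]],
    index=pd.date_range(start="20180701", periods=2),
    columns=["a", "b"],
)

# No path is given, so the JSON is returned as a string.
# For a DataFrame the default layout is column -> {index -> value}.
json_text = df.to_json()
as_dict = json.loads(json_text)
print(as_dict)

# orient="records" produces a list of row objects and drops the index.
records_text = df.to_json(orient="records")
print(records_text)

# date_format="iso" writes dates as ISO 8601 strings instead of epoch numbers.
iso_text = df.to_json(orient="index", date_format="iso")
print(iso_text)
```

Which orient to pick depends on the consumer: "records" is convenient for row-oriented APIs, while the default keeps the index and so survives a round trip through pd.read_json.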

Code:

df.to_json("tzzs_data.json")

read_json = pd.read_json("tzzs_data.json")
print(read_json)
"""
             a   b   c   d   e
2018-07-01  11  12  13  14  15
2018-07-02  21  22  23  24  25
2018-07-03  31  32  33  34  35
"""

The JSON file:

{
    "a": {
        "1530403200000": 11,
        "1530489600000": 21,
        "1530576000000": 31
    },
    "b": {
        "1530403200000": 12,
        "1530489600000": 22,
        "1530576000000": 32
    },
    "c": {
        "1530403200000": 13,
        "1530489600000": 23,
        "1530576000000": 33
    },
    "d": {
        "1530403200000": 14,
        "1530489600000": 24,
        "1530576000000": 34
    },
    "e": {
        "1530403200000": 15,
        "1530489600000": 25,
        "1530576000000": 35
    }
}
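The keys in this file are the index dates written as milliseconds since the Unix epoch, which matches the date_unit='ms' default in the signature. A short sketch of decoding one key by hand (pd.read_json does this conversion automatically, as the output above shows); the names key and ts are illustrative:

```python
import pandas as pd

key = "1530403200000"  # first key in the JSON file above

# Interpret the number as milliseconds since 1970-01-01 00:00:00 UTC.
ts = pd.to_datetime(int(key), unit="ms")
print(ts)  # 2018-07-01 00:00:00
```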

pandas.DataFrame.to_html

Writes a DataFrame to an HTML file

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_html.html

DataFrame.to_html(buf=None, columns=None, col_space=None, header=True, index=True, na_rep='NaN', formatters=None,
                  float_format=None, sparsify=None, index_names=True, justify=None, bold_rows=True, classes=None,
                  escape=True, max_rows=None, max_cols=None, show_dimensions=False, notebook=False, decimal='.',
                  border=None, table_id=None)
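A few of these parameters are easiest to understand from the markup they produce. The sketch below calls to_html without buf, so the HTML is returned as a string. The frame, the class name "report" and the id "prices" are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [11, 21], "b": [12.5, None]}, index=["r1", "r2"])

# buf is not given, so the HTML is returned as a string.
html_text = df.to_html(
    index=False,                   # leave out the index column
    na_rep="-",                    # text shown for missing values
    float_format="{:.1f}".format,  # one-argument function that formats floats
    classes="report",              # extra CSS class, added next to "dataframe"
    table_id="prices",             # id attribute of the <table> tag
)
print(html_text)
```

The class and id make it possible to target the table from a stylesheet when the fragment is embedded in a larger page.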

Code:

df.to_html("tzzs_data.html")

read_html = pd.read_html("tzzs_data.html")
print(read_html)
"""
[   Unnamed: 0   a   b   c   d   e
0  2018-07-01  11  12  13  14  15
1  2018-07-02  21  22  23  24  25
2  2018-07-03  31  32  33  34  35]
"""

# read_html returns a list of DataFrames, one per table; take the first one
print(read_html[0])
"""
   Unnamed: 0   a   b   c   d   e
0  2018-07-01  11  12  13  14  15
1  2018-07-02  21  22  23  24  25
2  2018-07-03  31  32  33  34  35
"""

The HTML file:

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>a</th>
      <th>b</th>
      <th>c</th>
      <th>d</th>
      <th>e</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>2018-07-01</th>
      <td>11</td>
      <td>12</td>
      <td>13</td>
      <td>14</td>
      <td>15</td>
    </tr>
    <tr>
      <th>2018-07-02</th>
      <td>21</td>
      <td>22</td>
      <td>23</td>
      <td>24</td>
      <td>25</td>
    </tr>
    <tr>
      <th>2018-07-03</th>
      <td>31</td>
      <td>32</td>
      <td>33</td>
      <td>34</td>
      <td>35</td>
    </tr>
  </tbody>
</table>

[Screenshot: the HTML file opened in a browser]

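df.to_html generates only an HTML table element, so you can write other tags before and after it to build a richer page. Note: if the data contains Chinese characters, declare the character encoding in the head.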

Reference: Pandas Dataframes to_html: Highlighting table rows

# -*- coding: utf-8 -*-
"""
@author: tz_zs
"""
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

index = ["2018-07-01", "2018-07-02", "2018-07-03", "2018-07-04"]
df = pd.DataFrame(index=index)
df["一"] = [11, 12, 13, 14]
df["二"] = [21, 22, 23, 24]
print(df)
"""
             一   二
2018-07-01  11  21
2018-07-02  12  22
2018-07-03  13  23
2018-07-04  14  24
"""

axes_subplot = df.plot()
# print(type(axes_subplot)) #<class 'matplotlib.axes._subplots.AxesSubplot'>
plt.xlabel("time")
plt.ylabel("num")
plt.legend(loc="best")
plt.grid(True)
plt.savefig("test.png")

HEADER = '''
    <html>
        <head>
            <meta charset="UTF-8">
        </head>
        <body>
    '''
FOOTER = '''
        <img src="%s" alt="" width="1200" height="600">
        </body>
    </html>
    ''' % ("test.png")
# Write the file as UTF-8 so it matches the charset declared in HEADER
with open("test.html", 'w', encoding="utf-8") as f:
    f.write(HEADER)
    f.write(df.to_html(classes='df'))
    f.write(FOOTER)

pandas.DataFrame.to_excel

Writes a DataFrame to an Excel file

pandas.DataFrame.to_excel

DataFrame.to_excel(excel_writer, sheet_name='Sheet1', na_rep='', float_format=None, columns=None, 
                   header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None,
                   merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None)

#!/usr/bin/python2.7
# -*- coding:utf-8 -*-

"""
@author:    tz_zs
"""

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

list_l = [[1, 3, 3, 5, 4], [11, 7, 15, 13, 9], [4, 2, 7, 9, 3], [15, 11, 12, 6, 11]]
index = ["2018-07-01", "2018-07-02", "2018-07-03", "2018-07-04"]
df = pd.DataFrame(list_l, index=index, columns=['a', 'b', 'c', 'd', 'e'])
print(df)
"""
             a   b   c   d   e
2018-07-01   1   3   3   5   4
2018-07-02  11   7  15  13   9
2018-07-03   4   2   7   9   3
2018-07-04  15  11  12   6  11
"""

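# Writing the legacy .xls format relies on the xlwt engine, which pandas 2.0 removed;
# on newer versions use an .xlsx path instead (requires the openpyxl package).
df.to_excel("test.xls")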

Other articles:

http://www.dcharm.com/?p=584

https://blog.csdn.net/sinat_29957455/article/details/79059436

https://www.cnblogs.com/pengsixiong/p/5050833.html
