This article is reproduced from the article "Pandas Quick Reference Manual (Chinese version)". I have added descriptions of my own use cases, as well as explanations of some DataFrame structures.
Pandas is a very important package that makes it easy to process structured data; for example, I like to use it to handle CSV and Excel documents. If you want to learn Pandas, take a look at these two recommended sites:
- Official website: Python Data Analysis Library
- Ten-minute introduction to Pandas: 10 Minutes to pandas
Import Data:
- pd.read_csv(filename): import data from a CSV file
- pd.read_table(filename): import data from a delimited text file (tab-delimited by default)
- pd.read_excel(filename): import data from an Excel file
- pd.read_sql(query, connection_object): import data from a SQL table or database
- pd.read_json(json_string): import data from a JSON-formatted string
- pd.read_html(url): parse a URL, HTML file, or string and extract the tables it contains
- pd.read_clipboard(): read the contents of your clipboard and pass them to read_table()
- pd.DataFrame(dict): import data from a dictionary object; each key is a column name and each value is that column's data
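As a quick illustration of a few of the functions listed above, here is a minimal sketch. The column names and values are made up for the example, and io.StringIO stands in for a real file on disk so that the snippet runs without any external files.

```python
import io
import pandas as pd

# Hypothetical sample data: keys become column names, values become column data.
df_from_dict = pd.DataFrame({"name": ["Ann", "Bob"], "score": [90, 85]})

# io.StringIO stands in for a file path; read_csv splits on commas by default.
csv_text = "name,score\nAnn,90\nBob,85\n"
df_from_csv = pd.read_csv(io.StringIO(csv_text))

# read_table splits on tabs by default, so the tab-separated version
# of the same data parses into the same two columns.
tsv_text = "name\tscore\nAnn\t90\nBob\t85\n"
df_from_tsv = pd.read_table(io.StringIO(tsv_text))

print(df_from_dict)
print(df_from_csv)
print(df_from_tsv)
```

In real use you would pass a file name such as "test.csv" instead of the io.StringIO object.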
CSV and Excel:
The difference between Excel and CSV files is that an Excel file has sheets and is a workbook format rather than plain text, whereas a CSV file is plain text: if you open it in a text editor, you will see that the cells in each row are separated by a delimiter, a comma "," by default (tab-delimited text files use "\t" instead). Note that not every file follows this convention cleanly. For example, if a cell contains a long sentence that includes a comma and the cell is not quoted, the DataFrame will be built incorrectly. So before importing such a file into Python, it is worth checking it in Excel first.
import pandas as pd

def xlsx_to_csv_pd():
    # Example: read an Excel file and write it out as CSV
    data_xls = pd.read_excel("test.xlsx", index_col=0)
    data_xls.to_csv("test.csv", encoding="utf-8")
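The comma problem mentioned above can be reproduced with a small sketch. The sample text here is made up for illustration, and io.StringIO again stands in for a file. When the cell containing a comma is quoted, pandas reads it as one value; when it is not quoted, the row has more fields than the header and, with default settings, pd.read_csv is expected to raise a parser error.

```python
import io
import pandas as pd

# Hypothetical sample data. In good_csv the cell with a comma is quoted;
# in bad_csv it is not, so that row appears to have three fields.
good_csv = 'id,comment\n1,fine\n2,"hello, world"\n'
bad_csv = "id,comment\n1,fine\n2,hello, world\n"

# The quoted version parses into two columns as intended.
good = pd.read_csv(io.StringIO(good_csv))
print(good)

# The unquoted version has an inconsistent number of fields per row.
try:
    pd.read_csv(io.StringIO(bad_csv))
    bad_error = None
except pd.errors.ParserError as exc:
    bad_error = exc
    print("Could not parse:", exc)
```

If you cannot fix the source file, the on_bad_lines parameter of pd.read_csv (for example on_bad_lines="skip") can be used to drop such rows instead of failing, at the cost of losing that data.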