Parsing CSV Files in Python

A typical dataset, stocks.csv:

A stock dataset is ordinary tabular data, with columns such as ticker symbol, price, date, time, price change, and volume. Like any table, it has a head (the header row) and a body (the data rows).

The first trick: simple reading
Let's start with the simplest approach. Open the file and pass the file handle f to csv.reader(), which returns a reader object, which is really just an iterator. If you look at the source, csv.reader() accepts any iterable of lines (such as a file object) and returns an iterator that yields one parsed row per line.

First open the csv file, then use csv.reader to build an iterator f_csv;
then call next(f_csv) once to pull off the header row, i.e. the head of the table;
finally loop over f_csv to print the remaining rows, i.e. the body of the table.
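The original code screenshot is not available, so here is a minimal sketch of those steps. The sample data and column names are assumed for illustration; in real code you would pass an open file handle instead of a StringIO.

```python
import csv
import io

# Stand-in for stocks.csv; the columns and values are assumed sample data.
CSV_TEXT = """\
Symbol,Price,Date,Time,Change,Volume
AA,39.48,6/11/2007,9:36am,-0.18,181800
AIG,71.38,6/11/2007,9:36am,-0.15,195500
AXP,62.58,6/11/2007,9:36am,-0.46,935000
"""

f = io.StringIO(CSV_TEXT)   # in real code: open('stocks.csv', newline='')
f_csv = csv.reader(f)       # csv.reader returns an iterator over parsed rows
headers = next(f_csv)       # first row is the table head
print(headers)
for row in f_csv:           # the remaining rows are the table body
    print(row)
```

Note that every field comes back as a string; later tricks deal with type conversion.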
The second trick: using namedtuple
The first trick is the simplest. Now let's wrap each generated row in a namedtuple.

namedtuple is a very handy class from the collections module, which is a treasure chest full of useful tools.
Here we call next(f_csv) to get the table header and use it to define the Row type;
then we loop over the remaining rows, feeding each row of the table into Row to build a namedtuple row_info.
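A minimal sketch of this namedtuple approach (the sample data and column names are assumptions, since the original screenshot is unavailable):

```python
import csv
import io
from collections import namedtuple

# Assumed sample data standing in for stocks.csv.
CSV_TEXT = """\
Symbol,Price,Date,Time,Change,Volume
AA,39.48,6/11/2007,9:36am,-0.18,181800
AIG,71.38,6/11/2007,9:36am,-0.15,195500
"""

f_csv = csv.reader(io.StringIO(CSV_TEXT))
headings = next(f_csv)             # header row supplies the field names
Row = namedtuple('Row', headings)  # build the Row type from the header
for r in f_csv:
    row_info = Row(*r)             # feed each data row into Row
    print(row_info.Symbol, row_info.Price)  # attribute-style access
```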
The advantage is that you can access the fields of row_info just like class attributes, for example row_info.price.

The third trick: tuple type conversion
If we know exactly what type each column of the csv holds, we can use a fixed set of converters to cast the data as we read it.
The steps are much like before; only the cleaning of each row differs. A clever zip pairs the converters with the fields, and convert(value) casts every value in the row. A neat trick!
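A sketch of the zip-based conversion, again with assumed sample columns:

```python
import csv
import io

# Assumed sample data standing in for stocks.csv.
CSV_TEXT = """\
Symbol,Price,Change,Volume
AA,39.48,-0.18,181800
AIG,71.38,-0.15,195500
"""

# One converter per column; str leaves the ticker symbol unchanged.
col_types = [str, float, float, int]

f_csv = csv.reader(io.StringIO(CSV_TEXT))
headers = next(f_csv)
for row in f_csv:
    # zip pairs each converter with its matching field,
    # then convert(value) casts every value in the row.
    row = [convert(value) for convert, value in zip(col_types, row)]
    print(row)
```

Each printed row now holds real floats and ints instead of strings.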

The fourth trick: DictReader

The namedtuple above is really a kind of mapping. Is there a way to read the csv content directly as a mapping and get a dictionary for each row? Very simple: the csv module has a built-in DictReader() that reads each row as a dictionary keyed by the column names.
If you are interested, take a look at the source of DictReader(): it is an iterator class whose __next__ builds each row essentially as OrderedDict(zip(self.fieldnames, row)) (since Python 3.8 it returns a plain dict, which preserves insertion order anyway).
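A sketch of the DictReader approach (sample data assumed):

```python
import csv
import io

# Assumed sample data standing in for stocks.csv.
CSV_TEXT = """\
Symbol,Price,Volume
AA,39.48,181800
AIG,71.38,195500
"""

# DictReader takes the first row as field names automatically.
f_csv = csv.DictReader(io.StringIO(CSV_TEXT))
for row in f_csv:
    print(row['Symbol'], row['Price'])  # each row is a dict keyed by column name
```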

The fifth trick: use dictionary conversion
Everything read from a csv file comes out as a string, so if we need to clean the data we must cast some fields to concrete types. Dictionary conversion is a very neat way to do that: the original price and volume columns should end up as floating-point and integer data, so we use a small table of converters to update just those keys.

First declare a table of custom type converters, field_types;
then build an iterable of (key, conversion(row[key])) pairs;
finally update those keys in the dictionary, so that e.g. the content of row['price'] is replaced by its converted value.
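These steps can be sketched as follows (sample data and column names assumed; only the listed keys are converted):

```python
import csv
import io

# Assumed sample data standing in for stocks.csv.
CSV_TEXT = """\
Symbol,Price,Date,Volume
AA,39.48,6/11/2007,181800
AIG,71.38,6/11/2007,195500
"""

# Converter table: only the columns that need casting appear here.
field_types = [('Price', float), ('Volume', int)]

f_csv = csv.DictReader(io.StringIO(CSV_TEXT))
for row in f_csv:
    # Build (key, converted value) pairs and update the dict in place;
    # untouched keys like 'Symbol' and 'Date' stay as strings.
    row.update((key, conversion(row[key])) for key, conversion in field_types)
    print(row)
```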
Reference link:

5 ways to read CSV files with Python https://mp.weixin.qq.com/s/cs4buSULva1FgCctp_fB6g
