When reading a large file that you have to traverse line by line (usually the case for unstructured data; structured data can be loaded directly into a DataFrame), use the pattern below. Opening the file with "r" instead of "rb" can be several times slower, since text mode decodes every line; in binary mode each line is a bytes object.
with open(filename, "rb") as f:
    for fLine in f:
        pass
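As a minimal sketch of how the loop body might be filled in, the function below counts the lines that contain a given byte string. The names count_matching and needle are hypothetical, chosen only for illustration; the point is that the comparison stays in bytes, so no line is ever decoded.

```python
import os
import tempfile


def count_matching(path, needle):
    """Count lines containing `needle` (a bytes object), streaming the file."""
    count = 0
    with open(path, "rb") as f:
        for raw_line in f:  # raw_line is bytes, read one line at a time
            if needle in raw_line:
                count += 1
    return count


def demo():
    # Hypothetical sample data written to a temporary file.
    with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
        tmp.write(b"alpha\nbeta\ngamma\n")
        path = tmp.name
    try:
        return count_matching(path, b"et")
    finally:
        os.remove(path)
```

If a matching line has to be handled as text, it can be decoded individually with raw_line.decode("utf-8"), assuming the file is UTF-8 encoded.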
Prefer sets and dictionaries as data types. Avoid lists whenever you need lookups (they are fine when you only iterate): a membership test on a list scans every element, so it becomes extremely slow on large data. Likewise, if your data is already in a set or dictionary, don't convert it to a list before operating on it.
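To illustrate the difference, here is a small sketch that converts the lookup data to a set once and then tests membership against it. The function name count_hits and its parameters are assumptions made for this example.

```python
def count_hits(queries, known_ids):
    """Count how many queries appear in known_ids."""
    # Build the set once: O(len(known_ids)).
    lookup = set(known_ids)
    # Each `q in lookup` is a hash lookup (average O(1)),
    # whereas `q in known_ids` on a list would scan the whole list.
    return sum(1 for q in queries if q in lookup)
```

With a list, the cost grows with len(queries) * len(known_ids); with a set it grows roughly with len(queries) + len(known_ids).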
(1). Efficient dictionary lookups:
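# Slow: my_dict.values() is scanned from start to end on every test.
if value in my_dict.values():
    values_count += 1
# Use the form below instead of the one above.
# Testing a key against the dictionary itself is a hash lookup.
if key in my_dict:
    values_count += 1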
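When the test really has to be made against the values rather than the keys, one option is to collect the values into a set once and query that instead. The sketch below assumes the values are hashable; count_known_values and its parameters are hypothetical names.

```python
def count_known_values(candidates, mapping):
    """Count how many candidates occur among the values of mapping."""
    # One pass over the values; assumes every value is hashable.
    known_values = set(mapping.values())
    count = 0
    for candidate in candidates:
        if candidate in known_values:  # hash lookup, not a linear scan
            count += 1
    return count
```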
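(2). In Python 2, prefer iteritems() over items(): items() builds a complete list of key-value pairs, while iteritems() returns an iterator that yields the pairs one at a time. In Python 3, iteritems() no longer exists and items() returns a lightweight view, so items() is the call to use there.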
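For Python 3, a rough sketch of the equivalent loop looks like this. The function total_length is a made-up example that assumes both keys and values are strings.

```python
def total_length(mapping):
    """Sum the lengths of all keys and values in mapping."""
    total = 0
    # In Python 3, items() returns a view: no list of pairs is built.
    for key, value in mapping.items():
        total += len(key) + len(value)
    return total
```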