Using Counter for counting statistics
Everyone is surely familiar with counting statistics: it simply means counting how many times each item appears. This need comes up often in practice, for example counting how often a value occurs in a set of test samples, or how often the same message string appears when analyzing log files. There are many ways to implement it. Let's look at implementations that use different data structures, one by one.
First, using dict
Let's start with the dict approach. Without further ado, here is the code:
some_data = ['a', '2', 2, 4, 5, '2', 'b', 4, 7, 'a', '5', 'd', 'a', 'z']  # create the list
count_frq = dict()  # create an empty dict
# count how often each item occurs
for item in some_data:
    if item in count_frq:
        count_frq[item] += 1
    else:
        count_frq[item] = 1
print(count_frq)
result:
{'a': 3, '2': 2, 2: 1, 4: 2, 5: 1, 'b': 1, 7: 1, '5': 1, 'd': 1, 'z': 1}
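The if/else branch above can be folded into a single line with dict.get. The following is a minimal sketch of that variant, reusing the same some_data list. It is an alternative way to write the loop, not a different algorithm.

```python
some_data = ['a', '2', 2, 4, 5, '2', 'b', 4, 7, 'a', '5', 'd', 'a', 'z']
count_frq = {}
for item in some_data:
    # get() returns 0 for a key that has not been seen yet,
    # so no explicit membership test is needed
    count_frq[item] = count_frq.get(item, 0) + 1
print(count_frq)
```

The result is the same dictionary as in the if/else version.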
Second, using set and list
Here is another common method:
some_data = ['a', '2', 2, 4, 5, '2', 'b', 4, 7, 'a', '5', 'd', 'a', 'z']
count_set = set(some_data)  # remove duplicates
count_list = []
for item in count_set:
    count_list.append((item, some_data.count(item)))  # append an (item, count) pair
print(count_list)
result (a set is unordered, so the order of the pairs may differ between runs):
[('5', 1), (2, 1), ('2', 2), (4, 2), (5, 1), (7, 1), ('a', 3), ('z', 1), ('b', 1), ('d', 1)]
Third, using collections
The methods above are fairly simple, but is there a more elegant, more Pythonic solution? Let's start with defaultdict.
1.1 defaultdict
from collections import defaultdict
some_data = ['a', '2', 2, 4, 5, '2', 'b', 4, 7, 'a', '5', 'd', 'a', 'z']
count_frq = defaultdict(int)  # missing keys start at int(), i.e. 0
# count the occurrences
for item in some_data:
    count_frq[item] += 1
print(count_frq)
result:
defaultdict(<class 'int'>, {'a': 3, '2': 2, 2: 1, 4: 2, 5: 1, 'b': 1, 7: 1, '5': 1, 'd': 1, 'z': 1})
1.2 Counter
The Counter class was added in Python 2.7. It is a subclass of dict, a container used mainly to count hashable objects. It supports the operators +, -, & and |, where & and | return the minimum and maximum count of each element across the two Counter objects, respectively. It can be initialized in three different ways (from an iterable, from keyword arguments, or from a dictionary). Look how much it cuts down the number of lines of code:
from collections import Counter
some_data = ['a', '2', 2, 4, 5, '2', 'b', 4, 7, 'a', '5', 'd', 'a', 'z']
count_counter = Counter(some_data)  # count everything in one call
print(count_counter)  # and the result is ready; powerful, isn't it?
result:
Counter({'a': 3, '2': 2, 4: 2, 2: 1, 5: 1, 'b': 1, 7: 1, '5': 1, 'd': 1, 'z': 1})
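The operators +, -, & and | mentioned in the introduction to Counter can be sketched as follows. The two sample strings are chosen only for illustration, and the comments list the resulting counts rather than the exact printed form.

```python
from collections import Counter

a = Counter("success")  # s:3, c:2, u:1, e:1
b = Counter("access")   # s:2, c:2, a:1, e:1

total = a + b    # add counts element by element: s:5, c:4, e:2, u:1, a:1
diff = a - b     # subtract counts, keeping only positive results: s:1, u:1
common = a & b   # minimum of each count: s:2, c:2, e:1
union = a | b    # maximum of each count: s:3, c:2, u:1, e:1, a:1

print(total)
print(diff)
print(common)
print(union)
```

Note that the - operator drops elements whose count falls to zero or below, which is different from the subtract() method, where zero and negative counts are kept.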
Counter is not limited to lists; it can count the elements of any iterable object:
- Iterable object (a string)
Counter("success")  # iterable object
print(Counter("success"))
result:
Counter({'s': 3, 'c': 2, 'u': 1, 'e': 1})
- Keyword arguments
Counter(s=3, c=2, e=1, u=1)  # keyword arguments
print(Counter(s=3, c=2, e=1, u=1))
result:
Counter({'s': 3, 'c': 2, 'e': 1, 'u': 1})
- Dictionary
Counter({'s': 3, 'c': 2, 'e': 1, 'u': 1})
print(Counter({'s': 3, 'c': 2, 'e': 1, 'u': 1}))
result:
Counter({'s': 3, 'c': 2, 'e': 1, 'u': 1})
Use the elements() method to iterate over the keys of a Counter, with each key repeated as many times as its count:
print(list(Counter(some_data).elements()))  # each key is repeated according to its count
result:
['a', 'a', 'a', '2', '2', 2, 4, 4, 5, 'b', 7, '5', 'd', 'z']
Even handier, the most_common(N) method returns the N most frequent elements together with their counts:
count = Counter(some_data).most_common(2)  # get the two most frequent elements
print(count)
result:
[('a', 3), ('2', 2)]
Accessing an element that does not exist returns 0 by default instead of raising a KeyError:
print(Counter(some_data)["y"])
result:
0
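One detail worth knowing here: looking up a missing key in a Counter does not add that key, whereas a defaultdict(int) stores the default value as soon as the key is accessed. A minimal sketch of the difference, with variable names chosen only for illustration:

```python
from collections import Counter, defaultdict

counter = Counter("success")
dd = defaultdict(int, counter)  # same counts, held in a defaultdict

print(counter["y"])    # 0
print("y" in counter)  # False: the lookup did not create the key
print(dd["y"])         # 0
print("y" in dd)       # True: defaultdict inserted the default value
```

So repeated lookups of missing elements leave a Counter unchanged, while a defaultdict grows with every new key that is read.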
The update() method adds the counts from another iterable or mapping to an existing Counter. Counts for elements that already exist are added together rather than replaced:
c = Counter("success")
print(c)
result:
Counter({'s': 3, 'c': 2, 'u': 1, 'e': 1})
Now update it:
c.update("successfully")
print(c)
result:
Counter({'s': 6, 'c': 4, 'u': 3, 'e': 2, 'l': 2, 'f': 1, 'y': 1})
The subtract() method subtracts the counts of the input from the Counter; the resulting counts are allowed to be zero or negative. Continuing from the Counter just updated with "successfully":
c.subtract("successfully")
print(c)
result:
Counter({'s': 3, 'c': 2, 'u': 1, 'e': 1, 'f': 0, 'l': 0, 'y': 0})
If you are interested, explore the rest on your own!