Learning the Python standard library (1): collections

Disclaimer: This is an original article by the blogger, licensed under CC 4.0 BY-SA. When reproducing it, please attach the original source link and this statement.
Original link: https://blog.csdn.net/github_37999869/article/details/98665855

The standard library collections

2019-08-07

The standard library module collections builds on Python's list, tuple, and dictionary (dict) types and adds several data types that can be regarded as extensions of those built-in types:

Data type     Purpose
namedtuple    Tuple whose elements can be accessed by attribute name
deque         Double-ended queue (operations at both the front and the rear)
Counter       Dictionary subclass for counting hashable objects
OrderedDict   Ordered dictionary
defaultdict   Dictionary with default values
ChainMap      Combines multiple dictionaries into a single mapping

In addition to these data types, collections also provides UserDict, UserList, and UserString, three wrapper classes around dict, list, and str objects. They are not discussed in this article.

1 collections.namedtuple(typename, field_names, *, rename=False, defaults=None, module=None)

namedtuple, also called a named tuple, is a "factory function" that creates a subclass of tuple and extends it. Normally the elements (items) of a tuple can only be accessed by index, for example:

a_tuple = ('a', 'b', 'c', 2, 3, 4)
s = a_tuple[-1]  # result is 4

namedtuple extends this so that the elements of a tuple can also be accessed by name. The benefit is that when a tuple is used to store data, the meaning of each element becomes clear. For example, create a namedtuple called Staff:

from collections import namedtuple
 
Staff=namedtuple("Staff",['name','age','email'])

mars = Staff('Mars', 30, 'mars03@*****.com')
june = Staff('June', 27, 'june_hcs@*****.com')
june.email    # result is 'june_hcs@*****.com'

This is similar to defining a class called Staff whose constructor (__init__) sets the three attributes name, age, and email. That is also why namedtuple is called a factory function.
namedtuple provides three useful methods: _make, _replace, and _asdict. _make converts an ordinary tuple (or any iterable) into a namedtuple:

# Convert an ordinary tuple into a Staff
kar = ('Kar',31,'kcar@****.com')
kar_nmdtp = Staff._make(kar)    # Staff(name='Kar', age=31, email='kcar@****.com')

_replace returns a new namedtuple in which the values of the given fields, identified by name, are replaced:

kar = Staff('Kar',31,'kcar@****.com')
kar_new = kar._replace(email='kar03@****.com')  # note: the email value in the original tuple kar is not modified

_asdict converts a namedtuple into a dictionary. On Python 3.1 to 3.7 the result is an OrderedDict; since Python 3.8 it is a regular dict:

kar = Staff('Kar',31,'kcar@****.com')
kar._asdict()   # Python 3.7 and earlier: OrderedDict([('name', 'Kar'), ('age', 31), ('email', 'kcar@****.com')]); Python 3.8+: {'name': 'Kar', 'age': 31, 'email': 'kcar@****.com'}
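
The signature in the section heading also lists the keyword arguments defaults and rename, which the examples above do not use. The following is a minimal sketch of both, not an exhaustive reference; Point and Row are hypothetical names chosen only for illustration, and defaults assumes Python 3.7 or later.

```python
from collections import namedtuple

# Point is a hypothetical record type; defaults are applied to the rightmost fields
Point = namedtuple("Point", ["x", "y", "z"], defaults=[0])
p = Point(1, 2)
print(p)                # Point(x=1, y=2, z=0)

x, y, z = p             # a namedtuple still unpacks like an ordinary tuple
print(Point._fields)    # ('x', 'y', 'z')

# rename=True replaces invalid field names (keywords, duplicates) with positional names
Row = namedtuple("Row", ["id", "class", "id"], rename=True)
print(Row._fields)      # ('id', '_1', '_2')
```

Without rename=True, the second call would raise ValueError because class is a keyword and id appears twice.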

2 collections.deque([iterable[, maxlen]])

deque is short for double-ended queue. The parameter maxlen specifies the maximum length of the queue.
Because the queue is double-ended, list-style operations such as append, pop, and extend work at both ends: each of them has a counterpart with a "left" suffix:

from collections import deque

dlst = deque(['a','b','c','d','e','f','g'])
dlst.append('h')              # deque(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
dlst.appendleft('0')          # deque(['0', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
dlst.pop()                    # deque(['0', 'a', 'b', 'c', 'd', 'e', 'f', 'g'])
dlst.popleft()                # deque(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
dlst.extend(['h','i'])        # deque(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'])
dlst.extendleft(['0','1'])    # deque(['1', '0', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'])
dlst.rotate(2)                # deque(['h', 'i', '1', '0', 'a', 'b', 'c', 'd', 'e', 'f', 'g'])

Finally, the rotate(n) method works as follows: when n > 0, the last n elements are moved to the beginning; when n < 0, the first |n| elements are moved to the end.
The parameter maxlen specifies the maximum length of the deque. Once that length is reached, each newly added element pushes an element out at the opposite end:

from collections import deque

dlst = deque(['a','b','c','d','e'], maxlen=8)
dlst.extend(i for i in 'hijklmn')    # deque(['e', 'h', 'i', 'j', 'k', 'l', 'm', 'n'])
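
As a small sketch of how maxlen and a negative rotate behave, the following keeps only the last few items of an iterable and then rotates a deque to the left. The helper name last_n is hypothetical and used only for this illustration.

```python
from collections import deque

def last_n(iterable, n=3):
    """Return the last n items of an iterable (hypothetical helper)."""
    # A bounded deque silently discards the oldest items as new ones arrive
    return list(deque(iterable, maxlen=n))

print(last_n(range(10)))    # [7, 8, 9]

d = deque([1, 2, 3, 4, 5])
d.rotate(-2)                # negative n: the first two elements move to the end
print(d)                    # deque([3, 4, 5, 1, 2])
```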

3 collections.Counter([iterable-or-mapping])

Counter is a container class for counting. It is a subclass of dict that counts hashable objects: the elements are stored as keys and their counts as values. The following example counts how many times each character appears in a string and returns a dictionary-like Counter:

from collections import Counter

s_str = '''It was the best of times, \
it was the worst of times, \
it was the age of wisdom, \
it was the age of foolishness, \
it was the epoch of belief,\
it was the epoch of incredulity, \
it was the season of Light,it was the season of Darkness, \
it was the spring of hope, it was the winter of despair,\
we had everything before us,we had nothing before us, \
we were all going direct to Heaven, \
we were all going direct the other way--in short,\
the period was so far like the present period, \
that some of its noisiest authorities insisted on its being received,\
for good or for evil, in the superlative degree of comparison only.'''
# First strip punctuation and spaces from the string, then count how often each letter occurs
s_str = s_str.replace(' ','').replace(',','').replace('.','').replace('-','')
r = Counter(s_str)
# Counter({'e': 69, 't': 48, 'o': 44, 'i': 44, 's': 42, 'a': 28, 
# 'h': 27, 'r': 27, 'n': 22, 'w': 21, 'f': 19, 'g': 13, 'd': 13,
#  'l': 11, 'p': 10, 'c': 7, 'b': 5, 'm': 5, 'u': 5, 'v': 5, 
#  'y': 4, 'k': 2, 'I': 1, 'L': 1, 'D': 1, 'H': 1})

Counter provides a number of useful methods. One of them is most_common(n=None), which returns the n most frequent elements together with their counts; if n is omitted, all elements are returned, ordered from most to least frequent:

# Continuing the example above
r.most_common(2)    # [('e', 69), ('t', 48)]; note that this is a list whose elements are tuples

elements() returns an iterator that yields each element as many times as its count:

v_str = 'Each thing, as far as it can by its own power, strives to persevere in its being' 
v_cnt = Counter(v_str.replace(' ','').replace(',',''))
sorted(v_cnt.elements())  # returns a sorted list in which each character of v_str is repeated as many times as it occurs
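
The behavior of most_common() and elements() is easier to see on a tiny counter. This is a minimal sketch with made-up counts; it also shows that looking up a missing key returns 0 instead of raising KeyError.

```python
from collections import Counter

c = Counter(a=3, b=1, c=2)     # made-up counts for illustration

print(list(c.elements()))      # ['a', 'a', 'a', 'b', 'c', 'c']
print(c.most_common())         # [('a', 3), ('c', 2), ('b', 1)]
print(c.most_common(1))        # [('a', 3)]
print(c['z'])                  # 0 (missing keys count as zero)
```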

The source code shows that the Counter class also defines the special methods __add__, __sub__, __iadd__, __isub__, __and__, and __or__, so Counter objects support addition, subtraction, union, and intersection (the examples here are taken directly from the source code):

# Addition and subtraction of counters
Counter('abbb') + Counter('bcc')    # Counter({'b': 4, 'c': 2, 'a': 1})
Counter('abbbc') - Counter('bccd')  # Counter({'b': 2, 'a': 1})
c = Counter('abbbc')
c -= Counter('bccd')  # Counter({'b': 2, 'a': 1})
# Union and intersection
Counter('abbb') | Counter('bcc')  # Counter({'b': 3, 'c': 2, 'a': 1})
Counter('abbb') & Counter('bcc')  # Counter({'b': 1})
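
One detail worth noting about the operators above is that they drop any element whose resulting count is zero or negative. As a hedged sketch of the difference, the subtract() method keeps such counts, and the unary plus removes them again:

```python
from collections import Counter

c = Counter('abbbc')
c.subtract(Counter('bccd'))    # in-place; zero and negative counts are kept
print(c['b'], c['c'], c['d'])  # 2 -1 -1

positive = +c                  # unary plus drops counts that are not positive
print(positive)                # Counter({'b': 2, 'a': 1})
```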

4 collections.OrderedDict([items])

OrderedDict is a subclass of dict that additionally maintains a linked list recording the order in which elements were inserted, which makes the dictionary ordered. Normal iteration follows insertion order, so the first key inserted is the first one returned:

from collections import OrderedDict

a_odct = OrderedDict()
a_odct['a'] = 'a101'
a_odct['b'] = 'a102'
a_odct['c'] = 'a103'
for k, v in a_odct.items():
    print(k, ':', v)
    
# a : a101
# b : a102
# c : a103

Because an OrderedDict is ordered, it can support stack-like and queue-like operations as a list does. This is what popitem(last=True) provides:

a_odct = OrderedDict()
a_odct['a'] = 'a101'
a_odct['b'] = 'a102'
a_odct['c'] = 'a103'
a_odct['d'] = 'b101'

a_odct.popitem(last=False)   # ('a', 'a101')

The last parameter of popitem() defaults to True, which gives LIFO behavior (Last In, First Out); when it is set to False, the result is FIFO behavior (First In, First Out).
Another method, move_to_end(key, last=True), moves the specified key to the head or the tail. The last parameter defaults to True, which moves the specified element to the tail:

a_odct = OrderedDict()
a_odct['a'] = 'a101'
a_odct['b'] = 'a102'
a_odct['c'] = 'a103'
a_odct['d'] = 'b101'
a_odct.move_to_end('b')  # the dictionary becomes OrderedDict([('a', 'a101'), ('c', 'a103'), ('d', 'b101'), ('b', 'a102')])
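
Taken together, move_to_end() and popitem(last=False) are enough to build a small least-recently-used cache. The class below is an illustrative sketch only; LRUCache, its capacity parameter, and its get/put methods are hypothetical names and not part of collections.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny least-recently-used cache (illustrative sketch)."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key, default=None):
        if key not in self.data:
            return default
        self.data.move_to_end(key)          # mark key as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)          # new or updated keys go to the tail
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the oldest entry (FIFO end)

cache = LRUCache(capacity=2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')             # 'a' becomes the most recently used key
cache.put('c', 3)          # capacity exceeded, so 'b' is evicted
print(list(cache.data))    # ['a', 'c']
```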

5 collections.defaultdict([default_factory[, …]])

defaultdict is built around a factory function that supplies default values. With an ordinary dictionary, accessing a key that does not exist raises KeyError:

from collections import defaultdict
a_dct = {'a':'a101', 'b':'a102','c':'a103', 'd':'b101'}
a_dct['r']   # KeyError: 'r'

For a missing key, the get() method can supply a default value. defaultdict offers another way to avoid KeyError: the dictionary itself is given a default, which is returned whenever a key is not present.
defaultdict is a subclass of dict. When it is created, it takes a type or a callable that accepts no arguments; the default value for a missing key is whatever that callable returns:

from collections import defaultdict

a_dft = defaultdict(int)
a_dft['a']  # 0; similarly, if list is passed as the argument, [] is returned

from random import randint
from collections import defaultdict

a_dfd = defaultdict(lambda: randint(0, 100)) # note that the anonymous function passed in takes no arguments
a_dfd['r'] = 23
a_dfd['s'] = 41
a_dfd['a']  # returns a random integer between 0 and 100 (inclusive)
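
A common use of defaultdict(list) is grouping items by a key. The following is a minimal sketch with made-up data; it also shows that merely reading a missing key inserts it, whereas get() does not call the factory.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']   # made-up data

by_letter = defaultdict(list)
for w in words:
    by_letter[w[0]].append(w)    # a missing key starts out as an empty list

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

print(by_letter.get('q'))        # None; get() does not create the key
print('q' in by_letter)          # False
by_letter['z']                   # indexing a missing key inserts it
print('z' in by_letter)          # True
```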

The following counting example helps to understand the role of defaultdict, and it is also the reason defaultdict is often compared with Counter.

cnt_lst = ['c','b','a','c','d','a','d','c','d','a']
# This version raises KeyError, because the dictionary does not yet contain the key 'c'
cnt = {}
for cr in cnt_lst:
    cnt[cr] +=1
# This version improves on the logic above by first checking whether the key exists
cnt = {}
for cr in cnt_lst:
    if cr not in cnt:
        cnt[cr] = 1
    else:
        cnt[cr] += 1
cnt
# The dict.setdefault(key, default=None) method also works: if key exists, its value is returned; otherwise key is inserted with the default value
cnt = {}
for cr in cnt_lst:
    cnt[cr] = cnt.setdefault(cr, 0) + 1
# If cnt is a defaultdict from the start, the logic becomes very simple
cnt = defaultdict(int)
for cr in cnt_lst:
    cnt[cr] += 1
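
For comparison, the same tally can be produced with Counter in a single call. This sketch repeats the list from the example above so that it runs on its own.

```python
from collections import Counter, defaultdict

cnt_lst = ['c','b','a','c','d','a','d','c','d','a']

# defaultdict(int) version, as above
cnt = defaultdict(int)
for cr in cnt_lst:
    cnt[cr] += 1

# Counter does the counting itself
tally = Counter(cnt_lst)

print(dict(cnt) == dict(tally))   # True
print(tally)                      # Counter({'c': 3, 'a': 3, 'd': 3, 'b': 1})
```

Counter is the more direct tool for pure counting, while defaultdict is more general because the default can be any type, such as a list or a set.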

6 collections.ChainMap(*maps)

ChainMap can be called a "combined dictionary". According to the official documentation, "a ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls."
ChainMap takes any number of dictionaries and "merges" them. "Merges" is in quotes because ChainMap does not actually merge the dictionaries; it only creates a mapping over them. As a result, the contents of the original dictionaries are left unchanged, and the operation is faster than a real merge:

from collections import ChainMap

a_dct = {'a':1, 'b':2}
b_dct = {'c':3, 'd':4}
c_dct = {'a':5, 'c':6}
r = ChainMap(a_dct, b_dct, c_dct)    # ChainMap({'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6})
print(r['b'])                        # 2
print(r['a'])                        # 1

The final r['a'] lookup shows that when the same key exists in several dictionaries, ChainMap searches them in order and returns the first value found.
Next, let's see how changes to the ChainMap and to the underlying dictionaries affect each other:

from collections import ChainMap

a_dct = {'a':1, 'b':2}
b_dct = {'c':3, 'd':4}
c_dct = {'a':5, 'c':6}
r = ChainMap(a_dct, b_dct, c_dct)    # ChainMap({'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6})
r['s'] = 11                          # ChainMap({'a': 1, 'b': 2, 's': 11}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6}); a_dct is now {'a': 1, 'b': 2, 's': 11}
b_dct.pop('c')                       # ChainMap({'a': 1, 'b': 2, 's': 11}, {'d': 4}, {'a': 5, 'c': 6})
c_dct.update({'c':9})                # ChainMap({'a': 1, 'b': 2, 's': 11}, {'d': 4}, {'a': 5, 'c': 9})

As can be seen, changes to the original dictionaries are reflected in the ChainMap. When a key is added through the ChainMap, it is stored in the first dictionary, so the data of that original dictionary changes too.
ChainMap provides three attributes and methods: maps, new_child(m=None), and parents. According to the official documentation, maps is "a user updateable list of mappings. The list is ordered from first-searched to last-searched. It is the only stored state and can be modified to change which mappings are searched."
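
A typical use of this lookup order is layering overrides on top of defaults. The following is a minimal sketch; the dictionaries defaults and overrides and their keys are hypothetical. It also shows that writes and deletions only ever touch the first mapping.

```python
from collections import ChainMap

defaults = {'color': 'red', 'user': 'guest'}    # hypothetical settings
overrides = {'user': 'admin'}
settings = ChainMap(overrides, defaults)

print(settings['user'])      # admin (found in overrides first)
print(settings['color'])     # red (falls back to defaults)

settings['color'] = 'blue'   # the write goes to the first mapping only
print(overrides)             # {'user': 'admin', 'color': 'blue'}
print(defaults)              # {'color': 'red', 'user': 'guest'}

del settings['user']         # deletes from the first mapping only
print(settings['user'])      # guest
```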

from collections import ChainMap

a_dct = {'a':1, 'b':2}
b_dct = {'c':3, 'd':4}
c_dct = {'a':5, 'c':6}
r = ChainMap(a_dct, b_dct, c_dct)    # ChainMap({'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6})
print(r.maps)                        # [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6}]
r.maps.append({'w':3})
print(r)                             # ChainMap({'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6}, {'w': 3})

new_child(m=None) "returns a new ChainMap containing a new map followed by all of the maps in the current instance. If m is specified, it becomes the new map at the front of the list of mappings; if not specified, an empty dict is used, so that a call to d.new_child() is equivalent to ChainMap({}, *d.maps). This method is used for creating subcontexts that can be updated without altering values in any of the parent mappings."

from collections import ChainMap

a_dct = {'a':1, 'b':2}
b_dct = {'c':3, 'd':4}
c_dct = {'a':5, 'c':6}
r = ChainMap(a_dct, b_dct, c_dct)    # ChainMap({'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6})
s = r.new_child()                    # s is ChainMap({}, {'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6}); r is not changed
s['w'] = 12
print(s)                             # ChainMap({'w': 12}, {'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6})

If the parameter m is specified, it is placed at the front of the new ChainMap.

from collections import ChainMap

a_dct = {'a':1, 'b':2}
b_dct = {'c':3, 'd':4}
c_dct = {'a':5, 'c':6}
r = ChainMap(a_dct, b_dct, c_dct)    # ChainMap({'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6})
s = r.new_child(c_dct)               # ChainMap({'a': 5, 'c': 6}, {'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6}); r is not changed

parents is a "property returning a new ChainMap containing all of the maps in the current instance except the first one. This is useful for skipping the first map in the search."

from collections import ChainMap

a_dct = {'a':1, 'b':2}
b_dct = {'c':3, 'd':4}
c_dct = {'a':5, 'c':6}
r = ChainMap(a_dct, b_dct, c_dct)    # ChainMap({'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'a': 5, 'c': 6})
s = r.parents                        # s is ChainMap({'c': 3, 'd': 4}, {'a': 5, 'c': 6}); r is not changed
v = r.parents.parents                # v is ChainMap({'a': 5, 'c': 6}); r and s are not changed
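
new_child() and parents together resemble nested variable scopes. The sketch below is only an illustration of that idea; the names global_scope and local_scope are hypothetical.

```python
from collections import ChainMap

global_scope = ChainMap({'x': 1, 'y': 2})
local_scope = global_scope.new_child({'x': 10})   # inner scope shadows x

print(local_scope['x'])           # 10 (found in the inner scope)
print(local_scope['y'])           # 2 (inherited from the outer scope)
print(local_scope.parents['x'])   # 1 (skips the inner scope)

local_scope['y'] = 20             # written to the inner scope only
print(global_scope['y'])          # 2 (the outer scope is untouched)
```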
