Python Advanced Tutorial: The heapq Module

0. Abstract

This article introduces the main functions provided by the heapq module.

1. nlargest() and nsmallest()

As the names suggest:

heapq.nlargest(n, iterable, key=None): returns a list of the n largest items in the iterable.
heapq.nsmallest(n, iterable, key=None): returns a list of the n smallest items in the iterable.

import heapq

nums = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(heapq.nlargest(3, nums))   # result: [9, 8, 7]
print(heapq.nsmallest(3, nums))  # result: [0, 1, 2]

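Unlike max() and min(), which each return a single value, nlargest() and nsmallest() return the n largest or n smallest values as a list: nlargest() in descending order and nsmallest() in ascending order.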
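As a quick check of the ordering, the sketch below compares both functions with slices of a fully sorted list. The list `data` is made up for illustration and does not come from the examples above.

```python
import heapq

# Hypothetical sample data, chosen only for illustration.
data = [5, 1, 9, 3, 7, 2]

largest = heapq.nlargest(3, data)    # largest value first
smallest = heapq.nsmallest(3, data)  # smallest value first

# Each call matches a slice of a fully sorted copy of the list.
same_as_sorted_desc = largest == sorted(data, reverse=True)[:3]
same_as_sorted_asc = smallest == sorted(data)[:3]

print(largest, smallest)
print(same_as_sorted_desc, same_as_sorted_asc)
```

When n is close to the length of the input, sorting and slicing is usually the simpler choice; the heap-based functions pay off when n is small relative to the input.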

What makes nlargest() and nsmallest() more powerful is that they accept a key parameter, which lets them work on more complex data structures.

import heapq

portfolio = [
   {'name': 'IBM', 'shares': 100, 'price': 91.1},
   {'name': 'AAPL', 'shares': 50, 'price': 543.22},
   {'name': 'FB', 'shares': 200, 'price': 21.09},
   {'name': 'HPQ', 'shares': 35, 'price': 31.75},
   {'name': 'YHOO', 'shares': 45, 'price': 16.35},
   {'name': 'ACME', 'shares': 75, 'price': 115.65}
]

cheap = heapq.nsmallest(3, portfolio, key=lambda s: s['price'])
expensive = heapq.nlargest(3, portfolio, key=lambda s: s['price'])

print(cheap)
print(expensive)
#result:[{'name': 'YHOO', 'shares': 45, 'price': 16.35}, {'name': 'FB', 'shares': 200, 'price': 21.09}, {'name': 'HPQ', 'shares': 35, 'price': 31.75}]
#result:[{'name': 'AAPL', 'shares': 50, 'price': 543.22}, {'name': 'ACME', 'shares': 75, 'price': 115.65}, {'name': 'IBM', 'shares': 100, 'price': 91.1}]

With the key parameter, you can select the n largest or smallest entries according to one field of each dictionary.

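Note: the value returned by key does not have to be a number. Strings work as well; they are compared by Unicode code point, which matches ASCII order for ASCII characters.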
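A minimal sketch of a string-valued key follows. The `stocks` list is a hypothetical cut-down variant of the portfolio, with one lowercase name added to show how case affects the order.

```python
import heapq

# Hypothetical records, not part of the original example.
stocks = [
    {'name': 'IBM', 'price': 91.1},
    {'name': 'AAPL', 'price': 543.22},
    {'name': 'FB', 'price': 21.09},
    {'name': 'acme', 'price': 115.65},
]

# The key returns a string, so the records are ordered by name.
first_two = heapq.nsmallest(2, stocks, key=lambda s: s['name'])
names = [s['name'] for s in first_two]

# Uppercase letters have smaller code points than lowercase ones,
# so 'acme' sorts after every uppercase name.
last_one = heapq.nlargest(1, stocks, key=lambda s: s['name'])[0]['name']

print(names, last_one)
```

If a case-insensitive order is wanted, one option is a key such as `lambda s: s['name'].lower()`.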

Tip: max() and min() also accept a key parameter, so replacing the calls above with max() and min() returns the single largest or smallest item.
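The tip can be sketched as follows, using a shortened, hypothetical copy of the portfolio so that the snippet runs on its own.

```python
# Hypothetical subset of the portfolio, for illustration only.
portfolio = [
    {'name': 'IBM', 'shares': 100, 'price': 91.1},
    {'name': 'AAPL', 'shares': 50, 'price': 543.22},
    {'name': 'YHOO', 'shares': 45, 'price': 16.35},
]

cheapest = min(portfolio, key=lambda s: s['price'])
priciest = max(portfolio, key=lambda s: s['price'])

# min() and max() return the item itself, not a one-element list.
print(cheapest['name'], priciest['name'])
```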

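2. heappush() and heappop()

heapq.heappush(heap, item): pushes item onto heap while keeping the heap ordering intact.
heapq.heappop(heap): removes and returns the root node, that is, the smallest item in heap.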
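Before the priority queue, here is a minimal sketch of the two functions on plain integers. The values are arbitrary and chosen only for illustration.

```python
import heapq

heap = []  # a plain list; the heapq functions keep it in heap order
for value in [5, 1, 8, 3]:
    heapq.heappush(heap, value)

smallest_seen = heap[0]  # peek at the smallest item without removing it

# Popping repeatedly yields the items from smallest to largest.
drained = []
while heap:
    drained.append(heapq.heappop(heap))

print(smallest_seen, drained)
```

Note that the list itself is not fully sorted at any point; only heap[0] is guaranteed to be the smallest item.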

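The example below uses heappush() and heappop() to implement a priority queue in which each pop returns the item with the highest priority.

Note: heapq.heappop() pops the smallest entry, so the priority is negated on push to reverse the order.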

import heapq

class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0

    def push(self, item, priority):
        heapq.heappush(self._queue, (-priority, self._index, item))
        self._index += 1

    def pop(self):
        return heapq.heappop(self._queue)[-1]

# Example use
class Item:
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return 'Item({!r})'.format(self.name)

q = PriorityQueue()
q.push(Item('foo'), 1)
q.push(Item('bar'), 5)
q.push(Item('spam'), 4)
q.push(Item('grok'), 1)

print("Should be bar:", q.pop())
print("Should be spam:", q.pop())
print("Should be foo:", q.pop())
print("Should be grok:", q.pop())
#result:Should be bar: Item('bar')
#result:Should be spam: Item('spam')
#result:Should be foo: Item('foo')
#result:Should be grok: Item('grok')

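The index is introduced to handle entries that share the same priority: such entries are ordered by index, so they come out in the order they were pushed.

With a plain (-priority, item) tuple, when two priorities are equal the tuple comparison moves on to compare the items.

However, Item instances define no ordering methods, so they cannot be compared with < in Python, and the attempt raises a TypeError.

Adding index avoids this problem because every entry has a different index. When two priorities are equal, the indexes are compared instead, and since they are never equal, the comparison never reaches the items and no error is raised.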
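The difference can be checked directly on tuples. This is a small sketch; the variables `a` and `b` and the priority value are made up for the demonstration.

```python
class Item:
    def __init__(self, name):
        self.name = name

a = Item('foo')
b = Item('grok')

# Same priority, no index: Python falls through to comparing the Items.
try:
    (-1, a) < (-1, b)
    comparison_failed = False
except TypeError:
    comparison_failed = True

# Same priority, distinct indexes: the comparison is settled at the index
# and never reaches the Item objects.
first_wins = (-1, 0, a) < (-1, 1, b)

print(comparison_failed, first_wins)
```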
