Talk about Python's bisect module

The bisect module contains two main functions (bisect and insort), which use binary search algorithms internally to find elements and insert elements in an ordered sequence.

bisect /baɪˈsekt/
to divide sth into two equal parts

1 bisect function

Luciano Ramalho gave an example of finding a needle in a haystack to illustrate how to use bisect.bisect and bisect.bisect_left.

HAYSTACK = [1, 4, 5, 6, 8, 12, 15, 20, 21, 23, 23, 26, 29, 30]
NEEDLES = [0, 1, 2, 5, 8, 10, 22, 23, 29, 30, 31]

ROW_FMT = '{0:2d} @ {1:2d}    {2}{0:<2d}'


def demo(bisect_fn):
    for needle in reversed(NEEDLES):
        position = bisect_fn(HAYSTACK, needle)
        offset = position * '  |'
        print(ROW_FMT.format(needle, position, offset))


if __name__ == '__main__':

    if sys.argv[-1] == 'left':
        bisect_fn = bisect.bisect_left
    else:
        bisect_fn = bisect.bisect

    print('DEMO:', bisect_fn.__name__)
    print('haystack ->', ' '.join('%2d' % n for n in HAYSTACK))
    demo(bisect_fn)

operation result:

DEMO: bisect_right
haystack ->  1  4  5  6  8 12 15 20 21 23 23 26 29 30
31 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |31
30 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |30
29 @ 13      |  |  |  |  |  |  |  |  |  |  |  |  |29
23 @ 11      |  |  |  |  |  |  |  |  |  |  |23
22 @  9      |  |  |  |  |  |  |  |  |22
10 @  5      |  |  |  |  |10
 8 @  5      |  |  |  |  |8 
 5 @  3      |  |  |5 
 2 @  1      |2 
 1 @  1      |1 
 0 @  0    0 

One feature of Python functions is that they can take function names as arguments, such as bisect_fn in the example. Doing so makes the function more flexible. We can use the function name as a program running parameter and load it dynamically.

HAYSTACK is a pile of haystacks, and NEEDLES is a pile of needles. Finding a needle in a haystack is essentially looking for a certain number in a sequence of numbers.

The custom demo(bisect_fn) function first calculates the position, then uses the position to calculate the required number of separators as the printing offset, and finally prints it out in the defined format.

str.format() is used to format a string, it can specify the position of the actual parameter. In the syntax similar to {0:2d}, 0 means the first input parameter, :2d means the total length, if not enough, a space is used as a placeholder; d means a decimal signed integer.

Alignment can also be set in str.format() format. ^, <,> mean center, left, and right, respectively. So {0:<2d} means the first input parameter is left-justified and occupies a two-digit decimal signed integer.

__name__It is a built-in class attribute of python, exists in a python program, and represents the name of the corresponding program. If it is the main thread, then its built-in name is __main__.

If the left parameter is added when running the program, the bisect_left function will be called inside the custom function of the program. The bisect function is actually an alias of the bisect_right function.

The difference between the bisect_left function and the bisect function is:

  1. The bisect_left function returns the position of the element equal to the inserted element in the original sequence. If a new element is inserted, the new element will be placed in front of the element equal to it .
    2.bisect function returns with the position of the original sequence after being inserted into the element's equal, if inserting a new element, the new element is then placed with its elements equal to the back .

Result of bisect_left function operation:

DEMO: bisect_left
haystack ->  1  4  5  6  8 12 15 20 21 23 23 26 29 30
31 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |31
30 @ 13      |  |  |  |  |  |  |  |  |  |  |  |  |30
29 @ 12      |  |  |  |  |  |  |  |  |  |  |  |29
23 @  9      |  |  |  |  |  |  |  |  |23
22 @  9      |  |  |  |  |  |  |  |  |22
10 @  5      |  |  |  |  |10
 8 @  4      |  |  |  |8 
 5 @  2      |  |5 
 2 @  1      |2 
 1 @  0    1 
 0 @  0    0 

The official python documentation also gives a sample program that uses the bisect function to output test scores:

def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
    i = bisect.bisect(breakpoints, score)
    return grades[i]

if __name__ == '__main__':
  results = [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
    logging.info('results -> %s', results)

operation result:

INFO - results -> ['F', 'A', 'C', 'C', 'B', 'A', 'A']

The custom grade() defines three parameters:

parameter name Description
score Test score
breakpoints The boundary value of the score level; here is divided into 5 levels; 90 and above, 80 ~ 89, 70 ~ 79, 60 ~ 69 and below 60.
grades Evaluation points range.

The grade() function first finds out its position through the bisect() function according to the incoming score, and then passes this position into the grades sequence to get the evaluation score.

In the main thread, iterate the sequence representing the student's grades through the for in grammar, pass the grades into the grade() function to calculate the evaluation score, and finally output it all at once through the sequence.

2 insort function

Because sorting is a time-consuming task, for an ordered sequence, it is best to keep the order when adding an element. The insort function will ensure that the sequence is always ordered when inserting.

    SIZE=10
    my_list=[]
    for i in range(SIZE):
        new_item=random.randrange(SIZE*3)
        bisect.insort(my_list,new_item)
        print('%2d -> '% new_item,my_list)

operation result:

18 ->  [18]
 8 ->  [8, 18]
21 ->  [8, 18, 21]
 5 ->  [5, 8, 18, 21]
19 ->  [5, 8, 18, 19, 21]
13 ->  [5, 8, 13, 18, 19, 21]
20 ->  [5, 8, 13, 18, 19, 20, 21]
 4 ->  [4, 5, 8, 13, 18, 19, 20, 21]
15 ->  [4, 5, 8, 13, 15, 18, 19, 20, 21]
 2 ->  [2, 4, 5, 8, 13, 15, 18, 19, 20, 21]

randrange() will return a random number within the given parameter range, but does not include the boundary value.

As you can see, every time you insert, the sequence always remains in order.

print('%2d -> '% new_item,my_list)The %s formatting syntax is adopted, %2d defines the format of the new_item value, and my_list will automatically hang after the format. So there is no parentheses after the second percent sign, and the parameters that need to be formatted are circled.

Insort also has a brother called insort_left, and the bottom layer uses bisect_left. The insort_left function will place the new element in front of its equal element.


In addition, the bisect function and the insort function have two optional parameters (lo and hi), which can be used to narrow the range of the sequence to be searched. The default value of lo is 0, and the default value of hi is the length of the sequence.

Guess you like

Origin blog.csdn.net/deniro_li/article/details/108894298