[Data Structure] Several common algorithms for finding elements I - linear search, binary search


Table of contents

Find

Method 1: Linear search

Method 2: Binary search


Find

        Suppose we have a sorted array, shown below. The task is to determine whether a given element exists in the array.

[Figure: a sorted array] /img/posts/skip-list/1simplearray.png

        We need to find the element with value 57 in this array.

Method 1: Linear search

        As the name suggests, linear search traverses the list from beginning to end, comparing each element's value with the one you are looking for. The traversal ends when a matching value is found.

        Find the value 57
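The search described above can be sketched in a few lines of Python. The array contents are an assumption on my part, since the figure is not reproduced in the text; any sorted array containing 57 works the same way.

```python
def linear_search(arr, target):
    """Return the index of target in arr, or -1 if it is absent.

    Scans every element from left to right, so the worst case is O(N).
    """
    for i, value in enumerate(arr):
        if value == target:
            return i
    return -1


# Assumed stand-in for the array in the figure.
arr = [3, 15, 26, 33, 48, 57, 62, 79, 98, 110]
print(linear_search(arr, 57))  # prints the index where 57 was found
```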

        Because the array is sorted, the search can also end as soon as we encounter a value greater than the target, since every subsequent value will only be larger.

        For example, to find the value 35
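The early-exit variant for sorted input is a one-line change: stop as soon as we pass the point where the target could have been. The array values are again assumed, since the figure is not shown.

```python
def linear_search_sorted(arr, target):
    """Linear search over a SORTED array that stops early.

    Once we see a value greater than target, the target cannot appear
    later, so we abandon the scan without checking the rest.
    """
    for i, value in enumerate(arr):
        if value == target:
            return i
        if value > target:
            break  # every remaining value is also > target
    return -1


arr = [3, 15, 26, 33, 48, 57, 62, 79, 98, 110]
linear_search_sorted(arr, 35)  # stops upon reaching 48 and returns -1
```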

Is this a good approach?

        This solution works. But look a little closer and you will see that it does not take full advantage of the array's ordering. In the worst case the time complexity is O(N): when the element we are looking for is not in the array at all, we must compare the target against every element.

Can the method be designed better?

        Of course: try binary search.

Method 2: Binary search

        The concept of binary search is simple. Think about how you look up a word in an English dictionary. To find the word "quirky", you don't flip from the first page to the last checking every word one by one, because the dictionary already lists its words in alphabetical order. You can skip straight to the pages around the letter q.

        Back to our example: first look at the value 48 in the middle of the array. If it matches our target value, the search ends immediately.

        But what if we are not looking for 48? Then we compare the middle value with the target. If the target is smaller than 48, the match (if any) must lie to its left, so we continue searching the left half and ignore everything on the right; symmetrically, if the target is larger, we search only the right half.

        Find the value 33

        With binary search, each comparison cuts the number of values left to examine in half. We can repeat this process: pick the middle value of the remaining range, compare it with the target, and if it does not match, decide whether to continue left or right. The range to be searched keeps shrinking, so the number of comparisons stays small.
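The halving process above translates directly into an iterative implementation. As before, the array values are an assumption standing in for the figure.

```python
def binary_search(arr, target):
    """Iterative binary search on a sorted array: O(log2 N) comparisons."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1


arr = [3, 15, 26, 33, 48, 57, 62, 79, 98, 110]
binary_search(arr, 48)  # the middle value matches on the first comparison
```

Python's standard library offers the same idea ready-made in the `bisect` module, which is usually the better choice in real code.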

        What is the current time complexity?

        The time complexity of the algorithm is determined by the number of comparisons. If you follow the algorithm above, you can see that its time complexity is O(log_2 n). For the full derivation, see  big O - What factors will affect the time complexity?
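The intuition behind that bound fits in one line: each comparison halves the remaining candidates, so after k comparisons at most n / 2^k values are left, and the search must end once only one remains:

```latex
\frac{n}{2^k} \le 1 \;\Rightarrow\; 2^k \ge n \;\Rightarrow\; k \ge \log_2 n
```

Hence the worst case needs about log_2 n comparisons.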

        This algorithm makes full use of the array's ordering, and it shines when a large number of values are stored in sorted order.

So is this a perfect solution?

        Clearly this method saves a lot of work when searching. But if we want to insert or delete elements, a sequential array is a poor fit.

        For example:

[Figure: inserting into an array] /img/posts/skip-list/8arrayinsertion.png

[Figure: deleting from an array] /img/posts/skip-list/9arraydeletion.png

        As you might imagine, insertion and deletion are cheap if there is spare space at the end of the array; otherwise, every insertion requires allocating new memory and copying the entire array into it. Python's List works in a similar way: it reserves extra capacity at the end of the sequence for future growth. When a List runs out of space, Python allocates a new, larger List, copies every value from the old List into the new one, and frees the old List. (This is why appending to a Python list is amortized O(1), even though an individual append can occasionally trigger a full copy.)
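You can watch this over-allocation happen with `sys.getsizeof`. The exact byte counts are a CPython implementation detail rather than a language guarantee, but the stepwise growth pattern is visible:

```python
import sys

# Append 20 items and record the list object's reported size each time.
# The size jumps in occasional steps instead of growing on every append,
# because CPython reserves extra capacity ahead of time.
lst = []
sizes = []
for _ in range(20):
    lst.append(0)
    sizes.append(sys.getsizeof(lst))

print(sizes)  # flat runs between occasional jumps
```

Most appends land in pre-reserved space, which is exactly why the amortized cost stays constant.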

Is there a data structure that performs better for insertion and deletion?

        Of course there is: the linked list!


Origin blog.csdn.net/weixin_42839065/article/details/131871698