One possible reason a Python program runs too slowly is that it does not use built-in functions and methods as much as it could. The following 5 examples demonstrate how to use built-ins to improve the performance of a Python program.
1. Array sum of squares
Given a list, calculate the sum of the squares of the numbers in it. The final version is about 1.4 times faster than the first.
First create a list of length 10000.
arr = list(range(10000))
1.1 The most conventional approach
A while loop iterates over the list by index and accumulates the sum of squares. The average run time is 2.97 ms.
def sum_sqr_0(arr):
    res = 0
    n = len(arr)
    i = 0
    while i < n:
        res += arr[i] ** 2
        i += 1
    return res
%timeit sum_sqr_0(arr)
2.97 ms ± 36.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
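The %timeit lines in this article are IPython magic commands. Outside IPython or Jupyter, a similar measurement can be sketched with the standard timeit module. The repeat and number values below are arbitrary choices made to keep the run short, and the absolute timings will differ from machine to machine.

```python
import timeit

arr = list(range(10000))

def sum_sqr_0(arr):
    res = 0
    n = len(arr)
    i = 0
    while i < n:
        res += arr[i] ** 2
        i += 1
    return res

# Take 5 measurements; each measurement calls the function 20 times.
number = 20
timings = timeit.repeat(lambda: sum_sqr_0(arr), repeat=5, number=number)

# Each entry in timings is the total time for `number` calls, in seconds.
best_ms = min(timings) / number * 1000
print(f"best of {len(timings)}: {best_ms:.2f} ms per loop")
```

Taking the minimum rather than the mean is one common convention, since slower runs are usually caused by other activity on the machine.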
1.2 for range instead of the while loop
Avoid the extra overhead of variable type checking for i += 1. The average run time is 2.9 ms.
def sum_sqr_1(arr):
    res = 0
    for i in range(len(arr)):
        res += arr[i] ** 2
    return res
%timeit sum_sqr_1(arr)
2.9 ms ± 137 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.3 for x in arr instead of for range
Avoid the extra overhead of variable type checking for arr[i]. The average run time is 2.59 ms.
def sum_sqr_2(arr):
    res = 0
    for x in arr:
        res += x ** 2
    return res
%timeit sum_sqr_2(arr)
2.59 ms ± 89 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.4 Apply the sum function to map
The average run time is 2.36 ms.
def sum_sqr_3(arr):
    return sum(map(lambda x: x**2, arr))
%timeit sum_sqr_3(arr)
2.36 ms ± 15.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.5 Apply the sum function to a generator expression
When a generator expression is the only argument of a function call, its own parentheses can be omitted. The average run time is 2.35 ms.
def sum_sqr_4(arr):
    return sum(x ** 2 for x in arr)
%timeit sum_sqr_4(arr)
2.35 ms ± 107 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
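The rule about omitting the parentheses can be illustrated with a small sketch. The short list and the start value of 100 are made up for the illustration.

```python
arr = list(range(10))

# Sole argument: the generator expression's own parentheses may be dropped.
total_a = sum(x ** 2 for x in arr)
total_b = sum((x ** 2 for x in arr))

# With a second argument (here sum's start value), the parentheses are required;
# writing sum(x ** 2 for x in arr, 100) is a SyntaxError.
total_c = sum((x ** 2 for x in arr), 100)

print(total_a, total_b, total_c)  # 285 285 385
```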
1.6 Apply the sum function to a list comprehension
The average run time is 2.06 ms.
def sum_sqr_5(arr):
    return sum([x ** 2 for x in arr])
%timeit sum_sqr_5(arr)
2.06 ms ± 27.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2. String concatenation
Given a list of strings, concatenate the first 3 characters of each string into a single string. The final version is about 2.1 times faster than the first.
First create a list of 10000 strings of random length and content.
from random import randint
def random_letter():
    return chr(ord('a') + randint(0, 25))
def random_letters(n):
    return "".join([random_letter() for _ in range(n)])
strings = [random_letters(randint(1, 10)) for _ in range(10000)]
2.1 The most conventional approach
A while loop traverses the list by index and concatenates the strings. The average run time is 1.86 ms.
def concat_strings_0(strings):
    res = ""
    n = len(strings)
    i = 0
    while i < n:
        res += strings[i][:3]
        i += 1
    return res
%timeit concat_strings_0(strings)
1.86 ms ± 74.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2.2 for range instead of the while loop
Avoid the extra overhead of variable type checking for i += 1. The average run time is 1.55 ms.
def concat_strings_1(strings):
    res = ""
    for i in range(len(strings)):
        res += strings[i][:3]
    return res
%timeit concat_strings_1(strings)
1.55 ms ± 32.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2.3 for x in strings instead of for range
Avoid the extra overhead of variable type checking for strings[i]. The average run time is 1.32 ms.
def concat_strings_2(strings):
    res = ""
    for x in strings:
        res += x[:3]
    return res
%timeit concat_strings_2(strings)
1.32 ms ± 19.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2.4 Apply the join method to a generator expression
The average run time is 1.06 ms.
def concat_strings_3(strings):
    return "".join(x[:3] for x in strings)
%timeit concat_strings_3(strings)
1.06 ms ± 15.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2.5 Apply the join method to a list comprehension
The average run time is 0.858 ms.
def concat_strings_4(strings):
    return "".join([x[:3] for x in strings])
%timeit concat_strings_4(strings)
858 µs ± 14.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
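Before comparing timings, it is worth confirming that the loop version and the join version return the same string. The sketch below does this on a tiny hand-written list. The names concat_loop and concat_join are illustrative stand-ins for concat_strings_2 and concat_strings_4.

```python
def concat_loop(strings):
    # Repeated += builds a new intermediate string at each step.
    res = ""
    for x in strings:
        res += x[:3]
    return res

def concat_join(strings):
    # join receives all the pieces at once and builds the result in one call.
    return "".join([x[:3] for x in strings])

strings = ["apple", "hi", "banana", "x"]
print(concat_loop(strings))  # apphibanx
print(concat_join(strings))  # apphibanx
```

Strings shorter than 3 characters are not a problem, because slicing past the end of a string simply returns what is there.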
3. Filter odd numbers
Given a list, return a new list containing only its odd numbers. The final version is about 3.6 times faster than the first.
First create a list of length 10000.
arr = list(range(10000))
3.1 The most conventional approach
Create an empty list res, traverse the list with a while loop, and append the odd numbers to res. The average run time is 1.03 ms.
def filter_odd_0(arr):
    res = []
    i = 0
    n = len(arr)
    while i < n:
        if arr[i] % 2:
            res.append(arr[i])
        i += 1
    return res
%timeit filter_odd_0(arr)
1.03 ms ± 34.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
3.2 for range instead of the while loop
Avoid the extra overhead of variable type checking for i += 1. The average run time is 0.965 ms.
def filter_odd_1(arr):
    res = []
    for i in range(len(arr)):
        if arr[i] % 2:
            res.append(arr[i])
    return res
%timeit filter_odd_1(arr)
965 µs ± 4.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
3.3 for x in arr instead of for range
Avoid the extra overhead of variable type checking for arr[i]. The average run time is 0.430 ms.
def filter_odd_2(arr):
    res = []
    for x in arr:
        if x % 2:
            res.append(x)
    return res
%timeit filter_odd_2(arr)
430 µs ± 9.25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
3.4 Apply list to the filter function
The average run time is 0.763 ms. Note that filter with a lambda is slow here (measured on Python 3.6), most likely because the lambda is called once for every element.
def filter_odd_3(arr):
    return list(filter(lambda x: x % 2, arr))
%timeit filter_odd_3(arr)
763 µs ± 15.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
3.5 Apply list to a generator expression
The average run time is 0.398 ms.
def filter_odd_4(arr):
    return list((x for x in arr if x % 2))
%timeit filter_odd_4(arr)
398 µs ± 16.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
3.6 Conditional list comprehensions
The average run time is 0.290 ms.
def filter_odd_5(arr):
    return [x for x in arr if x % 2]
%timeit filter_odd_5(arr)
290 µs ± 5.54 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
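All of the versions above rely on the truthiness of x % 2: the remainder is 1 (truthy) for odd integers and 0 (falsy) for even ones. The sketch below checks this on a small made-up list that includes negative numbers, where Python's % still returns a non-negative remainder for a positive divisor.

```python
arr = [-3, -2, 0, 1, 4, 7]

# x % 2 is 1 for odd integers (including negative ones) and 0 for even ones.
odds_comprehension = [x for x in arr if x % 2]
odds_filter = list(filter(lambda x: x % 2, arr))

print(odds_comprehension)  # [-3, 1, 7]
print(odds_filter)         # [-3, 1, 7]
```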
4. Adding two arrays
Given two lists of the same length, add the numbers at corresponding positions and return a list of the same length as the inputs. The final version is about 2.7 times faster than the first.
First create two lists of length 10000.
arr1 = list(range(10000))
arr2 = list(range(10000))
4.1 The most conventional approach
Create an empty list res, traverse the lists with a while loop, and append the sum of the corresponding elements of the two lists to res. The average run time is 1.23 ms.
def arr_sum_0(arr1, arr2):
    i = 0
    n = len(arr1)
    res = []
    while i < n:
        res.append(arr1[i] + arr2[i])
        i += 1
    return res
%timeit arr_sum_0(arr1, arr2)
1.23 ms ± 3.77 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
4.2 for range instead of the while loop
Avoid the extra overhead of variable type checking for i += 1. The average run time is 0.997 ms.
def arr_sum_1(arr1, arr2):
    res = []
    for i in range(len(arr1)):
        res.append(arr1[i] + arr2[i])
    return res
%timeit arr_sum_1(arr1, arr2)
997 µs ± 7.42 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
4.3 for i, x in enumerate instead of for range
Partially avoid the extra overhead of variable type checking for arr[i]. The average run time is 0.799 ms.
def arr_sum_2(arr1, arr2):
    res = arr1.copy()
    for i, x in enumerate(arr2):
        res[i] += x
    return res
%timeit arr_sum_2(arr1, arr2)
799 µs ± 16.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
4.4 for x, y in zip instead of for range
Avoid the extra overhead of variable type checking for arr[i]. The average run time is 0.769 ms.
def arr_sum_3(arr1, arr2):
    res = []
    for x, y in zip(arr1, arr2):
        res.append(x + y)
    return res
%timeit arr_sum_3(arr1, arr2)
769 µs ± 12.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
4.5 Apply a list comprehension to zip
The average run time is 0.462 ms.
def arr_sum_4(arr1, arr2):
    return [x + y for x, y in zip(arr1, arr2)]
%timeit arr_sum_4(arr1, arr2)
462 µs ± 3.43 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
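The problem statement assumes two lists of the same length. As a small sketch of what the zip-based version does when that assumption is violated: zip stops at the shorter input, so the extra elements are silently dropped rather than raising an error. The function name arr_sum_zip and the sample lists are illustrative only.

```python
def arr_sum_zip(arr1, arr2):
    # Same body as arr_sum_4: pair up elements and add them.
    return [x + y for x, y in zip(arr1, arr2)]

same_length = arr_sum_zip([1, 2, 3], [10, 20, 30])
different_length = arr_sum_zip([1, 2, 3], [10, 20])

print(same_length)       # [11, 22, 33]
print(different_length)  # [11, 22]
```

The index-based versions behave differently in that case: when arr2 is the shorter list, arr1[i] + arr2[i] raises an IndexError instead.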
5. Number of identical elements in two lists
Given two lists, count the elements they have in common. The elements within each list are unique. The final version is about 5000 times faster than the first.
First create two lists and shuffle their elements.
from random import shuffle
arr1 = list(range(2000))
shuffle(arr1)
arr2 = list(range(1000, 3000))
shuffle(arr2)
5.1 The most conventional approach
Nested while loops check whether arr1[i] is equal to arr2[j]. The average run time is 338 ms.
def n_common_0(arr1, arr2):
    res = 0
    i = 0
    m = len(arr1)
    n = len(arr2)
    while i < m:
        j = 0
        while j < n:
            if arr1[i] == arr2[j]:
                res += 1
            j += 1
        i += 1
    return res
%timeit n_common_0(arr1, arr2)
338 ms ± 7.81 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
5.2 for range instead of the while loop
Avoid the extra overhead of variable type checking for i += 1. The average run time is 233 ms.
def n_common_1(arr1, arr2):
    res = 0
    for i in range(len(arr1)):
        for j in range(len(arr2)):
            if arr1[i] == arr2[j]:
                res += 1
    return res
%timeit n_common_1(arr1, arr2)
233 ms ± 10.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
5.3 for x in arr instead of for range
Avoid the extra overhead of variable type checking for arr[i]. The average run time is 84.8 ms.
def n_common_2(arr1, arr2):
    res = 0
    for x in arr1:
        for y in arr2:
            if x == y:
                res += 1
    return res
%timeit n_common_2(arr1, arr2)
84.8 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
5.4 Use if x in arr2 instead of the inner loop
The in operator still scans arr2 linearly, but the scan no longer runs as a Python-level loop. The average run time is 24.9 ms.
def n_common_3(arr1, arr2):
    res = 0
    for x in arr1:
        if x in arr2:
            res += 1
    return res
%timeit n_common_3(arr1, arr2)
24.9 ms ± 1.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
5.5 Use a faster algorithm
Sort both lists with the .sort method, then traverse them in a single loop. This reduces the time complexity from O(n^2) to O(n log n). The average run time is 0.329 ms. Note that .sort sorts the lists in place, so every %timeit call after the first operates on lists that are already sorted.
def n_common_4(arr1, arr2):
    arr1.sort()
    arr2.sort()
    res = i = j = 0
    m, n = len(arr1), len(arr2)
    while i < m and j < n:
        if arr1[i] == arr2[j]:
            res += 1
            i += 1
            j += 1
        elif arr1[i] > arr2[j]:
            j += 1
        else:
            i += 1
    return res
%timeit n_common_4(arr1, arr2)
329 µs ± 12.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
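Because list.sort works in place, n_common_4 reorders the caller's lists as a side effect. If that is unwanted, one possible variant uses sorted, which returns new lists and leaves the inputs alone. The name n_common_sorted is hypothetical; this variant is not part of the original benchmarks and was not timed.

```python
def n_common_sorted(arr1, arr2):
    # sorted() returns new lists, so the inputs are left untouched.
    a, b = sorted(arr1), sorted(arr2)
    res = i = j = 0
    m, n = len(a), len(b)
    # Walk both sorted lists in step, advancing whichever side is smaller.
    while i < m and j < n:
        if a[i] == b[j]:
            res += 1
            i += 1
            j += 1
        elif a[i] > b[j]:
            j += 1
        else:
            i += 1
    return res

arr1 = [5, 1, 4, 2]
arr2 = [2, 6, 5, 3]
print(n_common_sorted(arr1, arr2))  # 2
print(arr1)                         # [5, 1, 4, 2], order unchanged
```

The trade-off is the extra memory and copying needed for the two sorted copies.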
5.6 Use a better data structure
Convert the lists to sets and take the size of their intersection. The average run time is 0.067 ms.
def n_common_5(arr1, arr2):
    return len(set(arr1) & set(arr2))
%timeit n_common_5(arr1, arr2)
67.2 µs ± 755 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
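The set version relies on the stated assumption that the elements within each list are unique. The sketch below, on small made-up lists, shows that it agrees with the nested-loop version when the assumption holds and diverges when a list contains duplicates: the loops count matching pairs, while the set intersection counts distinct common values.

```python
def n_common_2(arr1, arr2):
    # Nested loops: counts every (x, y) pair with x == y.
    res = 0
    for x in arr1:
        for y in arr2:
            if x == y:
                res += 1
    return res

def n_common_5(arr1, arr2):
    # Set intersection: counts distinct values present in both lists.
    return len(set(arr1) & set(arr2))

# Unique elements in each list: both versions agree.
print(n_common_2([1, 2, 3, 4], [3, 4, 5]))  # 2
print(n_common_5([1, 2, 3, 4], [3, 4, 5]))  # 2

# Duplicates in the first list: the results differ.
print(n_common_2([1, 1, 2], [1, 3]))  # 2
print(n_common_5([1, 1, 2], [1, 3]))  # 1
```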