Avoid using the "+" operator to concatenate strings in Python

Translated from Do Not Use “+” to Join Strings in Python

  When I started using Python, I often used + to concatenate strings because it was very intuitive, just as it is in many other programming languages such as Java.
  However, I soon discovered that many developers prefer .join() over +. This article explains the difference between the two and why you should avoid +.

Getting started

  As a beginner, or as a developer who has switched to Python from another language, it is easy to write code like this:

str1 = "I love "
str2 = "python."
print(str1 + str2)
# I love python.

  As you use Python more, you may notice that some people prefer join():

str1 = "I love "
str2 = "python."
print(''.join([str1, str2]))
# I love python.

  To be honest, when I first saw this style, I wondered why such an unintuitive and seemingly abstract approach was needed.

Concatenate multiple strings

  Then one day I needed to concatenate multiple strings stored in a list.

strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']

  At the beginning, I wrote the following code:

strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']

def join_strs(strs):
    result = ''
    for s in strs:
        result += ' ' + s
    return result[1:]

join_strs(strs)
# 'Life is short, I use Python'

  In this example, I have to use a for loop to concatenate the strings one by one. Inside the loop, each string must be preceded by a space so that the words are separated in the final result. This leaves an unwanted leading space before the first string, which result[1:] strips off. Alternatively, you can check the index and skip the space when index == 0. Either way, you need a for loop.
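  The index-check alternative mentioned above might look like the following sketch. The function name join_strs_with_index is made up for illustration and does not come from the original article.

```python
strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']

def join_strs_with_index(strs):
    # Hypothetical variant of join_strs: skip the separator for the first item
    # instead of stripping a leading space afterwards.
    result = ''
    for index, s in enumerate(strs):
        if index > 0:
            result += ' '
        result += s
    return result

print(join_strs_with_index(strs))
# Life is short, I use Python
```

  It gives the same result as the slicing version, but the loop and the extra bookkeeping are still there.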
  After writing this loop, I remembered having seen the .join() method. Maybe it was time to use it:

strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']

def join_strs_better(strs):
    return ' '.join(strs)

join_strs_better(strs)
# 'Life is short, I use Python'

  How easy! The whole problem is solved with one line of code. The .join() method is called on a string object, the separator, and builds a new string by placing that separator between the strings in the list, so you don't need to worry about an extra space at the beginning.
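  As a small illustration of how the separator behaves, the sketch below joins the same list with different separators. The variable names are chosen here for illustration only.

```python
strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']

# The separator is placed only between items,
# never before the first item or after the last one.
spaced = ' '.join(strs)
dashed = '-'.join(strs)

# join() accepts only strings, so other types must be converted first.
numbers = [1, 2, 3]
csv_line = ','.join(str(n) for n in numbers)

print(spaced)    # Life is short, I use Python
print(dashed)    # Life-is-short,-I-use-Python
print(csv_line)  # 1,2,3
```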
  But you don't really think this is the only reason to avoid +, do you? The next section covers more differences between the two.

The logic behind the join() method

  We can compare the performance of the two approaches by timing them with the %timeit magic in Jupyter Notebook:
[Figure: %timeit results for the two functions]

(This code was run on the translator's notebook with an i7-7700HQ CPU. If you run it on your own computer, the timings will depend on your CPU's performance.)
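
  For readers without Jupyter, a comparable measurement can be sketched with the standard-library timeit module. This is not the original notebook code: the two functions are copied from the earlier examples, and the count of 100,000 calls is an assumption based on the "100k" figure the article gives for its own results.

```python
import timeit

strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']

def join_strs(strs):
    result = ''
    for s in strs:
        result += ' ' + s
    return result[1:]

def join_strs_better(strs):
    return ' '.join(strs)

# Assumed run count: 100,000 calls per function.
runs = 100_000
plus_seconds = timeit.timeit(lambda: join_strs(strs), number=runs)
join_seconds = timeit.timeit(lambda: join_strs_better(strs), number=runs)

print(f"+ loop : {plus_seconds:.4f} s for {runs} calls")
print(f"join() : {join_seconds:.4f} s for {runs} calls")
```

  The absolute numbers, and the ratio between them, will vary with the machine and the Python version.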


  With 100k runs behind each measurement, the results are trustworthy. The join() method is clearly more than 4 times faster than using +.
  Why is this happening?
  The following conceptual diagram shows what happens when + is used:


Using + operator and for-loop to join strings in a list

This shows the steps performed by the for loop and the + operator:

  1. On each pass through the loop, the next string in the list is looked up
  2. The Python interpreter executes the statement result += ' ' + s and allocates memory for the space ' '
  3. The interpreter then finds that the space must be joined with a string, so it allocates memory for that string as well. In the first iteration this string is "Life"
  4. In each iteration, the interpreter therefore allocates memory twice: once for the space and once for the current string
  5. With six strings, that makes 12 memory allocations in total
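
  The steps above can be made visible with a small tracing sketch. It records every temporary string value the loop produces, which is two per iteration. It counts Python-level string values only, not actual memory allocations inside the interpreter, which may differ. The function name join_strs_traced is made up for illustration.

```python
strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']

def join_strs_traced(strs):
    # Same logic as join_strs, but each temporary string is recorded.
    intermediates = []
    result = ''
    for s in strs:
        piece = ' ' + s          # first new string: space plus the word
        intermediates.append(piece)
        result = result + piece  # second new string: the growing result
        intermediates.append(result)
    return result[1:], intermediates

final, intermediates = join_strs_traced(strs)
print(final)               # Life is short, I use Python
print(len(intermediates))  # 12
```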

What happens when the join() method is called?
Using “join()” method to join strings in a list

  1. The interpreter counts the strings in the list (6)
  2. From that count, it knows the separator must be inserted 6-1=5 times
  3. From the previous two steps, the interpreter knows that 11 memory spaces are needed in total, so it requests all 11 in advance, in one go
  4. It then connects the strings in order and returns the result
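
  The "measure first, allocate once, then copy" idea in these steps can be imitated in pure Python. The sketch below is an illustration under simplifying assumptions, not the interpreter's real implementation: it works on UTF-8 bytes so that a bytearray can stand in for the preallocated memory, and the name join_preallocated is hypothetical.

```python
def join_preallocated(sep, strs):
    # Illustrative only: mimics "measure first, allocate once, then copy".
    parts = [s.encode('utf-8') for s in strs]
    sep_bytes = sep.encode('utf-8')
    if not parts:
        return ''

    # Pass 1: total size = all items plus (n - 1) separators.
    total = sum(len(p) for p in parts) + len(sep_bytes) * (len(parts) - 1)
    buffer = bytearray(total)  # single buffer of the final size

    # Pass 2: copy each piece into its slot.
    pos = 0
    for i, part in enumerate(parts):
        if i > 0:
            buffer[pos:pos + len(sep_bytes)] = sep_bytes
            pos += len(sep_bytes)
        buffer[pos:pos + len(part)] = part
        pos += len(part)
    return buffer.decode('utf-8')

strs = ['Life', 'is', 'short,', 'I', 'use', 'Python']
print(join_preallocated(' ', strs))
# Life is short, I use Python
```

  For the six-string list this copies 6 words and 5 separators, the 11 pieces counted in the steps above, into one buffer.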

  The main difference between the two methods is therefore the number of memory allocations, and that is where the performance improvement comes from.
  Consider that with only six strings, the join() method is already four times faster than +. What if we need to concatenate a large number of strings? The gap between the two can be expected to grow.
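
  One way to check how the gap behaves for longer lists is to time both approaches at several sizes. The sketch below is only a starting point: the helper compare, the list sizes, and the run count are all assumptions chosen for illustration, and no particular result is claimed.

```python
import timeit

def join_with_plus(strs):
    result = ''
    for s in strs:
        result += ' ' + s
    return result[1:]

def join_with_join(strs):
    return ' '.join(strs)

def compare(n_items, runs=200):
    # Hypothetical helper: time both approaches on a list of n_items words.
    words = ['word'] * n_items
    plus_time = timeit.timeit(lambda: join_with_plus(words), number=runs)
    join_time = timeit.timeit(lambda: join_with_join(words), number=runs)
    return plus_time, join_time

for n in (6, 600, 6_000):
    plus_time, join_time = compare(n)
    print(f"{n:>6} items: + loop {plus_time:.5f} s, join() {join_time:.5f} s")
```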

Summary

  In this short article, I compared two ways of concatenating strings in Python: + and join(). The latter clearly performs much better than the former.
  Learning a programming language usually involves a long learning curve, but Python makes it easier for beginners, which is undoubtedly great. Once we have gotten started, though, we should not settle for the way we first learned to write Python. The difference between masters and ordinary developers often comes down to an understanding of the details.
  Let us keep discovering more details of Python programming to become more proficient in Python!

Origin blog.csdn.net/qq_34769162/article/details/108902354