Chapter 8: Strings in Python

1. The string interning (resident) mechanism

1.1 Strings

A string is a basic data type in Python: an immutable sequence of characters.

1.2 What is the string interning mechanism

  • String interning (the resident mechanism) keeps only one copy of each identical, immutable string; distinct values are stored in the string intern pool. When an identical string is created later, Python does not allocate new memory but assigns the address of the existing string to the new variable.


code demo

"""
    String interning
"""
a = 'Python'
b = "Python"
c = '''Python'''
print(a, id(a))  # all three memory addresses are the same
print(b, id(b))
print(c, id(c))

1.3 When interning applies (interactive mode)

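The cases below describe CPython behavior and are implementation details:

  • Strings of length 0 or 1
  • Strings that look like identifiers (letters, digits, and underscores only)
  • Strings are interned only at compile time, not at runtime
  • Integers in the range [-5, 256] (small-integer caching, a related mechanism)
  • The intern function in sys forces two equal strings to refer to the same object
  • PyCharm optimizes strings: a script is compiled as a whole file, so equal literals that would be separate objects in interactive mode may share one object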
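
The compile-time versus runtime distinction can be sketched as follows. This is a minimal illustration assuming CPython; the names parts, a, and b are made up for the example. Strings built at runtime are normally separate objects, and sys.intern maps both to one shared copy.

```python
import sys

# Build two equal strings at runtime so the compiler cannot merge them.
parts = ['abc', '%']
a = ''.join(parts)
b = ''.join(parts)

print(a == b)  # True: the values are equal
print(a is b)  # usually False in CPython: two separate objects

# sys.intern() returns the canonical copy from the intern pool.
a = sys.intern(a)
b = sys.intern(b)
print(a is b)  # True: both names now refer to the same object
```

Interning mainly pays off when many equal strings are compared or used as dictionary keys, because an identity check is cheaper than a character-by-character comparison.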

2. Common string operations

2.1 String query operation

Common methods

Method name   Effect
index()       Returns the position of the first occurrence of the substring; raises ValueError if the substring does not exist
rindex()      Returns the position of the last occurrence of the substring; raises ValueError if the substring does not exist
find()        Returns the position of the first occurrence of the substring; returns -1 if the substring does not exist
rfind()       Returns the position of the last occurrence of the substring; returns -1 if the substring does not exist

code demo

s = 'hello,hello'

print('1.', s.index('lo'))
print('2.', s.find('lo'))
print('3.', s.rindex('lo'))
print('4.', s.rfind('lo'))

# print('5.', s.index('lo0'))  # raises ValueError
print('6.', s.find('lo0'))
print('7.', s.rfind('lo0'))
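
The two ways of signalling "not found" lead to different calling code. The sketch below reuses the string from the demo above; the variable pos is introduced only for illustration.

```python
s = 'hello,hello'

# find() signals "not found" with -1, so check the result before using it.
pos = s.find('lo0')
if pos == -1:
    print('find: substring not found')

# index() signals "not found" with an exception.
try:
    s.index('lo0')
except ValueError as exc:
    print('index raised ValueError:', exc)

print(s.index('lo'), s.rindex('lo'))  # 3 9
```

find() is convenient when a missing substring is an expected case; index() suits code where a missing substring is an error.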

2.2 Case conversion of strings

Each of these methods returns a new string object.

Even when the converted string has the same content as the original, its id is usually different.

Method name    Effect
upper()        Converts all characters in the string to uppercase
lower()        Converts all characters in the string to lowercase
swapcase()     Converts all uppercase letters to lowercase and all lowercase letters to uppercase
capitalize()   Converts the first character to uppercase and the rest to lowercase
title()        Converts the first character of each word to uppercase and the remaining letters of each word to lowercase

code demo

s = 'hello,python'
print('0.', s, id(s))

a = s.upper()
print('1.', a, id(a))

b = s.lower()
print('2.', b, id(b))

s2 = 'hello,Python'
c = s2.swapcase()
print('3.', c)

d = s2.title()
print('4.', d)
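
For reference, here is what each method in the table returns for one input. This is a small sketch using the same sample string as the demo; it also shows capitalize(), which the demo does not call.

```python
s = 'hello,Python'

print(s.upper())       # HELLO,PYTHON
print(s.lower())       # hello,python
print(s.swapcase())    # HELLO,pYTHON
print(s.capitalize())  # Hello,python
print(s.title())       # Hello,Python

# The original string is never modified.
print(s)               # hello,Python
```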

2.3 String alignment operations

Method name   Effect
center()      Center alignment. The first parameter specifies the width; the second parameter specifies the fill character (optional, the default is a space). If the width is less than or equal to the length of the string, the original string is returned.
ljust()       Left alignment. The first parameter specifies the width; the second parameter specifies the fill character (optional, the default is a space). If the width is less than or equal to the length of the string, the original string is returned.
rjust()       Right alignment. The first parameter specifies the width; the second parameter specifies the fill character (optional, the default is a space). If the width is less than or equal to the length of the string, the original string is returned.
zfill()       Right alignment, filling the left side with 0. This method accepts only one parameter, which specifies the width of the string. If the width is less than or equal to the length of the string, the original string is returned.

code demo

s = 'hello,Python'
print('original:', s)
print('centered:', s.center(20, '*'))
print('centered:', s.center(10, '*'))
print('left-aligned:', s.ljust(20, '*'))
print('right-aligned:', s.rjust(20, '*'))
print('zero-filled:', s.zfill(20))
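
The expected results of the alignment methods are sketched below with the same 12-character string. The last two lines add one detail not covered above: zfill() keeps a leading sign in front of the padding zeros.

```python
s = 'hello,Python'            # 12 characters

print(s.center(20, '*'))      # ****hello,Python****
print(s.ljust(20, '*'))       # hello,Python********
print(s.rjust(20, '*'))       # ********hello,Python
print(s.center(10, '*'))      # hello,Python (width too small, unchanged)

# zfill() pads on the left with zeros and keeps a leading sign first.
print('42'.zfill(6))          # 000042
print('-42'.zfill(6))         # -00042
```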

2.4 String splitting operations

Method name   Effect
split()       Splits the string starting from the left and returns a list. By default it splits on whitespace; the sep parameter specifies a different separator, and the maxsplit parameter specifies the maximum number of splits. Once the maximum number of splits is reached, the remaining substring is kept as a single element.
rsplit()      Splits the string starting from the right and returns a list. By default it splits on whitespace; the sep parameter specifies a different separator, and the maxsplit parameter specifies the maximum number of splits. Once the maximum number of splits is reached, the remaining substring is kept as a single element.

code demo

s = 'hello world Python'
lst = s.split()
print(lst)
s1 = 'hello|world|Python'
print(s1.split(sep='|'))
print(s1.split(sep='|', maxsplit=1))
print('-------------------------------')
'''rsplit() splits starting from the right'''
print(s.rsplit())
print(s1.rsplit('|'))
print(s1.rsplit(sep='|', maxsplit=1))
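
The direction only matters once maxsplit limits the number of splits. The sketch below shows that, and also how the default whitespace splitting differs from passing a single space as sep; the variable spaced is introduced only for this example.

```python
s1 = 'hello|world|Python'

# With maxsplit=1, the unsplit remainder stays together as one element.
print(s1.split('|', maxsplit=1))    # ['hello', 'world|Python']
print(s1.rsplit('|', maxsplit=1))   # ['hello|world', 'Python']

# Default splitting treats a run of whitespace as one separator;
# an explicit ' ' separator produces empty strings between the spaces.
spaced = 'hello   world'
print(spaced.split())      # ['hello', 'world']
print(spaced.split(' '))   # ['hello', '', '', 'world']
```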

2.5 String checking operations

Method name      Effect
isidentifier()   Determines whether the string is a legal identifier
isspace()        Determines whether the string consists entirely of whitespace characters (such as carriage return, line feed, horizontal tab)
isalpha()        Determines whether the string consists entirely of letters
isdecimal()      Determines whether the string consists entirely of decimal digits
isnumeric()      Determines whether the string consists entirely of numeric characters
isalnum()        Determines whether the string consists entirely of letters and numbers

code demo

s = 'abc%'
s1 = 'hellopython'
print(s.isidentifier())  # False
print(s1.isidentifier())  # True
print('\t'.isspace())  # True
print('abc'.isalpha())  # True
print('abc1'.isalpha())  # False
print('张三'.isalpha())  # True

print('123'.isdecimal())  # True
print('123四'.isdecimal())  # False

print('123'.isnumeric())  # True
print('123四'.isnumeric())  # True
print('IIIIIIIV'.isnumeric())  # False

print('abc123'.isalnum())  # True
print('123张'.isalnum())  # True
print('123!'.isalnum())  # False
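
A few edge cases of these checks are sketched below. The keyword module is from the standard library and is not mentioned in the original text; it is included here only to show that isidentifier() does not reject reserved words.

```python
import keyword

# isdecimal() accepts only decimal digits; isnumeric() also accepts other
# numeric characters such as fractions and Chinese numerals.
for ch in ['5', '½', '四']:
    print(repr(ch), ch.isdecimal(), ch.isnumeric())

# The checks in this section all return False for the empty string.
print(''.isalpha(), ''.isspace(), ''.isalnum())   # False False False

# isidentifier() accepts reserved words; keyword.iskeyword() detects them.
print('class'.isidentifier())       # True
print(keyword.iskeyword('class'))   # True
```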

2.6 Other common operations on strings

Function             Method name   Effect
String replacement   replace()     The first parameter specifies the substring to be replaced, and the second parameter specifies the replacement string. The method returns the string obtained after replacement; the original string does not change. An optional third parameter specifies the maximum number of replacements.
String joining       join()        Combines the strings in a list or tuple into one string

code demo

s = 'hello,Python'
print(s.replace('Python', 'Java'))
s1 = 'hello,Python,Python,Python'
print(s1.replace('Python', 'Java', 2))

lst = ['hello', 'java', 'Python']
print('|'.join(lst))
print(''.join(lst))

t = ('hello', 'Java', 'Python')
print(''.join(t))

print('*'.join('Python'))
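
Two practical points about these methods are sketched below: replace() leaves the original string untouched, and join() accepts only strings, so other values have to be converted first. The names numbers, joined, and t are illustrative.

```python
s = 'hello,Python'
t = s.replace('Python', 'Java')
print(t)   # hello,Java
print(s)   # hello,Python (the original is unchanged)

# join() raises TypeError for non-string items, so convert them first.
numbers = [1, 2, 3]
joined = '-'.join(str(n) for n in numbers)
print(joined)   # 1-2-3

# A string is itself a sequence of characters, so it can be joined too.
print('*'.join('Python'))   # P*y*t*h*o*n
```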

3. String comparison

  • Operators
>	>=	<	<=	==	!=
  • Comparison rules

    • First compare the first characters of the two strings. If they are equal, continue with the next characters, and so on, until a pair of characters differs. The result of comparing that pair is the result of comparing the two strings; the remaining characters are not compared. If one string runs out first, the shorter string is the smaller one.
  • Comparison principle

    • When two characters are compared, their ordinal values (code points) are compared. The ordinal value of a character can be obtained by calling the built-in function ord.
    • The built-in function chr is the counterpart of ord: given an ordinal value, it returns the corresponding character.
  • Code demo

print('apple' > 'app')  # True
print('apple' > 'banana')  # False: compares 'a' with 'b', i.e. 97 > 98, which is False
print(ord('a'), ord('b'))
print(ord('魏'))

print(chr(97), chr(98))
print(chr(39759))
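
Because the comparison uses ordinal values, every uppercase ASCII letter sorts before every lowercase one. The sketch below shows the effect on sorting; the list words is made up for the example, and passing key=str.lower is one common way to get a case-insensitive order.

```python
words = ['banana', 'apple', 'Cherry']

print(ord('C'), ord('a'))             # 67 97
print('Cherry' < 'apple')             # True, because 67 < 97
print('app' < 'apple')                # True, a prefix is smaller

print(sorted(words))                  # ['Cherry', 'apple', 'banana']
print(sorted(words, key=str.lower))   # ['apple', 'banana', 'Cherry']
```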

'''
    Difference between == and is
    ==  compares whether the values are equal
    is  compares whether the ids are equal
'''
a = b = 'Python'
c = 'Python'
print(a == b)  # True
print(b == c)  # True

print(a is b)  # True
print(a is c)  # True
print(id(a))  # 2204259933168
print(id(b))  # 2204259933168
print(id(c))  # 2204259933168

4. String slicing

Strings are immutable:

Strings do not support adding, deleting, or modifying characters in place

Slicing operations return new string objects instead of changing the original

code demo

s = 'hello,Python'
s1 = s[:5]  # no start position given, so slicing starts at index 0
s2 = s[6:]  # no end position given, so slicing runs to the last character
s3 = '!'
newstr = s1 + s3 + s2

print(s1)
print(s2)
print(newstr)
print('--------------------')
print(id(s))
print(id(s1))
print(id(s2))
print(id(s3))
print(id(newstr))

print('------------------slice [start:end:step]-------------------------')
print(s[1:5:1])  # from index 1 up to 5 (5 not included), step 1
print(s[::2])  # starts at 0 by default, no end given so runs to the last character, step 2: indices are 2 apart
print(s[::-1])  # negative step: starts at the last character and ends at the first
print(s[-6::1])  # from index -6 to the last character, step 1
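
The slices above produce the results shown below, and the same sketch illustrates immutability: assigning to an index fails, so a changed string has to be built from slices. The name capitalized is introduced only for this example.

```python
s = 'hello,Python'

print(s[1:5])    # ello
print(s[::2])    # hloPto
print(s[::-1])   # nohtyP,olleh
print(s[-6:])    # Python

# In-place modification is not allowed on an immutable string.
try:
    s[0] = 'H'
except TypeError as exc:
    print('TypeError:', exc)

# Build a new string from slices instead.
capitalized = 'H' + s[1:]
print(capitalized)   # Hello,Python
print(s)             # hello,Python (unchanged)
```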

5. Formatting strings

Three ways to format strings


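code demo

"""Method 1: % placeholders"""
name = 'Xiaomi'
age = 20
print('My name is %s and I am %d years old' % (name, age))

"""Method 2: {} placeholders with format()"""
print('My name is {0} and I am {1} years old'.format(name, age))

"""Method 3: f-strings"""
print(f'My name is {name} and I am {age} years old')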

Width and precision

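print('%10d' % 99)  # 10 is the field width
print('%f' % 3.1415926)
# keep three decimal places
print('%.3f' % 3.1415926)
# set width and precision together: total width 10, 3 decimal places
print('%10.3f' % 3.1415926)

print('{0}'.format(3.1415926))
print('{0:.3}'.format(3.1415926))  # .3 means three significant digits in total
print('{0:.3f}'.format(3.1415926))  # .3f means three decimal places
# set width and precision together: total width 10, 3 decimal places
print('{0:10.3f}'.format(3.1415926))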
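
The same width and precision specifiers also work inside f-strings. The sketch below compares the three styles on one value; the variable names pi, a, b, and c are illustrative, and repr() is used only to make the padding spaces visible.

```python
pi = 3.1415926

a = '%10.3f' % pi
b = '{0:10.3f}'.format(pi)
c = f'{pi:10.3f}'

print(repr(a))       # '     3.142'
print(a == b == c)   # True: all three styles give the same text

print(f'{pi:.3}')    # 3.14  (three significant digits)
print(f'{pi:.3f}')   # 3.142 (three decimal places)
```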

6. String encoding conversion

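  • Why string encoding conversion is needed

    • In memory a string is Unicode text, but data sent over a network or written to a file must be bytes, so the sender encodes the string into bytes and the receiver decodes the bytes back into a string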


  • How to encode and decode

    • Encoding: convert string to binary data (bytes)
    • Decoding: convert bytes type data into string type
  • code demo

This will be used in the web crawler section.

s = '天涯共此时'
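# Encoding
print(s.encode(encoding='GBK'))  # in GBK, each Chinese character takes 2 bytes
print(s.encode(encoding='UTF-8'))  # in UTF-8, each of these Chinese characters takes 3 bytes

# Decoding (the decoding format must match the encoding format)
# byte holds binary data (a bytes object)
byte = s.encode(encoding='GBK')
print(byte.decode(encoding='GBK'))
# print(byte.decode(encoding='UTF-8'))  # raises UnicodeDecodeError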
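
The byte counts and the decoding failure mentioned in the comments can be checked directly. This is a small sketch using the same sample string; the names gbk_bytes and utf8_bytes are illustrative, and errors='replace' is shown as one option for tolerating undecodable bytes.

```python
s = '天涯共此时'

gbk_bytes = s.encode('GBK')
utf8_bytes = s.encode('UTF-8')
print(len(s), len(gbk_bytes), len(utf8_bytes))   # 5 10 15

# Decoding with the same encoding restores the original string.
print(gbk_bytes.decode('GBK') == s)   # True

# Decoding with a different encoding fails.
try:
    gbk_bytes.decode('UTF-8')
except UnicodeDecodeError as exc:
    print('decode failed:', exc.reason)

# errors='replace' substitutes U+FFFD for undecodable bytes instead of raising.
print(gbk_bytes.decode('UTF-8', errors='replace'))
```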


Origin blog.csdn.net/polaris3012/article/details/130530850