Python built-in string methods

1. str.split()

Return a list of the words in the string, using sep as the delimiter string.
sep
The delimiter according which to split the string.
None (the default value) means split according to any whitespace,
and discard empty strings from the result.
maxsplit
Maximum number of splits to do.
-1 (the default value) means no limit.

1.)str.split(sep=None, maxsplit=-1)
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).
2.)If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, '1,,2'.split(',') returns ['1', '', '2']). The sep argument may consist of multiple characters (for example, '1<>2<>3'.split('<>') returns ['1', '2', '3']). Splitting an empty string with a specified separator returns [''].
3.)If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns [].

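split() splits a string and returns a list, using sep as the delimiter. maxsplit defaults to -1, which means no limit on the number of splits; if it is given, at most maxsplit splits are made.
sep falls into two cases:
(1) sep not given (or None):
a. The string is split on whitespace, and a run of consecutive whitespace counts as a single separator, so no empty strings appear in the result;
b. Newlines, tabs, and other whitespace escape characters are treated the same as spaces: they act as separators and do not appear in the result;
c. Splitting an empty string returns an empty list.
(2) sep given:
a. sep cannot be the empty string (this raises ValueError);
b. Consecutive delimiters are counted separately and are deemed to delimit empty strings, so every additional consecutive delimiter adds one empty string to the result;
c. sep may consist of multiple characters;
d. Splitting an empty string returns a list containing one empty string, [''].
The example below shows both cases; take care to keep them apart when using split():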

# 'a' and 'big' are separated by three spaces; '\r' and '""' by two spaces
a='it is a   big company is \n \a \r  "" '
b=''
print(a.split(maxsplit=5))
print(a.split())
print(a.split(' '))  # a single space
print(a.split('is'))
print(b.split())
print(b.split('is'))
print(len(b.split('is')))
#result
['it', 'is', 'a', 'big', 'company', 'is \n \x07 \r  "" ']
['it', 'is', 'a', 'big', 'company', 'is', '\x07', '""']
['it', 'is', 'a', '', '', 'big', 'company', 'is', '\n', '\x07', '\r', '', '""', '']
['it ', ' a   big company ', ' \n \x07 \r  "" ']
[]
['']
1
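
Two rules from the summary are not exercised by the sample above: the bound that maxsplit places on the result, and the rejection of an empty separator. A minimal sketch follows; the string and variable names are illustrative assumptions, not taken from the original example.

```python
# Illustrative values; the string and names are not from the example above.
text = "a,b,,c"

# With maxsplit=2 at most two splits are made, so at most three items come back.
limited = text.split(",", maxsplit=2)
print(limited)  # ['a', 'b', ',c']

# An explicit separator keeps the empty string between consecutive delimiters.
full = text.split(",")
print(full)  # ['a', 'b', '', 'c']

# An empty separator is rejected with ValueError.
try:
    text.split("")
    empty_sep_rejected = False
except ValueError:
    empty_sep_rejected = True
print(empty_sep_rejected)  # True
```

The remainder after the last permitted split is returned unchanged, which is why the third item still contains a comma.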

2. str.splitlines()

str.splitlines([keepends])
Return a list of the lines in the string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true.
This method splits on the following line boundaries. In particular, the boundaries are a superset of universal newlines.

Representation	Description
\n	Line Feed
\r	Carriage Return
\r\n	Carriage Return + Line Feed
\v or \x0b	Line Tabulation
\f or \x0c	Form Feed
\x1c	File Separator
\x1d	Group Separator
\x1e	Record Separator
\x85	Next Line (C1 Control Code)
\u2028	Line Separator
\u2029	Paragraph Separator

Unlike split() when a delimiter string sep is given, this method returns an empty list for the empty string, and a terminal line break does not result in an extra line.
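
As a quick check of the table above, the sketch below places each listed boundary between two placeholder words and splits the result. The words "left" and "right" and the variable names are assumptions made for illustration.

```python
# Each boundary from the table, placed between two illustrative words.
boundaries = ["\n", "\r", "\r\n", "\v", "\f", "\x1c", "\x1d", "\x1e",
              "\x85", "\u2028", "\u2029"]

# Map the repr of each boundary to the result of splitting around it.
results = {repr(b): f"left{b}right".splitlines() for b in boundaries}
for name, parts in results.items():
    print(name, parts)  # every boundary yields ['left', 'right']

# split("\n") only recognises the line feed, so other boundaries stay in the text.
vt_with_split = "left\x0bright".split("\n")
print(vt_with_split)  # ['left\x0bright']
```

Note that "\r\n" is treated as a single boundary, so it produces two items rather than three.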

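splitlines() splits the string at line boundaries and returns a list. The line breaks are not included in the result unless keepends is set to True.
A line break at the end of the string does not produce an extra line (the list does not end with an empty string), and calling the method on an empty string returns an empty list. Both points differ from split() with an explicit separator.
Example: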

a='\nit is a\r\n   big\n\n company is \n'
b='   \n '
c=''
print(a.splitlines())
print(a.splitlines(keepends=True))
print(b.splitlines())
print(c.splitlines())
print(a.split('\n'))
print(b.split('\n'))
print(c.split())
#result
['', 'it is a', '   big', '', ' company is ']
['\n', 'it is a\r\n', '   big\n', '\n', ' company is \n']
['   ', ' ']
[]
['', 'it is a\r', '   big', '', ' company is ', '']
['   ', ' ']
[]
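
The two differences from split() with an explicit separator can be isolated in a shorter side-by-side sketch. The strings and names below are illustrative assumptions.

```python
# Illustrative inputs: an empty string and a string ending in a line break.
empty = ""
trailing = "one\ntwo\n"

print(empty.splitlines())     # []
print(empty.split("\n"))      # ['']
print(trailing.splitlines())  # ['one', 'two']
print(trailing.split("\n"))   # ['one', 'two', '']
```

When reading text line by line, splitlines() therefore avoids the trailing empty item that split("\n") would leave behind.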

3. str.strip()

str.strip([chars])
1)Return a copy of the string with the leading and trailing characters removed. The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped.
2)The outermost leading and trailing chars argument values are stripped from the string. Characters are removed from the leading end until reaching a string character that is not contained in the set of characters in chars. A similar action takes place on the trailing end.

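strip() removes the specified characters from both ends of a string and returns the processed copy. Two points need particular attention:
1. chars may contain several characters. Matching proceeds one character at a time from each end: a character is removed as long as it appears in chars, and matching stops at the first character that does not.
2. Without an argument, strip() removes whitespace from both ends (not just spaces; \n, \t, and so on are removed too). With an argument, only the specified characters are removed. This resembles how split() treats whitespace when called without sep.
Example: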

>>> 'www.example.com'.strip('cmowz.')
'example'
>>> a='   \nit is a\r\n   big\n\n company is \n   '
>>> a.strip()
'it is a\r\n   big\n\n company is'
>>> a.strip(' ') # strip spaces only
'\nit is a\r\n   big\n\n company is \n'
>>> a.strip('\n is')
't is a\r\n   big\n\n company'
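
To make the character-set behaviour concrete, here is a rough re-implementation of the procedure described above. The function strip_chars is a hypothetical helper written for illustration; it is not part of Python and does not claim to mirror how CPython implements str.strip().

```python
def strip_chars(text, chars):
    """Hypothetical re-implementation of str.strip(chars), for illustration only."""
    allowed = set(chars)  # chars is a set of characters, not a prefix or suffix
    start, end = 0, len(text)
    # Advance from the left while the character belongs to the set.
    while start < end and text[start] in allowed:
        start += 1
    # Retreat from the right under the same rule.
    while end > start and text[end - 1] in allowed:
        end -= 1
    return text[start:end]


sample = "www.example.com"
print(strip_chars(sample, "cmowz."))                            # example
print(strip_chars(sample, "cmowz.") == sample.strip("cmowz."))  # True
# Order and repetition inside chars do not matter.
print(sample.strip(".wcom") == sample.strip("cmowz."))          # True
```

Because chars is a set, strip() is not suited to removing a fixed prefix or suffix: any mix of the listed characters at either end is removed.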