Python: counting how often each letter occurs in a string

1. Problem statement

This problem comes from the week 3 programming assignment of the MOOC
"Playing Data with Python" (Nanjing University).


Define a function countchar() that counts how many times each letter occurs in a string and returns the counts in alphabetical order (the input may contain uppercase letters, and counting is case-insensitive). The skeleton looks like this:

def countchar(str):
    ... ...
    return a list
if __name__ == "__main__":
    str = input()
    ... ...
    print(countchar(str))

Input format:
string

Output format:
list

Sample input:
Hello, World!

Sample output:
[0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 3, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]

Time limit: 500 ms, memory limit: 32000 KB


2. Approach

This problem took some real effort.
The idea is to build a dictionary with each letter as a key and its number of occurrences as the value, then have the function return the values for the input string. (There is a pitfall here, explained later.)

3. Reference code

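def countchar(st):  # count how often each letter occurs
    keys = [chr(i+97) for i in range(26)]  # list of the 26 lowercase letters, used as keys
    di = dict.fromkeys(keys, 0)  # give every key an initial value of 0
    new = []  # new list that holds the counts in key order
    st = st.lower()  # convert the whole input to lowercase
    for s in st:  # walk through the string
        di[s] = st.count(s)  # store the count of each character in the dictionary
    for k in keys:  # walk the ordered key list, not the dictionary, so the counts come out a to z
        new.append(di[k])
    return new  # list with the counts of the 26 letters
if __name__ == "__main__":
    st = input()  # read the input string
    str1 = ""  # start with an empty string
    for s in st:  # walk through the input
        if s.isalpha():  # keep letters only; punctuation is ignored
            str1 += s
    print(countchar(str1))  # print the list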

4. Notes on the code

  • At first I could not think of how to generate the 26 letters in code, so I typed them out by hand:

    keys=["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]

    Then I got scolded for it.

  • I originally named the dictionary dict, which shadows Python's built-in dict type (a built-in name, not a reserved word). It raised no error, but I was still criticized mercilessly for it.

  • I could not remember how to give every key an initial value. After looking it up, I found that fromkeys sets the same value for multiple keys at once.
  • Reviewed str.lower(), which converts all characters to lowercase.
  • Counting is done with the string's own method, str.count(sub), where sub is the character or substring to count.
  • A dictionary does not necessarily keep its keys in order. I did not notice this at first, so my initial code was wrong: the output list came out in the wrong order. After asking for advice, I fixed it by adding a loop that appends the values to a new list.
  • To ignore punctuation, you can build a string of the special characters and remove each of them from the input. If you only need to count letters (or only digits), the dedicated isalpha or isdigit methods are simpler.
  • I was also told that calling count inside a loop over the string makes the program very slow.
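
The building blocks mentioned in these notes can be tried in isolation. The snippet below is only a small sketch: the sample string and the variable names are my own, and string.ascii_lowercase is shown as an alternative way to get the 26 letters that the notes do not mention.

```python
import string

# Two ways to produce the 26 lowercase letters without typing them by hand.
keys = [chr(i + 97) for i in range(26)]
assert keys == list(string.ascii_lowercase)

# dict.fromkeys gives every key the same initial value.
di = dict.fromkeys(keys, 0)

text = "Hello, World!"      # illustrative sample input
lowered = text.lower()      # lowercase copy of the input
# Keep letters only; str.isalpha() is False for spaces and punctuation.
letters_only = "".join(ch for ch in lowered if ch.isalpha())

print(di["a"])              # 0
print(lowered.count("l"))   # 3
print(letters_only)         # helloworld
```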

5. Better code

With reference to other blogs, the code was modified as follows:

def countchar(st):
    alist = [0] * 26  # list of 26 zeros, one slot per letter
    st = st.lower()
    for i in st:  # walk through the string; shift each letter's ASCII code into the range 0-25 and add one to that slot
        if "a" <= i <= "z":  # isalpha() would also accept non-ASCII letters, whose index would be out of range
            alist[ord(i)-97] += 1
    return alist

if __name__ == "__main__":  # the main part is the same as in the original code
    st = input()
    str1 = ""
    for s in st:
        if s.isalpha():
            str1 += s
    print(countchar(str1))
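
As a further variation, the same single-pass counting can be handed to the standard library. This is a sketch using collections.Counter, which is a different technique from the list-index approach above; the function name countchar_counter is hypothetical.

```python
from collections import Counter
import string


def countchar_counter(st):
    # Hypothetical helper: Counter tallies every character in one pass.
    counts = Counter(st.lower())
    # A Counter returns 0 for missing keys, so letters that never occur
    # still get a slot; punctuation is simply never looked up.
    return [counts[ch] for ch in string.ascii_lowercase]


print(countchar_counter("Hello, World!"))
# [0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 3, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
```

Because the result is built by walking the alphabet, no separate filtering of punctuation is needed before the call.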

6. Code execution speed

With the string to be counted set to

st = "".join([chr(i+96)*i*100 for i in range(1,27)])

the running times of the code from section 3 and the code from section 5 are as follows.

Code from section 3

[100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600]
time: 1.108361 s

Code from section 5

[100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600]
time: 0.023698 s

Indeed, the code from section 5 is roughly 47 times faster (1.108361 s versus 0.023698 s).
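
The gap comes from how often the string is scanned: str.count walks the whole string on every call, and the slower version calls it once for every character of the input. The sketch below, with hypothetical function names, compares that pattern with calling count only once per distinct letter (26 scans in total). It uses timeit for the measurement; the actual numbers depend on the machine, so none are shown.

```python
import string
import timeit

# Same test string as above: 100 a's, 200 b's, ..., 2600 z's (35,100 characters).
st = "".join(chr(i + 96) * i * 100 for i in range(1, 27))


def count_per_character(st):
    # Slow pattern: one full scan of st for every character in st.
    st = st.lower()
    di = dict.fromkeys(string.ascii_lowercase, 0)
    for s in st:
        di[s] = st.count(s)
    return [di[k] for k in string.ascii_lowercase]


def count_per_letter(st):
    # One full scan per distinct letter: 26 scans regardless of input length.
    st = st.lower()
    return [st.count(k) for k in string.ascii_lowercase]


# Both variants must agree before their timings are compared.
assert count_per_character(st) == count_per_letter(st)
print("per character: %f s" % timeit.timeit(lambda: count_per_character(st), number=1))
print("per letter:    %f s" % timeit.timeit(lambda: count_per_letter(st), number=1))
```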

Test code for the section 3 version:

import time
def countchar(st):
    keys = [chr(i+97) for i in range(26)]
    di = dict.fromkeys(keys, 0)
    new = []
    st = st.lower()
    for s in st:
        di[s] = st.count(s)
    for k in keys:  # walk the ordered key list so the counts come out a to z
        new.append(di[k])
    return new
if __name__ == "__main__":
    st = "".join([chr(i+96)*i*100 for i in range(1,27)])
    str1 = ""
    start = time.perf_counter()  # time.clock() was removed in Python 3.8
    for s in st:
        if s.isalpha():
            str1 += s
    print(countchar(str1))
    end = time.perf_counter()
    print("time: %f s" % (end - start))

7. Archived incorrect code

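The code below is incorrect!
The cause of the error: it does not take into account that a dictionary's internal storage is unordered. (Since Python 3.7, dictionaries are guaranteed to preserve insertion order, so the wrong ordering appears only on older versions.)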

def countchar(st):
    keys = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
    dict = {}.fromkeys(keys,0)
    st = st.lower()
    for s in st:
        dict[s] = st.count(s)
    return list(dict.values())
if __name__ == "__main__":
    st = input()
    str1 = ""
    for s in st:
        if s.isalpha() != 0:
            str1 += s
    print(countchar(str1))
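
One way to avoid relying on the dictionary's iteration order at all is to read the values back through the ordered key list. The following is a minimal sketch of that idea; the name countchar_ordered is hypothetical, and it counts in a single pass instead of calling count.

```python
def countchar_ordered(st):
    keys = [chr(i + 97) for i in range(26)]
    counts = dict.fromkeys(keys, 0)
    for s in st.lower():
        if s in counts:  # skips punctuation, digits and non-ASCII letters
            counts[s] += 1
    # The order of the result comes from keys, not from the dictionary.
    return [counts[k] for k in keys]


print(countchar_ordered("Hello, World!"))
# [0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 3, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
```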

