For the requirements of this assignment, see https://edu.cnblogs.com/campus/nenu/2019fall/homework/7628
Requirement 0: Use War and Peace as the input file, read in from the file system through redirection. Run the program three consecutive times and give the elapsed time and the CPU parameters for each run. (2 points)
How the program was run:
ptime wf -s < war_and_peace.txt
Elapsed time of the three consecutive runs:
First run: 1.574s
CPU parameters: Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz 2.30 GHz
Second run: 2.164s
CPU parameters: Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz 2.30 GHz
Third run: 1.583s
CPU parameters: Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz 2.30 GHz
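The three timed runs above could also be scripted rather than typed by hand. The following is only a minimal sketch under assumptions: time_runs is a hypothetical helper that is not part of wf, and it measures wall-clock time only, so it does not reproduce everything ptime reports.

```python
import subprocess
import sys
import time


def time_runs(cmd, stdin_path=None, runs=3):
    """Run cmd `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        # Reopen the input file on every run so each run reads it from the start.
        stdin = open(stdin_path, "rb") if stdin_path else None
        try:
            start = time.perf_counter()
            # Discard the program's output; only the elapsed time is of interest.
            subprocess.run(cmd, stdin=stdin, stdout=subprocess.DEVNULL, check=True)
            timings.append(time.perf_counter() - start)
        finally:
            if stdin is not None:
                stdin.close()
    return timings
```

Assuming wf.py is in the current directory, a call such as time_runs([sys.executable, "wf.py", "-s"], "war_and_peace.txt") would mirror the redirected command used here.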
Requirement 1: Give your guess at the program's bottleneck: the place where you think optimization would have the best effect, or where you already optimized last week (or where you took optimization into account and therefore never wrote the worse code).
Guessed bottlenecks:
1. Reading the file in through redirection and converting the uppercase letters to lowercase takes too long.
2. Using a regular expression to pick out the words and counting the word frequencies takes too long.
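Before running a full profiler, guesses like these can be checked roughly by timing each stage separately. The sketch below is an assumption-laden illustration: time_stages is a hypothetical name, and it times the lowercase conversion, the regular expression, and the Counter step on text that has already been read into memory, so the cost of reading the file is not included.

```python
import re
import time
from collections import Counter


def time_stages(text):
    """Return the wall-clock seconds spent in each suspected bottleneck."""
    timings = {}

    # Stage 1: convert uppercase letters to lowercase.
    start = time.perf_counter()
    lowered = text.lower()
    timings["lower"] = time.perf_counter() - start

    # Stage 2: pick out the words with the same regular expression as wf.py.
    start = time.perf_counter()
    words = re.findall(r"[a-z0-9^-]+", lowered)
    timings["findall"] = time.perf_counter() - start

    # Stage 3: count the word frequencies.
    start = time.perf_counter()
    Counter(words)
    timings["Counter"] = time.perf_counter() - start

    return timings
```

Calling time_stages on the full text of the input file gives one number per stage, which is enough to see which guess is closer before looking at the profiler output.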
Requirement 2: Use profiling to find the program's bottlenecks. Give the three functions (or code fragments) that take the most running time. A screenshot is required.
From the command line, change to the program directory and enter the following command:
python -m cProfile -s time wf.py -s < war_and_peace.txt
Analysis results screenshot:
The three most time-consuming functions: findall() 0.297s, Counter() 0.134s, read() 0.086s
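The same kind of ranking can be produced from inside Python with the standard cProfile and pstats modules. This is a sketch only: count_words is a stand-in for the wf.py workload (it reuses the regular expression and Counter from this post) and profile_top is a hypothetical helper; neither is taken from the repository.

```python
import cProfile
import io
import pstats
import re
from collections import Counter


def count_words(text):
    # Stand-in workload: same regular expression and Counter as in this post.
    words = re.findall(r"[a-z0-9^-]+", text.lower())
    return Counter(words)


def profile_top(func, *args, limit=3):
    """Profile func(*args) and return a report limited to `limit` entries."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" sorts by time spent inside each function itself,
    # the same ordering that "-s time" selects on the command line.
    stats.sort_stats("tottime").print_stats(limit)
    return buffer.getvalue()
```

Printing profile_top(count_words, text) for the input text shows the top entries by internal time, which should correspond to the top rows of the command-line report.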
Code before optimization:
def doCountByPurText(inputText):
    words = re.findall(r'[a-z0-9^-]+', inputText.lower())
    collect = collections.Counter(words)
    num = 0
    for i in collect:
        num += 1
    print('total %d words\n' % num)
    result = collect.most_common(10)
    for j in result:
        print('%-8s%5d' % (j[0], j[1]))
Code after optimization:
def doCountByPurText(inputText):
    words = re.findall(r'[a-z0-9^-]+', inputText.lower())
    count(words)
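The body of count is not shown in this post. Purely as an assumption, it might look something like the sketch below, which keeps the output format of the earlier version but replaces the manual counting loop with len(). The top parameter and the return value are additions made here for illustration and are not taken from the repository.

```python
import collections
import re


def count(words, top=10):
    """Hypothetical helper: print the distinct-word total and the top entries."""
    collect = collections.Counter(words)
    # len() gives the number of distinct words directly, with no manual loop.
    print('total %d words\n' % len(collect))
    for word, freq in collect.most_common(top):
        print('%-8s%5d' % (word, freq))
    # Returning the Counter is an addition so that the result can be checked.
    return collect


def doCountByPurText(inputText):
    words = re.findall(r'[a-z0-9^-]+', inputText.lower())
    return count(words)
```

Moving the counting and printing into one helper changes little in the running time of findall() or Counter(); the main gain in this sketch is shorter code and one loop fewer.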
Git link: https://e.coding.net/xulijun/xiaonengfenxi.git