Review Questions Text Processing

Foreword: The exam is coming up, and the teacher has released a set of review questions.
The questions already have the answers filled in, so when reviewing you have to cover the answers with your hand to test yourself.
Can we use the programming knowledge we have learned to remove the correct options from the brackets in these questions? After several hours of non-stop adjustments and many different attempts, I finally succeeded!

Idea behind the first, failed attempt:
1. Convert the imported text into a list.
2. Use a for loop to run a regular-expression match on each element.
3. Reassign each matched element in the list through its index.
4. Write the result to a file.

For reasons I could not identify, the original text was not changed.
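
The cause of that failure was never identified, so the following is only a guess at one common cause of this symptom: assigning to the loop variable rebinds a local name and leaves the list untouched, while assigning through the index changes the list. The sample lines and the `(B)` pattern are made up for illustration.

```python
import re

# Hypothetical sample data, not taken from the real review questions
lines = ["1. Sample question (B)", "A. first option", "B. second option"]

# Rebinding the loop variable only changes the local name, not the list.
for line in lines:
    line = re.sub(r"\(B\)", "( )", line)
unchanged = list(lines)  # snapshot: still contains "(B)"

# Assigning through the index replaces the list element itself.
for index, line in enumerate(lines):
    lines[index] = re.sub(r"\(B\)", "( )", line)
```

After the second loop, `lines[0]` no longer contains the answer, whereas the snapshot taken after the first loop still does.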
So I re-planned and adjusted the approach.

1. Import text

The first thing to work out is how to import the text.
The file I want to process this time is a docx, that is, a Word document.
In my case it was enough to change the file suffix to txt. One thing to watch out for is that the file must be saved with utf-8 encoding, otherwise format problems occur and the text cannot be imported. The phone and the computer behave differently here: on the phone, directly changing the suffix of the Word document to txt gave a file encoded as gbk, while on the computer the same change produced garbled characters, so be careful when doing it on a computer.

When writing the code, pay attention to the file's encoding as well: whatever encoding the file uses, pass that same encoding when opening it.

with open('/storage/emulated/0/AlivcData/log/text.txt', encoding='gbk') as file:
    data = file.read()
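
When the encoding of the file is not known in advance, one option is to try a few candidates in order. The sketch below is an assumption about how that could look, not part of the original script; `read_text` is a hypothetical helper name, and utf-8 is tried first because gbk-encoded Chinese text usually fails to decode as utf-8, while the reverse mistake can silently produce garbled characters.

```python
def read_text(path, encodings=("utf-8", "gbk")):
    """Return the file's text, trying each candidate encoding in order."""
    for encoding in encodings:
        try:
            with open(path, encoding=encoding) as file:
                return file.read()
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError(f"none of {encodings} could decode {path}")
```

With this helper, the import step would become `data = read_text('/storage/emulated/0/AlivcData/log/text.txt')`.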

2. Use regular expressions
The idea in step 2 is to set aside the lines that begin with an option letter (ABCDEF), because what we want to remove are the answer letters inside the brackets, not the option labels.
In the code, first convert the text that was read into a list by splitting it on line breaks.

import re

items = data.split('\n')
string = []
for i in items:
    r = re.search('^[A-E]', i)
    if r is not None:
        # The line starts with an option letter, so nothing is done with it
        pass
        # print(r.group(0))
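
To see which lines the pattern `^[A-E]` actually picks out, here is a small check on made-up sample lines (the questions and option texts are hypothetical). Note that a line starting with `F` is not matched by `[A-E]`; if the questions have six options, widening the class to `[A-F]` would cover it.

```python
import re

OPTION_LINE = re.compile(r"^[A-E]")  # same pattern as in the loop above

# Hypothetical sample lines, not taken from the real review questions
sample = [
    "1. Which gas do plants absorb? (B)",
    "A. Oxygen",
    "B. Carbon dioxide",
    "F. None of the above",
]

option_lines = [line for line in sample if OPTION_LINE.search(line)]
other_lines = [line for line in sample if not OPTION_LINE.search(line)]
```

Here `option_lines` holds the `A.` and `B.` lines, while the question and the `F.` line end up in `other_lines`.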

3. Use the replace() function to delete the correct answers

replace() does not modify the original string; it returns a new one. So we need a new variable to hold the result, and because the call happens inside the for loop, we define an empty list outside the loop to collect the value returned by each replace() call.

string.append(i.replace('A','').replace('B','').replace('C','').replace('D','').replace('E','').replace('F','').replace(',','').replace('V','').replace('x','').replace('×','').replace('√',''))
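
The chained replace() calls remove those characters everywhere in the line, including any capital A to F that happens to appear in the question text itself. As a possible alternative, and only a sketch under assumptions, `re.sub` can be restricted to what sits inside the brackets. The helper name `blank_answers` is hypothetical, and the set of answer characters is assumed to be the same one used in the replace() chain, plus the full-width comma and full-width brackets.

```python
import re

# Matches an answer inside half-width () or full-width () brackets.
# Assumed answer characters: option letters A-F, commas, and the
# true/false marks that appear in the replace() chain.
ANSWER_IN_BRACKETS = re.compile(r"([((])\s*[A-F,,Vx×√]+\s*([))])")

def blank_answers(line):
    """Replace the answer inside the brackets with two spaces, keeping the brackets."""
    return ANSWER_IN_BRACKETS.sub(r"\1  \2", line)
```

For example, `blank_answers("2. DNA is a protein. (×)")` keeps the word "DNA" and only empties the brackets. One limitation of this sketch: a bracket such as the one in `f(x)` contains only answer characters and would be blanked as well.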

Full code:

import re

with open('/storage/emulated/0/AlivcData/log/text.txt', encoding='gbk') as file:
    data = file.read()

items = data.split('\n')
string = []  # collects the processed lines
for i in items:
    r = re.search('^[A-E]', i)
    if r is not None:
        # The line starts with an option letter: it is skipped and not added to the output
        pass
        # print(r.group(0))
    else:
        # Remove answer letters, commas, and true/false marks from the line
        string.append(i.replace('A','').replace('B','').replace('C','').replace('D','').replace('E','').replace('F','').replace(',','').replace('V','').replace('x','').replace('×','').replace('√',''))
for i in string:
    print(i)
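
The full code only prints the result, while the first plan ended with writing to a file. A minimal sketch of that last step could look like this; `write_lines` and the output file name in the usage note are hypothetical, and utf-8 is an assumed choice of output encoding.

```python
def write_lines(lines, path, encoding="utf-8"):
    """Write the processed lines to `path`, separated by line breaks."""
    with open(path, "w", encoding=encoding) as file:
        file.write("\n".join(lines))
```

It would be called after the loop, for example as `write_lines(string, 'text_without_answers.txt')`.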

Summary: Because the code was written on a mobile phone, it took a long time. The exercise tested my string processing, logical thinking, and basic syntax knowledge.

Result of removing the answers from multiple-choice questions:
[Image]
Result of removing the answers from true-or-false questions:
[Image]


Origin blog.csdn.net/qq_17802895/article/details/111315631