Matching text data

For example:
Given two text files, determine whether each protein listed in text a also appears in text b:
Text a:


[Image: text a]

Text b:


[Image: text b]

Python script

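# proteins participating in cell cycle
list_a = []
with open("cell_cycle_protein.txt") as file_a:  # the file is closed automatically
    for line in file_a:
        list_a.append(line.strip())  # drop the trailing newline
print(list_a)

# proteins expressed in a given cancer cell
list_b = []
with open("cancer_cycle_proteins.txt") as file_b:
    for line in file_b:
        list_b.append(line.strip())
print(list_b)

# for each entry of list_a (header line included), report whether it also occurs in list_b
for protein in list_a:
    if protein in list_b:
        print('detected')
    else:
        print('not observed')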

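Output (the first "detected" comes from the "Uniprot ID" header line, which both files share and which the script compares like any other entry):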

F:\文件处理\venv\Scripts\python.exe F:/文件处理/Uniprot_allign.py
['Uniprot ID', 'p62258', 'p61981', 'p92191', 'p17924', 'p45353', 'p35998', 'p62333', 'p99460', 'o75232']
['Uniprot ID', 'p62258', 'p61981', 'p92191', 'p17980', 'p43686', 'p35998', 'p62333', 'p99460', 'o75832']
detected
detected
detected
detected
not observed
not observed
detected
detected
detected
not observed

Process finished with exit code 0
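
The output above lists only "detected" / "not observed", so each line has to be matched back to its protein by position. As a minimal sketch of one possible variation (not the original author's script), the version below prints the ID next to its status, skips the header line, and looks IDs up in a set instead of a list. The function names `read_ids` and `match_ids` and the `header` parameter are hypothetical, introduced only for this illustration; the demo uses the IDs shown in the output above rather than reading the files.

```python
def read_ids(path, header="Uniprot ID"):
    """Read one ID per line, skipping blank lines and the header line.

    `header` is an assumption based on the first entry of the printed lists.
    """
    with open(path) as handle:
        stripped = [line.strip() for line in handle]
    return [entry for entry in stripped if entry and entry != header]


def match_ids(ids_a, ids_b):
    """Pair every ID in ids_a with 'detected' or 'not observed'."""
    known = set(ids_b)  # set lookup avoids rescanning list_b for every ID
    return [
        (protein, "detected" if protein in known else "not observed")
        for protein in ids_a
    ]


# Demo with the IDs from the printed lists (header already removed).
list_a = ["p62258", "p61981", "p92191", "p17924", "p45353",
          "p35998", "p62333", "p99460", "o75232"]
list_b = ["p62258", "p61981", "p92191", "p17980", "p43686",
          "p35998", "p62333", "p99460", "o75832"]

for protein, status in match_ids(list_a, list_b):
    print(protein, status)
```

With the real files, `match_ids(read_ids("cell_cycle_protein.txt"), read_ids("cancer_cycle_proteins.txt"))` would give the same pairs, assuming the files contain one ID per line as the printed lists suggest.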


Reposted from blog.csdn.net/weixin_34277853/article/details/86783170