Converting a reads count file to FASTA format (uniq reads)

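Many sequencing datasets downloaded from NCBI come in a "read sequence + count" format, which is what the authors obtain after trimming adapters and filtering out low-quality reads. The following converts this reads count format to FASTA format: each unique read becomes one FASTA record, with its count kept in the header.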

cat reads_count.txt
AAACCCGGGTTT 3
ACAAGATTAG 5
TAGACAGA 1

Python implementation

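# Number the reads from 1 and write each one as a two-line FASTA record.
with open('./reads_count.txt', 'r') as fr, open('./reads.fas', 'w') as fw:
    s = 0
    for line in fr:
        # Each line is "<sequence> <count>"; split() accepts tabs or spaces.
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        s += 1
        seq, count = fields[0], fields[1]
        # Header has the form >ID<index>_<count>, e.g. >ID1_3
        name = '>ID' + str(s) + '_' + count
        fw.write(name + '\n' + seq + '\n')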

Linux implementation

awk
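
One possible awk one-liner for the same conversion, given as a sketch rather than the original author's command. It assumes the input file is named reads_count.txt as above, that the sequence and count are separated by spaces or tabs, and that the file has no blank lines. The printf line only recreates the sample input so that the snippet can be run on its own.

```shell
# Recreate the sample input shown above (tab-separated here; awk's default
# field splitting accepts spaces or tabs).
printf 'AAACCCGGGTTT\t3\nACAAGATTAG\t5\nTAGACAGA\t1\n' > reads_count.txt

# NR is the current line number, $1 the sequence, $2 the count.
# Each input line becomes a header line (>ID<NR>_<count>) plus a sequence line.
awk '{ print ">ID" NR "_" $2; print $1 }' reads_count.txt > reads.fas

cat reads.fas
```

With the sample input, the first record in reads.fas is >ID1_3 followed by AAACCCGGGTTT, matching what the Python version writes.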

Reposted from blog.csdn.net/weixin_40099163/article/details/83413321