samtools faidx

$ samtools faidx t1.fa && echo "faidx built"

$ cat t1.fa.fai
scaffold332     2588    13      100     101
scaffold322     8291    2640    100     101
scaffold342     24194   11027   100     101
scaffold191     43246   35476   100     101
scaffold1157    21100   79169   100     101
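
As a quick sanity check on this listing, the third column of each row can be recomputed from the row above it. The sketch below does this for the first two rows. It assumes that every sequence line, including the last, ends with a single "\n", and that the header line of the second record is exactly ">scaffold322". Neither assumption is stated in the index itself.

```shell
#!/bin/sh
# Values copied from the first row of t1.fa.fai above.
length=2588      # column 2: bases in scaffold332
offset=13        # column 3: byte offset of its first base
linebases=100    # column 4: bases per full line
linewidth=101    # column 5: bytes per full line, terminator included

# Number of sequence lines, rounding up for the short final line.
lines=$(( (length + linebases - 1) / linebases ))

# Bytes taken by the sequence: the bases plus one terminator per line
# (assumption: the final line is newline-terminated too).
seq_bytes=$(( length + lines * (linewidth - linebases) ))

# Assumed header of the next record; its line is the text plus "\n".
next_header=">scaffold322"
next_offset=$(( offset + seq_bytes + ${#next_header} + 1 ))

# Compare with column 3 of the second row in the listing.
echo "$next_offset"
```

If the printed value differs from the third column of the next row, one of the assumptions (line endings, header text) probably does not hold for the file.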

$ samtools faidx t1.fa scaffold332 > scaffold332.fa

$ cat scaffold332.fa | head -4
>scaffold332
TTCTGTGAGATCTCTCTGAAAAATAATTGAGAAATCAAGATATTTCAAGCTTTCAGTAAA
AAGGTGAGGCGGAGAATGGAAAAGTGAAAAATTCAGAAGGAACTTGTTCCTAGATTACAG
AGCAGTTTTAAAAATGAGGTAGACATCGGATAAGAAAACAGACCTCAGAAATGCCTAGGA

$ cat scaffold332.fa | tail -4
CATTTGAGAGTAATTTCTAATACATGCAAGCCTTTGAACAGATGCTACATAAGACAGTCA
GAAGCAATTTCTTAAAAAAAATAAAACAAGCACCCCCCAAACCCCAAAGCACCCACTGAG
ACCTCAGTACGGCACAATGCTTAAGCATCTGCTCGAGCTTAGTTTCAGTACTTGTTAGGT
CACACTGA

 

The five tab-separated columns of t1.fa.fai are:

The first column, NAME: the name of the sequence, that is, the header line with the leading ">" removed, up to the first whitespace character.

The second column, LENGTH: the total length of the sequence in bases (bp).

The third column, OFFSET: the byte offset of the first base of the sequence within the file, counting from 0. Line breaks count toward the offset.

The fourth column, LINEBASES: the number of bases on each sequence line (bp). The last line of a sequence may be shorter.

The fifth column, LINEWIDTH: the number of bytes on each sequence line, including the line break. Again, the last line may be shorter. With Unix "\n" line endings this is LINEBASES + 1, as in the example above (100 and 101); with Windows "\r\n" line endings it is LINEBASES + 2.
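
Taken together, OFFSET, LINEBASES and LINEWIDTH let a reader jump straight to any base without scanning the file. A minimal sketch of that arithmetic follows. It builds a tiny hypothetical FASTA (demo.fa, sequence name "s1", 4 bases per line) so the numbers can be checked by hand. It illustrates the idea only and is not a description of how samtools is implemented.

```shell
#!/bin/sh
# Hypothetical 10-base record wrapped at 4 bases per line, Unix line endings.
printf '>s1\nACGT\nTTGA\nCC\n' > demo.fa

# The index row for this file would be: s1  10  4  4  5
offset=4; linebases=4; linewidth=5

# 1-based position of the base to fetch.
pos=7

# Skip whole lines first (each costs linewidth bytes),
# then move to the column inside the target line.
zero=$(( pos - 1 ))
byte=$(( offset + (zero / linebases) * linewidth + (zero % linebases) ))

# Read exactly one byte at that 0-based file position.
base=$(dd if=demo.fa bs=1 skip="$byte" count=1 2>/dev/null)
echo "$base"
```

The sequence here is ACGTTTGACC, so position 7 is G, found at byte 11 of the file. With "\r\n" line endings OFFSET and LINEWIDTH would become 5 and 6 for this file, and the same formula would still apply.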

Origin www.cnblogs.com/yuanjingnan/p/11230665.html