Pysam can be used to handle BAM files.
Installation:
You can install it with pip (pip install pysam) or with conda.
Usage:
Pysam provides many functions; the main classes for reading files are:
- AlignmentFile: reads BAM / CRAM / SAM files;
- VariantFile: reads variant data (VCF or BCF);
- TabixFile: reads tabix-indexed files;
- FastaFile: reads FASTA sequence files;
- FastqFile: reads FASTQ sequencing read files.
The first two are the most commonly used.
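AlignmentFile takes a file name and a mode string, and the read mode differs for each of the three alignment formats: "r" for SAM, "rb" for BAM and "rc" for CRAM. The sketch below picks the mode from the file extension. The names READ_MODES, alignment_read_mode and open_alignment are hypothetical helpers written for illustration, not part of pysam, and choosing the mode from the extension is an assumption of this sketch.

```python
import os

# Read modes accepted by pysam.AlignmentFile for each alignment format.
READ_MODES = {".sam": "r", ".bam": "rb", ".cram": "rc"}


def alignment_read_mode(path):
    """Return the read mode for a SAM/BAM/CRAM file name (hypothetical helper)."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in READ_MODES:
        raise ValueError(f"not a SAM/BAM/CRAM file name: {path!r}")
    return READ_MODES[ext]


def open_alignment(path):
    """Open an alignment file for reading; needs pysam installed."""
    import pysam  # imported here so alignment_read_mode() works without pysam

    return pysam.AlignmentFile(path, alignment_read_mode(path))
```

With these helpers, open_alignment("in.bam") would open the file in "rb" mode.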
Example:
import pysam
bf = pysam.AlignmentFile("in.bam", "rb")  # "r" = read, "b" = binary (BAM)
# region queries such as count() and fetch() also need the BAM index file
# bf is an iterator: read it with next() or a for loop
for i in bf:
    print(i.reference_name, i.pos, i.mapq, i.isize)
Result:
ctg000331_np121 144935 27 -284
ctg000331_np121 144940 48 291
ctg000331_np121 144941 48 309
ctg000331_np121 144944 48 255
ctg000331_np121 144946 27 -370
ctg000331_np121 144947 27 -346
- i.reference_name: the ID of the reference sequence (chromosome or contig) that the read is aligned to;
- i.pos: the position at which the read is aligned (0-based);
- i.mapq: the mapping quality of the read;
- i.isize: for paired-end (PE) reads, the insert size, sometimes referred to as the fragment length.
Some useful methods:
- check_index()
Checks whether the index file exists; returns True if it is present.
- close()
Remember to close the file when you are done with it.
- count(self, contig=None, start=None, stop=None, region=None, until_eof=False, read_callback='nofilter', reference=None, end=None)
Counts the reads in the given region:
bf.count(contig="ctg000331_np121", start=1, stop=6000)
24
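- count_coverage(self, contig=None, start=None, stop=None, region=None, quality_threshold=15, read_callback='all', reference=None, end=None)
Counts the coverage of each base (A, C, G, T) at every position in the region, ignoring bases whose quality is below quality_threshold: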
bf.count_coverage(contig="ctg000331_np121",start=1,stop=100)
- fetch(self, contig=None, start=None, stop=None, region=None, tid=None, until_eof=False, multiple_iterators=False, reference=None, end=None)
Returns an iterator over the reads aligned to the given region.
- get_index_statistics(self)
Uses the index file to report the number of mapped and unmapped reads on each chromosome of the BAM file.