PEAKS|NovoHMM|Novor|DeepNovo|MAYU|Percolator|UniProtKB|Swiss-Prot|Mascot|SEQUEST|X!Tandem|pFind|MaxQuant|msconvert|PEPMASS|LC|

 

MS:

Mass spectrometry first ionizes molecules into charged particles, then separates them by mass-to-charge ratio (m/z); the mass spectrometer detects them as an electrical signal, from which the mass spectrum is obtained.
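The mass-to-charge ratio can be illustrated with a small calculation. This is a minimal sketch, assuming positive-mode ions whose charge comes from added protons; the function name `mz` is only illustrative.

```python
PROTON = 1.007276  # mass of one proton in Da


def mz(neutral_mass: float, charge: int) -> float:
    """Return the m/z of a molecule that carries `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON) / charge


# The same molecule appears at different m/z values depending on its charge.
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```

A 1000 Da molecule therefore shows up near m/z 1001 when singly charged and near m/z 501 when doubly charged.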

 

 

 

Top-down: the direct result is a protein.

 

 

 

Bottom-up: using the shotgun method, the result obtained is peptides.

 

 

 

The protein mixture is broken up into a mixture of peptide segments, which LC then separates so that each elutes at a specific time.

 

 

 

The spectrum obtained first is the MS1 spectrum, in which each peptide appears as a peak. A peak of the MS1 spectrum is selected as the starting material for MS2: the selected parent ion (the parent ion is the whole peptide; its m/z is the PEPMASS value) is broken up inside the mass spectrometer, and the MS2 spectrum is the spectrum of that peptide's fragments. Further levels are also possible (tandem mass spectrometry); the MS2 spectrum is the core of qualitative analysis.

Proteases cut at different sites depending on their nature and on the physical and chemical conditions of the digestion. Typically one enzyme is used, sometimes two.

The relationship between the ions and the peptide: fragment ions are substructures of the peptide. A b ion keeps the amino end and a y ion keeps the carboxy end. When the backbone breaks after an amino acid (AA), a paired b and y ion is measured, i.e. the two fragments of the peptide.
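The b/y pairing can be sketched as below. This is a minimal example, assuming singly charged fragments, monoisotopic residue masses and no modifications; the function name `fragment_ions` is only illustrative.

```python
# Monoisotopic residue masses in Da (standard 20 amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide: str):
    """Return (b_name, b_mz, y_name, y_mz) for every backbone break."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    pairs = []
    prefix = 0.0
    for i in range(1, len(peptide)):
        prefix += masses[i - 1]
        b_mz = prefix + PROTON                    # amino-end fragment
        y_mz = (total - prefix) + WATER + PROTON  # carboxy-end fragment
        pairs.append((f"b{i}", b_mz, f"y{len(peptide) - i}", y_mz))
    return pairs


for b_name, b_mz, y_name, y_mz in fragment_ions("PEPTIDE"):
    print(f"{b_name} {b_mz:.4f}   {y_name} {y_mz:.4f}")
```

Each b/y pair adds up to the same value (peptide mass plus two protons), which is why the two ions are called paired.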

 

Mass spectra are analyzed either by database-search matching or by de novo methods. Database search is the commonly used method: the experimental spectrum is matched against the theoretical spectra of known peptide segments in a database.

 

 

Data processing flow:

 

Different mass spectrometers output different formats, so the data format has to be converted first. msconvert is used for this preprocessing: it converts the format, can reduce noise, and outputs normalized data with the parent ion corrected.
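After conversion, the spectra can be read with a few lines of code. This is a minimal sketch, assuming the converter was asked to write MGF text; the function name `parse_mgf` and the sample spectrum are made up for illustration.

```python
def parse_mgf(lines):
    """Parse MGF text lines into a list of {'params': ..., 'peaks': ...}."""
    spectra = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Header line such as TITLE=, PEPMASS= or CHARGE=
                key, value = line.split("=", 1)
                current["params"][key] = value
            else:
                # Peak line: m/z followed by intensity
                mz, intensity = line.split()[:2]
                current["peaks"].append((float(mz), float(intensity)))
    return spectra


# Hypothetical spectrum, for illustration only.
EXAMPLE = """BEGIN IONS
TITLE=example scan 1
PEPMASS=501.0073
CHARGE=2+
100.1 10.0
200.2 20.0
END IONS
"""

spectra = parse_mgf(EXAMPLE.splitlines())
precursor_mz = float(spectra[0]["params"]["PEPMASS"].split()[0])
print(precursor_mz, spectra[0]["peaks"])
```

The PEPMASS line holds the parent ion m/z, optionally followed by its intensity, which is why only the first field is converted.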

 

 

 

 

The following search engines can be chosen: Mascot gives good identification results; SEQUEST is a traditional tool that is easy to learn and upgrade; X!Tandem is free and open source and can be modified as needed; pFind is made by a small team but has good sensitivity; MaxQuant outputs qualitative and quantitative data at the same time and is easy to use, but has low coverage. Different software gives different results, so it is best to choose according to the specific situation.

 

The following databases can be chosen: UniProt (UniProtKB and Swiss-Prot) is high quality, has low redundancy, and is manually confirmed by experts; neXtProt stores data on human proteins together with the associated experimental evidence; IPI has relatively poor stability; NCBI RefSeq and nr are general databases to which entries are added directly, so redundancy is large and the data are very noisy.

 

Search engine parameter settings

 

Protein identification and quality control:

Database-based method:

In the database-based method, quality control focuses on false positives, measured by the FDR (false discovery rate), because errors are gradually amplified from spectra to peptides to proteins. As shown, a single wrong spectrum match leads to a wrong protein, so quality control should be done at the spectrum stage. In the raw data, consecutive y ions and consecutive b ions are a good sign.

 

 

 

First, the protein sequences are cut into peptide sequences, which are then sorted by peptide molecular mass. A length criterion decides which peptides can be identified, and only peptides of identifiable length are kept. For each spectrum, candidate peptides are looked up among these database peptides, and a scoring function that compares the spectrum with the theoretical spectrum of each candidate (Xcorr, plus the ΔCn score, which measures the gap between the first- and second-ranked matches) gives the credibility of the match.
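The digest, sort and candidate lookup steps can be sketched as follows. This is a minimal example, assuming trypsin-like cleavage (after K or R, not before P), no missed cleavages and a hypothetical length window of 6 to 30 residues; the function names are illustrative, and the Xcorr scoring itself is not implemented here.

```python
import bisect

RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def tryptic_digest(protein, min_len=6, max_len=30):
    """Cut after K/R (not before P) and keep peptides in the length window."""
    peptides = []
    start = 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        next_is_pro = not is_last and protein[i + 1] == "P"
        if is_last or (aa in "KR" and not next_is_pro):
            peptide = protein[start:i + 1]
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def build_index(proteins):
    """Return unique (mass, peptide) entries sorted by mass."""
    return sorted({(peptide_mass(p), p)
                   for protein in proteins
                   for p in tryptic_digest(protein)})


def candidates(index, precursor_mass, tol_da=0.02):
    """Peptides whose mass lies within tol_da of the precursor mass."""
    masses = [m for m, _ in index]
    lo = bisect.bisect_left(masses, precursor_mass - tol_da)
    hi = bisect.bisect_right(masses, precursor_mass + tol_da)
    return [peptide for _, peptide in index[lo:hi]]


# Hypothetical protein sequence, for illustration only.
index = build_index(["AAAAAAKGGGGGGRPLLLLLK"])
print(candidates(index, peptide_mass("AAAAAAK")))
```

Sorting by mass is what makes the lookup cheap: a binary search finds the narrow mass window, and only those few candidates need to be scored against the spectrum.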

Quality control can be done with Percolator, using a target (forward) database together with a decoy (pseudo-sequence) database. Matches against the target database are a mixture of correct and incorrect sequences, whereas matches against the decoy database are all incorrect, so the candidates to accept are the complement: what remains of the target matches once the incorrect ones are accounted for. The decoy results are therefore either subtracted using a linear or curve model, or the problem is turned into a classification problem, in order to find the candidates to accept.
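The subtraction idea can be sketched with plain target-decoy counting. This is not Percolator's own algorithm (which trains a classifier); it is a minimal example, assuming each match is a hypothetical `(score, is_decoy)` pair and that the number of decoy hits above a cutoff estimates the number of wrong target hits.

```python
def fdr_threshold(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: iterable of (score, is_decoy) pairs. Returns None if no cutoff works.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among matches scoring at least `score`.
        if targets and decoys / targets <= max_fdr:
            best = score
    return best


# Hypothetical scores, for illustration only.
psms = [(5.0, False), (4.0, False), (3.0, True), (2.0, False), (1.0, True)]
print(fdr_threshold(psms, max_fdr=0.5))
```

Matches scoring at or above the returned cutoff are accepted; a stricter `max_fdr` moves the cutoff up and accepts fewer matches.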

 

 

 

Inferring which proteins are present is difficult in this case because of shared peptides, so a simple (parsimonious) assembly method can be used for the proteins. Simple assembly selects the protein combination that carries the most of the observed proteomic information. MAYU is a commonly used method.
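The simple assembly idea can be sketched as a greedy cover of the observed peptides. This is an assumed illustration, not MAYU's own procedure; the function name `parsimonious_proteins` and the protein and peptide labels are hypothetical.

```python
def parsimonious_proteins(protein_to_peptides):
    """Greedily pick proteins until every observed peptide is explained.

    protein_to_peptides: dict mapping protein name -> set of observed peptides.
    """
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        # Take the protein explaining the most still-unexplained peptides;
        # ties go to the alphabetically first name.
        best = max(sorted(protein_to_peptides),
                   key=lambda name: len(protein_to_peptides[name] & uncovered))
        gained = protein_to_peptides[best] & uncovered
        if not gained:
            break
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical mapping: P2 only has peptides shared with P1.
observed = {
    "P1": {"pepA", "pepB", "pepC"},
    "P2": {"pepB", "pepC"},
    "P3": {"pepD"},
}
print(parsimonious_proteins(observed))
```

P2 is left out because every peptide it could explain is already explained by P1, which is the shared-peptide situation described above.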

 

 

 

De novo-based prediction:

De novo sequencing works out each amino acid (AA) afresh from the spectrum, so this method can find new proteins. Tools include PEAKS and NovoHMM; Novor is commonly used because of its speed; DeepNovo uses deep learning.
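The basic idea of reading amino acids straight from the spectrum can be sketched with mass differences. This is a toy example, assuming a clean, complete ladder of singly charged b ions; real tools such as PEAKS, Novor or DeepNovo use far more elaborate models, and the function name `read_ladder` is only illustrative.

```python
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(b_ion_mz, tol=0.01):
    """Name the residue between each pair of neighbouring b-ion peaks."""
    residues = []
    for low, high in zip(b_ion_mz, b_ion_mz[1:]):
        delta = high - low
        matches = [aa for aa, mass in RESIDUE_MASS.items()
                   if abs(mass - delta) <= tol]
        # Equal-mass residues (L and I) cannot be told apart by mass alone.
        residues.append("/".join(matches) if matches else "?")
    return residues


# b1, b2, b3 of a peptide starting with G-A-S (computed, not measured).
print(read_ladder([58.028736, 129.065846, 216.097876]))
```

Gaps that match no residue are reported as `?`, which in real spectra happens often because fragment peaks are missing or noisy.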


Origin www.cnblogs.com/yuanjingnan/p/11616189.html