Building interaction network diagrams with STRING and Cytoscape

A network diagram may look complicated, but it is actually a very simple model. It is made up of just two elements: nodes and the connections between them (edges). Each edge joins two nodes, a source node and a target node. Here the nodes are our genes, and the edges are the interactions between genes. Every network is built from these components. Once you know what a network diagram is made of, analyzing one is straightforward.
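As a minimal sketch of this structure, a network can be held in Python as a list of edges, each with a source node and a target node; the set of nodes then follows from the edges. The gene pairs below are made-up placeholders for illustration only, not real interaction data.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str  # source node (a gene)
    target: str  # target node (a gene)


# Hypothetical edges, for illustration only (not real interaction data).
edges = [
    Edge("GENE_A", "GENE_B"),
    Edge("GENE_A", "GENE_C"),
    Edge("GENE_B", "GENE_C"),
]

# The nodes are every gene that appears at either end of an edge.
nodes = sorted({gene for edge in edges for gene in edge})

print(nodes)
print(len(edges))
```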

Nodes

The nodes are the genes we want to analyze. A network diagram often has dozens or even hundreds of nodes, which means dozens or hundreds of genes to analyze. Where do these genes come from? That depends on the purpose of the research: they may be differentially expressed genes from a screen, genes that are frequently mutated in cancer patients, the downstream target genes of a miRNA, and so on.

For network analysis the source of the genes usually does not matter, as long as the gene set is meaningful to you. The number of genes, however, is usually restricted. With too few genes there are too few edges, and the diagram either cannot be drawn or looks poor. With too many genes the network becomes too large: it may be impossible to import into the software, the analysis takes too long, and background noise and confounding effects increase. The number of genes analyzed in a network diagram is therefore usually about 50 to 300, which gives a network that is neither too large nor too small.

Edges

An edge is an interaction between genes. For example, is there an interaction between the two genes CXCL12 and TP53? How would we decide? This is a harder problem. Fortunately, there are some very good databases that help answer it, the best known being the STRING database.

STRING database

STRING (https://string-db.org) is a very comprehensive protein interaction network database that stores interactions for many species and genes. We simply submit the gene names, and it determines which of them interact.

STRING is a database for searching known and predicted protein interactions. It covers both direct physical interactions between proteins and indirect functional associations, and it is one of the most comprehensive and widely used protein interaction databases.

The STRING database contains experimental data, results from text mining of PubMed abstracts, and data integrated from other databases, as well as results predicted with bioinformatics methods. The prediction methods include chromosomal neighborhood, gene fusion, phylogenetic profiles, and co-expression based on microarray data.

Cytoscape

Cytoscape is a complete network analysis system. It is not just a piece of software: it also includes programming language interfaces, an app store, and much more, and it is a leader in the field of network analysis. Cytoscape helps us visualize gene interaction networks, and through its plug-ins it helps us find key genes and run many other analyses.

Research ideas

step1 from gene list to protein interactions

step2 from protein interactions to interaction network

step3 from interaction network to key genes
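One simple way to read step3 is to rank genes by degree, that is, by the number of interactions each gene takes part in. This is only one possible definition of a key gene and is assumed here for illustration; the text does not say which measure is meant. The edges are made-up placeholders.

```python
from collections import Counter


def rank_by_degree(edges):
    """Count how many edges touch each gene and sort from most to fewest."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    # Sort by degree (descending), then by name so ties have a stable order.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical edges, for illustration only.
edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
]

ranking = rank_by_degree(edges)
for gene, degree in ranking:
    print(gene, degree)
```

In this toy network GENE_A touches three edges and would be the first candidate to look at.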

Specific steps

step1 prepare a list of genes

Put simply, this file is just a list of genes. The number of genes should ideally be between 50 and 300.
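A minimal sketch of preparing such a list in Python, assuming one gene name per line. The function names are hypothetical, and the 50 to 300 bounds are just the guideline above, not a hard limit imposed by STRING.

```python
def load_gene_list(lines):
    """Return unique, non-empty gene names in their original order."""
    seen = set()
    genes = []
    for line in lines:
        name = line.strip()
        if name and name not in seen:
            seen.add(name)
            genes.append(name)
    return genes


def size_is_reasonable(genes, low=50, high=300):
    """Check the gene count against the 50 to 300 guideline."""
    return low <= len(genes) <= high


# Any iterable of lines works, including an open text file.
genes = load_gene_list(["TP53\n", "CXCL12\n", "\n", "TP53\n"])
print(genes)
print(size_is_reasonable(genes))
```

With a real file, the same function can be called as load_gene_list(open("genes.txt")), where genes.txt is a placeholder file name.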

step2 Open STRING database

Click SEARCH to go to the page where the gene list is entered. Click "Multiple proteins", paste in the gene list, enter the species name, and click SEARCH.

 

STRING then looks up the proteins we submitted. Click CONTINUE.
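The same submission can also be scripted instead of typed into the web form. The sketch below only builds a request URL and does not contact the server. It assumes that STRING's REST interface offers a "network" method taking "identifiers" and "species" parameters, with identifiers separated by carriage returns, and it uses 9606, the NCBI taxonomy ID for human. Check the STRING API documentation before relying on these details. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of the STRING REST interface.
STRING_API = "https://string-db.org/api"


def build_network_url(genes, species=9606, output_format="tsv"):
    """Build (but do not send) a request for the interactions among genes."""
    params = {
        # Assumption: identifiers are separated by carriage returns.
        "identifiers": "\r".join(genes),
        "species": species,
    }
    return f"{STRING_API}/{output_format}/network?{urlencode(params)}"


url = build_network_url(["CXCL12", "TP53"])
print(url)
```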

The interaction network of these genes then appears. The network has many colored nodes; the colors are assigned arbitrarily and have no biological meaning. Some nodes show the three-dimensional structure of the protein, which is not very important for us. What matters is the connections between proteins, that is, the interactions.

Below the network there are many panels with many functions. The most important is Exports, from which we output the image and the network.

For a basic analysis, the network image is enough. For more advanced analysis and a more polished network, for example to find key genes or to produce a publication-quality network diagram, we need the source file. The source file is a TSV file, from which all kinds of network diagrams can be built.
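A minimal sketch of reading such a TSV file into an edge list with Python's csv module. The column names "#node1", "node2" and "combined_score" are an assumption about the exported header, so check the first line of your own file. The sample rows and the 0.7 cutoff are made up for illustration.

```python
import csv
import io


def read_edges(handle, min_score=0.0):
    """Read (gene1, gene2, score) tuples from a tab-separated export."""
    reader = csv.DictReader(handle, delimiter="\t")
    edges = []
    for row in reader:
        score = float(row["combined_score"])  # assumed column name
        if score >= min_score:
            edges.append((row["#node1"], row["node2"], score))
    return edges


# Made-up sample standing in for the exported file.
sample = (
    "#node1\tnode2\tcombined_score\n"
    "GENE_A\tGENE_B\t0.9\n"
    "GENE_A\tGENE_C\t0.4\n"
)

edges = read_edges(io.StringIO(sample), min_score=0.7)
print(edges)
```

An edge list in this form is what a network tool such as Cytoscape needs: one row per interaction, with a source column and a target column.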

 

To be continued...

 


Origin www.cnblogs.com/0820LL/p/11417065.html