Everything you need to know about population resequencing is here (1)

Background

Molecular population genetics is a pillar of contemporary evolutionary biology and provides the theoretical basis for genetic breeding, genetic association mapping, and linkage analysis. It developed from classical population genetics and uses macromolecules, mainly patterns of DNA sequence variation, to study the genetic structure of populations and how the factors that change a population's genetic makeup shape that structure. With sequence data, the evolution of populations can be inferred quantitatively. This overcomes the limitation that classical population genetics can only study short-term changes in genetic structure, and it allows earlier inferences about long-term evolution or the stability of genetic systems to be tested. Research on patterns of molecular sequence variation within populations has also prompted a re-examination of Darwin's theory of evolution centered on "natural selection".

As shown below:

Figure 1 Explanation of gene allele frequency

                         

basic statistics

PIC polymorphic information content

Polymorphism information content (PIC) measures how much information a polymorphic genetic marker can provide in linkage analysis. It is the probability that one parent is heterozygous at the marker and the other parent has a different genotype. PIC is now commonly used to measure the degree of polymorphism at a locus and to describe the genetic properties of SNPs in a population.
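The article gives no formula for PIC. A minimal sketch, assuming the commonly cited Botstein et al. (1980) form PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i; the function name `pic` is made up for illustration:

```python
from itertools import combinations

def pic(allele_freqs):
    """Polymorphism information content for one locus.

    allele_freqs: allele frequencies at the locus, summing to 1.
    Assumes the Botstein et al. (1980) form of the index.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared allele frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings between two identical heterozygotes.
    correction = sum(2 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - correction

# A biallelic SNP with equal allele frequencies.
print(pic([0.5, 0.5]))
```

For a biallelic SNP the index cannot exceed the value reached at equal allele frequencies, which is why SNP panels report lower PIC than multi-allelic SSR markers.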

expected heterozygosity

Expected heterozygosity (He), also called gene diversity, is the probability that a randomly chosen individual carries two different alleles at a polymorphic site, that is, the probability of being a heterozygote. Hexp (also written Hs) denotes the expected heterozygosity of the population.
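The probability described above can be sketched as follows, assuming random mating so that He = 1 − Σ p_i², with p_i the frequency of allele i (the function name is illustrative):

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity (gene diversity) at one locus.

    Under random mating, the chance that two randomly drawn alleles
    differ is 1 minus the chance that they are identical.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)

# Two alleles at frequencies 0.7 and 0.3.
print(expected_heterozygosity([0.7, 0.3]))
```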

average observed heterozygosity

Ho (also written Hi) is the observed proportion of heterozygous genotypes, that is, the number of heterozygous genotype calls divided by the number of genotype calls across all detected loci.

Fis

Fis is the proportional reduction of Ho relative to He, Fis = 1 − Ho/He, that is, the average inbreeding coefficient of a population.

Fis can be used as an indicator of the degree of inbreeding among individuals in a population:

If Ho=He, then Fis=0, and the population is mating at random

If Ho<He, then Fis>0, which indicates inbreeding

If Ho>He, then Fis<0, which indicates outbreeding
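The relationship between Ho, He, and Fis can be sketched for a single locus as below. The genotype list is made-up illustration data, and the function name is hypothetical:

```python
from collections import Counter

def heterozygosity_stats(genotypes):
    """Return (Ho, He, Fis) for one locus.

    genotypes: list of 2-tuples of allele labels, one per diploid individual.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    n = len(genotypes)
    # Ho: share of individuals whose two alleles differ.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # He: 1 minus the sum of squared allele frequencies.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    he = 1.0 - sum((c / (2 * n)) ** 2 for c in allele_counts.values())
    if he == 0:
        raise ValueError("locus is monomorphic; Fis is undefined")
    # Fis: proportional deficit of heterozygotes relative to expectation.
    fis = 1.0 - ho / he
    return ho, he, fis

# Made-up sample: 4 AA, 2 AG, 4 GG individuals.
sample = [("A", "A")] * 4 + [("A", "G")] * 2 + [("G", "G")] * 4
ho, he, fis = heterozygosity_stats(sample)
print(ho, he, fis)
```

In this sample Ho is lower than He, so Fis is positive, which matches the inbreeding case in the list above.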

Shannon Diversity Index

Also known as the Shannon-Wiener diversity index, Shannon's diversity index is commonly used in ecology. It estimates the diversity of allelic loci with Claude Shannon's entropy formula:

I = −Σ p_i ln p_i

where p_i is the frequency of the i-th genotype.
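A minimal sketch of the entropy calculation, assuming the natural logarithm and a list of frequencies that sum to 1 (whether these are genotype or allele frequencies depends on the software used; the function name is illustrative):

```python
import math

def shannon_index(freqs):
    """Shannon diversity index I = -sum(p * ln p) over non-zero frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    # Zero-frequency classes contribute nothing, so they are skipped.
    return -sum(p * math.log(p) for p in freqs if p > 0)

# Two equally frequent classes give ln(2).
print(shannon_index([0.5, 0.5]))
```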

Ae

Effective number of alleles. For a single marker, Ae = 1 / (1 − Hexp). Averaged over markers:

Ae = (1/n) Σ 1 / (1 − Hexp_j)

where n is the number of markers and Hexp_j is the expected heterozygosity of marker j.
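A short sketch of the effective number of alleles, assuming the per-marker form Ae = 1 / (1 − Hexp) and a simple mean over markers. Averaging the per-marker values is one convention and is an assumption here; the function names are illustrative:

```python
def effective_alleles(h_exp):
    """Effective number of alleles for one marker, 1 / (1 - Hexp)."""
    if not 0.0 <= h_exp < 1.0:
        raise ValueError("expected heterozygosity must be in [0, 1)")
    return 1.0 / (1.0 - h_exp)

def mean_effective_alleles(h_exp_per_marker):
    """Average the per-marker values over n markers (assumed convention)."""
    n = len(h_exp_per_marker)
    if n == 0:
        raise ValueError("need at least one marker")
    return sum(effective_alleles(h) for h in h_exp_per_marker) / n

# One marker with two equally frequent alleles, one monomorphic marker.
print(mean_effective_alleles([0.5, 0.0]))
```

A marker with two equally frequent alleles has Ae = 2, while unequal frequencies pull Ae below the observed allele count.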

Nei's Diversity Index

Nei's gene diversity analyzes genetic diversity through genetic distance: the haplotype diversity index is calculated and used to estimate the nucleotide sequence divergence between populations.
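The article does not give the formula it uses. As a sketch of the haplotype diversity step only, the block below assumes Nei's (1987) estimator h = n/(n − 1) · (1 − Σ x_i²), where x_i is the frequency of haplotype i among n sampled sequences; the haplotype strings are made up:

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Haplotype diversity with the n/(n-1) small-sample correction.

    haplotypes: list of hashable haplotype labels or sequences.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Made-up sample of four sequences with three distinct haplotypes.
print(haplotype_diversity(["AAT", "AAT", "AGT", "CGT"]))
```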

HWE

The Hardy-Weinberg equilibrium (HWE) law, also known as the law of genetic equilibrium, is the most important principle in population genetics and serves as the yardstick for testing whether a population is in equilibrium. It is named after Hardy, G. H. (British mathematician) and Weinberg, W. (German physician), who discovered it in the same year (1908). They proposed that in an infinitely large, randomly mating population without mutation, migration, or selection, allele frequencies and genotype frequencies remain unchanged from generation to generation. The law can be divided into three parts. The first is the assumption: an infinitely large, randomly mating population with no evolutionary pressure (mutation, migration, or natural selection). The second is that allele frequencies do not change from generation to generation. The third is that after one generation of random mating the genotype frequencies reach and keep the equilibrium proportions: p² is the frequency of the AA genotype, 2pq is the frequency of the Aa genotype, and q² is the frequency of the aa genotype, where p is the frequency of allele A and q is the frequency of allele a. The genotype frequencies sum to 1, that is, p²+2pq+q²=1.

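Deviation from equilibrium is tested with the statistic χ² = Σ (O − E)² / E, where O is the observed genotype count and E is the genotype count expected under HWE. For a biallelic locus the statistic follows a chi-square distribution with 1 degree of freedom (three genotype classes, minus one, minus one for the allele frequency estimated from the data).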
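The test can be sketched for a single biallelic locus as below. This is a minimal illustration, not the tool used in the article. It assumes the allele frequency is estimated from the same genotype counts, which leaves 1 degree of freedom for three genotype classes, so the P-value can be computed with `math.erfc` without external libraries:

```python
import math

def hwe_chi_square(n_hom1, n_het, n_hom2):
    """Chi-square goodness-of-fit test for HWE at a biallelic locus.

    Arguments are observed counts of AA, Aa and aa genotypes.
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_hom1 + n_het + n_hom2
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_hom1 + n_het) / (2 * n)  # frequency of allele A
    q = 1.0 - p                         # frequency of allele a
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    if min(expected) == 0:
        raise ValueError("locus is monomorphic; test is undefined")
    observed = [n_hom1, n_het, n_hom2]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Made-up counts: 4 AA, 2 Aa, 4 aa.
print(hwe_chi_square(4, 2, 4))
```

With small expected counts, as in this toy sample, an exact test is usually preferred over the chi-square approximation.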

for example

Input data format:

Output result format:

Interpretation of classic literature

The research material is the Pearl Millet inbred Germplasm Association Panel (PMiGAP), 250 inbred lines of pearl millet (Pennisetum glaucum). The 250 lines were assigned to four maturity groups (early, medium-early, medium, and late) and subjected to drought stress treatments. Sixteen morphological, morpho-physiological, and agronomic traits were evaluated, including grain yield (GY), panicle yield (PY), panicle harvest index (PHI), and time to 75% flowering (FT).

PMiGAP was genotyped with 37 SSR and CISP markers, and SNPs and InDels in 17 validated major drought tolerance (DT) QTL genes were genotyped and used for association analysis with the traits.

The results showed that the mean Nei's genetic diversity index of PMiGAP was 0.54. In the STRUCTURE analysis, ΔK peaked at K=6, so the 250 PMiGAP lines were divided into 6 subpopulations.
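Choosing K from the ΔK peak can be sketched as follows. The block assumes the Evanno et al. (2005) statistic computed from mean log-likelihoods, ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)), with the standard deviation taken over replicate STRUCTURE runs at each K. The log-likelihood values are made up and are not from the PMiGAP study:

```python
from statistics import mean, stdev

def delta_k(log_likelihoods):
    """Compute delta K for each K that has both neighbours.

    log_likelihoods: dict mapping K to a list of ln P(D) values,
    one per replicate run (at least two runs per K).
    """
    means = {k: mean(v) for k, v in log_likelihoods.items()}
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in means or k + 1 not in means:
            continue  # the second difference needs K-1 and K+1
        sd = stdev(log_likelihoods[k])
        if sd == 0:
            continue  # delta K is undefined without run-to-run variation
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result

# Hypothetical replicate log-likelihoods for K = 1..4.
hypothetical_runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -892.0],
    4: [-888.0, -890.0],
}
dk = delta_k(hypothetical_runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because the statistic needs both neighbours, it cannot be evaluated at the smallest or largest K tested, so the tested range should extend beyond the expected number of subpopulations.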

PMiGAP was genotyped with 39 SNP and 7 InDel markers in the 17 genes. A total of 251 SNPs were identified in the 9487 bp sequenced across the 17 candidate genes, an average of 1 SNP per 38 bp. Across treatments, 22 SNPs and 3 InDels from 13 genes were significantly associated with traits (P<0.05). Seven SNP associations from five genes were shared between the irrigated and drought stress treatments. Most notably, an important SNP in the acetyl-CoA carboxylase gene was constitutively associated with grain yield, grain harvest index, and panicle yield under all treatments. An InDel in the chlorophyll a/b-binding protein gene was significantly associated with stay-green and grain yield traits under drought stress, and could serve as a functional marker for selecting high-yielding genotypes with a "stay-green" phenotype under drought stress.

This study identified marker-trait associations between important agronomic traits and validated major DT-QTL genes under irrigated and drought stress conditions, investigated the genetic diversity and population structure of PMiGAP, and explored its potential for association analysis. The results show that PMiGAP has a high level of genetic diversity and a moderate population structure, which makes it well suited to association analysis.

Figure 2 ΔK values in model-based STRUCTURE analysis

Figure 3 Structure analysis of PMiGAP population structure based on 37 SSR and CISP markers (K=6)

Table 1 Average number of observed alleles (A), effective number of alleles (Ae), Shannon index (I), observed and expected heterozygosity (Ho and He), and private alleles of the six subpopulations

Figure 4 Grain yield, leaf rolling, and stay-green under drought stress

(a) Contour plots of grain yield versus leaf rolling and flowering time (b) Effects of stay-green (c) and leaf rolling (d) on grain yield under drought stress

Figure 5 Intragenic linkage disequilibrium of candidate genes

Squared correlation coefficient (r²) values are shown in the upper triangle on a color scale from white (0.0) to red (1.0). P-values, ranging from non-significant (0.01; white) to highly significant (<0.0001; red), are shown in the lower triangle. (1. Uridylate kinase 2. Acyl-CoA oxidase 3. Zinc finger CCCH type 4. Ubiquitin-conjugating enzyme 5. Actin depolymerizing factor 6. Phytochrome 7. Dipeptidyl peptidase IV 8. Serine carboxypeptidase 9. Serine/threonine protein kinase 10. Phosphoglycerate kinase 11. Chl a/b binding protein 12. Catalase 13. Alanine glyoxylate transaminase and 14. Photolyase)

