Statistical analysis and plotting of community (ecological) data in R: from data wrangling to presenting results

R is free and open source, which has made it a standard tool for the statistical analysis of ecological community data. Community data are diverse and complex, and analysing them draws on many statistical methods. This course is organized around the methods most commonly used for community data: regression and mixed-effects models, multivariate statistical analysis, and structural equation modeling. The R implementation of each method is worked through in detail using examples from classic studies (see the course content below). With a focus on ecological research, the course runs from basic R operations and plotting, through data preparation and cleaning, to the application scenarios of each quantitative method, covering the complete workflow from data wrangling to the presentation of results. It combines seven modules in one: "R Language Basics", "tidyverse Data Cleaning", "Multivariate Statistical Analysis", "Random Forest Model", "Regression and Mixed Effects Model", "Structural Equation Modeling" and "Graphing Statistical Results".

Topic 1: Introduction to R and RStudio, basic operations and plotting

1) Introduction to R and RStudio: background, software and package installation, basic settings, etc.

2) Basic operations in R: creating vectors, matrices, data frames and lists, and extracting data from them

3) Reading, organizing and saving data files in R

4) Basic plotting in R (including ggplot): basic plots, layout, and saving publication-quality figures
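
The basic objects listed above can be sketched in a few lines of base R. This is only an illustration: all object names and values (abund, m, sites) are made up for the example and are not taken from the course material.

```r
# Vector: one-dimensional, a single data type; elements can be named
abund <- c(sp1 = 12, sp2 = 5, sp3 = 0, sp4 = 8)

# Matrix: two-dimensional, a single data type (filled column by column)
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("siteA", "siteB"), c("sp1", "sp2", "sp3")))

# Data frame: columns may have different types
sites <- data.frame(site = c("A", "B", "C"),
                    elevation = c(120, 340, 560),
                    forest = c(TRUE, FALSE, TRUE))

# List: a container for objects of any kind
dat <- list(abund = abund, m = m, sites = sites)

# Data extraction
abund["sp2"]                    # by name
abund[abund > 0]                # by logical condition
m["siteB", "sp3"]               # by row and column name
sites[sites$elevation > 200, ]  # rows of a data frame meeting a condition
dat[["sites"]]$site             # an element of a list, then a column

# Saving a data frame to a CSV file and reading it back
tmp <- tempfile(fileext = ".csv")
write.csv(sites, tmp, row.names = FALSE)
sites2 <- read.csv(tmp)
```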

Topic 2: Data cleaning in R with the tidyverse

1) Introduction to the tidyverse: tidyr, dplyr, readr, the %>% pipe, etc.

2) File operation: read files in different formats, read multiple files at the same time, etc.

3) Data filtering: row filtering, column filtering, conditional filtering (character manipulation), etc.

4) Data generation: data merging, data splitting, new data generation (character manipulation), etc.

5) Converting between long and wide data, filling and removing missing values (NA), grouping, sorting and summarizing, etc.
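
A minimal sketch of several of these steps, assuming the dplyr and tidyr packages are installed. The small species table and the column names (site, sp1, sp2, elevation) are invented for illustration.

```r
library(dplyr)
library(tidyr)

# A small wide-format table: one row per site, one column per species
wide <- data.frame(site = c("A", "B", "C"),
                   sp1 = c(5, 0, 2),
                   sp2 = c(NA, 3, 1))

# Wide -> long conversion, then fill missing counts with 0
long <- wide %>%
  pivot_longer(cols = c(sp1, sp2), names_to = "species", values_to = "count") %>%
  mutate(count = replace_na(count, 0))

# Row filtering: keep only records where the species was observed
observed <- long %>% filter(count > 0)

# Grouping, summarizing and sorting
summary_tbl <- long %>%
  group_by(site) %>%
  summarise(total = sum(count), richness = sum(count > 0), .groups = "drop") %>%
  arrange(desc(total))

# Data merging: attach site-level environmental data
env <- data.frame(site = c("A", "B", "C"), elevation = c(120, 340, 560))
joined <- left_join(summary_tbl, env, by = "site")
```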

Topic 3: Community data preparation and exploratory analysis

1) Preparing community data: species composition, environmental variables, species functional traits, phylogenetic trees, etc.

2) Checking community data: missing values, outliers, etc., to avoid model errors (garbage in, garbage out, GIGO)

3) Calculating diversity: taxonomic diversity (TD), functional diversity (FD) and phylogenetic diversity (PD)

4) Introduction to association measures: species similarity/dissimilarity matrices
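
As a rough illustration of the taxonomic part only, the sketch below computes species richness, the Shannon index and a Bray-Curtis dissimilarity matrix in base R. The community matrix is invented, and the helper function bray() is written here for the example; in practice dedicated packages are normally used for these measures, and FD and PD need trait and phylogeny data that are not shown.

```r
# Invented community matrix: 3 sites (rows) x 4 species (columns)
comm <- matrix(c(10, 10, 0, 0,
                  5,  5, 5, 5,
                 20,  0, 0, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("s1", "s2", "s3"), paste0("sp", 1:4)))

# Data check: any missing values?
has_na <- anyNA(comm)

# Species richness: number of species present at each site
richness <- rowSums(comm > 0)

# Shannon index: H = -sum(p * log(p)) over the species present
shannon <- apply(comm, 1, function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
})

# Bray-Curtis dissimilarity between two sites (hypothetical helper)
bray <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Pairwise dissimilarity matrix between all sites
n <- nrow(comm)
d <- matrix(0, n, n, dimnames = list(rownames(comm), rownames(comm)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- bray(comm[i, ], comm[j, ])
  }
}
bc <- as.dist(d)
```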

Topic 4: Unconstrained ordination of community data - PCA, CA, PCoA, NMDS

1) Introduction to unconstrained ordination of community data

2) Case 1: ordination of fish habitat data with PCA

3) Case 2: ordination of bird species composition data, comparing CA, PCoA and NMDS
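
Two of these methods are available in base R, so a small sketch is possible without extra packages: PCA with prcomp() and PCoA with cmdscale(). The data are simulated and the variable names are placeholders, not the course's fish or bird data. The Hellinger distance is used here only because it can be computed in base R; CA and NMDS are not shown.

```r
set.seed(1)

# Simulated habitat variables for 20 sites
env <- data.frame(depth  = rnorm(20, 50, 10),
                  temp   = rnorm(20, 15, 2),
                  oxygen = rnorm(20, 8, 1))

# PCA on standardized variables
pca <- prcomp(env, center = TRUE, scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)  # proportion of variance per axis
site_scores <- pca$x[, 1:2]                    # site coordinates on PC1 and PC2
loadings <- pca$rotation[, 1:2]                # variable loadings

# Simulated species counts for the same 20 sites (6 species)
comm <- matrix(rpois(20 * 6, lambda = 3), nrow = 20)

# Hellinger distance: Euclidean distance on square-rooted relative abundances
hel <- sqrt(comm / rowSums(comm))
d <- dist(hel)

# PCoA (classical multidimensional scaling) on the distance matrix
pcoa <- cmdscale(d, k = 2, eig = TRUE)
pcoa_scores <- pcoa$points
```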

Topic 5: Constrained ordination of community data - RDA, dbRDA, CCA, fourth-corner analysis

1) Introduction to constrained ordination of community data: asymmetric vs. symmetric constrained ordination

2) Case 1: explaining moth community distribution with landscape, patch and habitat factors: choosing among RDA, dbRDA and CCA, plus variation partitioning

3) Case 2: constrained ordination of species presence/absence (0, 1) data with dbRDA

4) Case 3: relating species composition, species traits and environmental factors with fourth-corner analysis

Topic 6: Grouping analysis of community data - hierarchical/non-hierarchical clustering (HC/NHC), PERMANOVA, etc.

1) Overview of clustering and group-difference analysis of community data

2) Case 1: hierarchical and non-hierarchical clustering of bird habitat data with hclust and kmeans

3) Case 2: testing and interpreting differences in suitable habitat for turtles (two-group comparison): PERMANOVA, MRPP, ANOSIM and dispersion test

4) Case 3: testing and interpreting differences in microbial composition along an environmental gradient (multi-group comparison): MRPP and dispersion test

5) Case 4: effects of drugs on gut microbiota: PCoA + PERMANOVA
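
The clustering part can be sketched with base R's hclust() and kmeans(). The example uses two simulated, well-separated habitat groups with hypothetical variable names. The permutation tests (PERMANOVA, MRPP, ANOSIM) rely on add-on packages and are not shown.

```r
set.seed(42)

# Two simulated habitat groups of 15 sites each, described by two variables
hab <- rbind(matrix(rnorm(30, mean = 0, sd = 0.5), ncol = 2),
             matrix(rnorm(30, mean = 5, sd = 0.5), ncol = 2))
colnames(hab) <- c("canopy", "shrub")

# Hierarchical clustering: average linkage on Euclidean distances
hc <- hclust(dist(hab), method = "average")
hc_groups <- cutree(hc, k = 2)   # cut the dendrogram into 2 groups

# Non-hierarchical clustering: k-means with several random starts
km <- kmeans(hab, centers = 2, nstart = 10)

# Cross-tabulate the two solutions to see how far they agree
agreement <- table(hclust = hc_groups, kmeans = km$cluster)
```

Group labels are arbitrary in both methods, so agreement shows up as one non-zero cell per row of the cross-table rather than as matching label numbers.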

Topic 7: Random forest models for community data - classification vs. regression

1) Introduction to the random forest model

2) Basic workflow of a random forest analysis: classification vs. regression

3) Case 1: random forest classification and selection of important variables (RFM-classification)

4) Case 2: random forest regression and variable importance assessment (RFM-regression)

5) Case 3: relationship between multidimensional morphological traits and ecological traits of species, a combined PCA + PCoA + LDA + RFM case

Topic 8: General linear model (lm)

1) Basic form, assumptions, estimation methods, parameter tests and model checking

2) Case 1: regression, analysis of variance (ANOVA) and analysis of covariance (ANCOVA) of swimming speed in different fish

3) Case 2: determinants of marine herbivorous fish diversity (model validation)

4) Case 3: screening environmental factors for freshwater fish abundance with stepwise regression (model selection)
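
A compact sketch of how regression, ANOVA, ANCOVA and stepwise selection all go through lm() in base R. The swimming-speed data are simulated and every variable name (body_len, species, noise, speed) is hypothetical; the real case data are not reproduced here.

```r
set.seed(1)
n <- 60

# Simulated data: speed depends on body length and species; 'noise' is irrelevant
dat <- data.frame(body_len = runif(n, 5, 30),
                  species  = factor(rep(c("A", "B", "C"), each = n / 3)),
                  noise    = rnorm(n))
sp_effect <- unname(c(A = 0, B = 3, C = 6)[as.character(dat$species)])
dat$speed <- 2 + 0.5 * dat$body_len + sp_effect + rnorm(n, sd = 1)

# Regression: continuous predictor only
fit_reg <- lm(speed ~ body_len, data = dat)

# ANOVA: categorical predictor only
fit_aov <- lm(speed ~ species, data = dat)

# ANCOVA: continuous covariate plus categorical factor
fit_anc <- lm(speed ~ body_len + species, data = dat)

# Parameter tests and comparison of nested models
coef_table <- summary(fit_anc)$coefficients
cmp <- anova(fit_reg, fit_anc)

# Stepwise selection by AIC, starting from a model that includes 'noise'
full <- lm(speed ~ body_len + species + noise, data = dat)
sel <- step(full, direction = "both", trace = 0)
```

Residual checks such as plot(fit_anc) would normally follow before the coefficients are interpreted.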

Topic 9: Generalized linear model (glm)

1) Basic principles, modeling steps and workflow of the generalized linear model

2) Case 1: logistic model for presence/absence (0, 1) data (binomial distribution)

3) Case 2: relationship between seal age and aggressive behavior (0, 1 data converted to proportions)

4) Case 3: environmental explanation of species abundance distribution (count data: Poisson, negative binomial, zero-inflated and zero-truncated models)
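
The binomial and Poisson cases can be sketched with base R's glm(). The data below are simulated and the names (temp, present, trials, count) are placeholders. Negative binomial, zero-inflated and zero-truncated models need additional packages and are not shown.

```r
set.seed(2)
n <- 200
temp <- runif(n, 5, 25)   # simulated environmental predictor

# Presence/absence (0, 1) data: logistic regression
present <- rbinom(n, 1, plogis(-4 + 0.3 * temp))
fit_bin <- glm(present ~ temp, family = binomial)

# Proportion data: successes and failures out of a known number of trials
trials <- 10
success <- rbinom(n, trials, plogis(-4 + 0.3 * temp))
fit_prop <- glm(cbind(success, trials - success) ~ temp, family = binomial)

# Count data: Poisson regression with a log link
count <- rpois(n, exp(0.2 + 0.08 * temp))
fit_pois <- glm(count ~ temp, family = poisson)

# Overdispersion check: Pearson chi-square divided by residual degrees of freedom.
# Values well above 1 suggest trying quasi-Poisson or negative binomial models.
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

# Predicted probability of presence at 20 degrees
p20 <- predict(fit_bin, newdata = data.frame(temp = 20), type = "response")
```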

Topic 10: Linear mixed-effects model (lmm)

1) Basic principles of mixed effects and the workflow, steps and implementation of the analysis

2) Case 1: determinants of species diversity in hierarchical (nested) data: model building, prediction and diagnostics

3) Case 2: multiple comparisons in a multivariate experiment (hierarchical data)
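
A random-intercept model for plots nested within sites can be sketched with lme() from the nlme package, which ships with standard R installations. The outline does not say which mixed-model package the course uses, so nlme is an assumption here, and the data and variable names (site, nitrogen, richness) are simulated placeholders.

```r
library(nlme)
set.seed(3)

# Simulated hierarchical data: 8 plots in each of 10 sites
n_site <- 10
n_plot <- 8
site <- factor(rep(paste0("site", 1:n_site), each = n_plot))
site_eff <- rep(rnorm(n_site, sd = 2), each = n_plot)  # random site intercepts
nitrogen <- runif(n_site * n_plot, 0, 10)
richness <- 10 + 1.5 * nitrogen + site_eff + rnorm(n_site * n_plot, sd = 1)
dat <- data.frame(site, nitrogen, richness)

# Model without random effects (fitted with gls so both models use REML)
fit_fixed <- gls(richness ~ nitrogen, data = dat, method = "REML")

# Random-intercept model: plots within the same site share an intercept
fit_lmm <- lme(richness ~ nitrogen, random = ~ 1 | site,
               data = dat, method = "REML")

# Is the random effect needed? Same fixed effects, so REML fits are comparable
cmp <- anova(fit_fixed, fit_lmm)

# Fixed effects, variance components and population-level predictions
fe <- fixef(fit_lmm)
vc <- VarCorr(fit_lmm)
pred_pop <- predict(fit_lmm, level = 0)

# Basic diagnostics: residuals vs. fitted values
res <- resid(fit_lmm, type = "pearson")
fit_vals <- fitted(fit_lmm)
```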

Topic 11: Generalized linear mixed-effects model (glmm)

1) Basic principles, modeling steps and workflow of the generalized linear mixed-effects model

2) Case 1: whether tadpoles undergo metamorphosis (0, 1), a multivariate analysis with a logistic mixed-effects model

3) Case 2: factors affecting the abundance of seeds eaten by insects, a multivariate analysis with a Poisson mixed-effects model

4) Generalized linear mixed-effects analysis of count data and model selection: Poisson, quasi-Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial, zero-truncated Poisson and zero-truncated negative binomial models

Topic 12: Regression with spatial, temporal and phylogenetic correlation - autocorrelation analysis

1) Introduction to autocorrelation in data: temporal, spatial and phylogenetic

2) Case 1: correcting for spatial autocorrelation in the distribution pattern of forest plant diversity

3) Case 2: correcting for temporal autocorrelation in bird abundance across years

4) Case 3: the role of phylogenetic correlation in the analysis of shrimp abundance distribution
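
For the temporal case, one common approach is generalized least squares with an AR(1) error structure, sketched below with gls() and corAR1() from the nlme package. Whether the course uses this particular approach is an assumption. The yearly abundance series is simulated and the variable names are placeholders; spatial and phylogenetic correlation structures are not shown.

```r
library(nlme)
set.seed(4)

# Simulated yearly abundance: linear trend plus AR(1) errors (rho = 0.7)
year <- 1:40
err <- as.numeric(arima.sim(list(ar = 0.7), n = 40))
abundance <- 50 + 0.5 * year + err
dat <- data.frame(year, abundance)

# Ordinary model that ignores the autocorrelation
fit_ols <- gls(abundance ~ year, data = dat)

# Inspect autocorrelation in its residuals (no plot drawn here)
res_acf <- acf(residuals(fit_ols), plot = FALSE)

# Same fixed part, but with first-order autoregressive errors
fit_ar1 <- gls(abundance ~ year, data = dat,
               correlation = corAR1(form = ~ year))

# Estimated AR(1) parameter and comparison of the two models
phi <- coef(fit_ar1$modelStruct$corStruct, unconstrained = FALSE)
cmp <- anova(fit_ols, fit_ar1)
```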

Topic 13: Structural equation modeling (SEM) with lavaan and piecewiseSEM - multivariate direct and indirect effects and causality

1) Introduction to structural equation modeling: definition, history, applications, estimation methods, model identification rules, sample size requirements, etc.

2) Case 1: direct and indirect effects in the recovery of community species richness: basic SEM workflow, lavaan vs. piecewiseSEM

3) Case 2: effects of environmental heterogeneity and resource availability on understory vascular plant diversity at different successional stages: model adjustment, comparison, evaluation and presentation of results

4) Case 3: relative contributions of human activities, environmental conditions and species traits to animal home range size: SEM implementation of mixed models, nested structure, multigroup analysis and categorical variables

Topic 14: Plotting community data and statistical results (ggplot), layout and publication-quality figure output

1) Preparing data for plotting: extracting results and arranging plotting data

2) Cluster analysis and group-difference test figures: clustering result plots, heat maps (heatmap), group-difference test plots

3) Unconstrained ordination plots (PCA, CA, PCoA, NMDS): ordination plots and biplots

4) Constrained ordination plots (RDA, dbRDA, CCA): triplots and Venn diagrams

5) Regression and mixed-effects model results: scatter plots, box plots, histograms, violin plots, etc.

6) Presenting structural equation model results as diagrams

Recommended reading:

R language regression and mixed effect (multilevel/hierarchical/nested) model and Bayesian implementation

Practical Application of R Language Structural Equation Modeling (SEM) in the Field of Ecology

Origin blog.csdn.net/weixin_58566962/article/details/130934039