Using the Factor Analytic Model in Multi-Trait Analysis and Its Advantages over the Standard Multi-Trait Model

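Factor analytic (FA) models
Note: the factor analysis discussed here is not the factor analysis of multivariate statistics (the method related to principal component analysis). It is the factor analytic variance structure used in genetic evaluation, for example in multi-environment trial analysis of crops and forest trees, and in multi-trait analysis of animals and plants when estimating genetic correlations.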

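Advantages
Because an FA model reduces the number of variance components to be estimated, it is especially useful when there are many traits. If a multi-trait model fails to converge, an FA model is worth trying. Most importantly, its estimates can be transformed back into the G matrix of the multi-trait model, which is very convenient. In my view it plays a role similar to the Legendre polynomials in a random regression model, but it is simpler to use.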

Introduction
FAk, FACVk and XFAk are different parameterizations of the factor analytic model in which S is modelled as S = LL' + P, where L is a matrix of loadings (one column per factor, k columns in total) on the covariance scale and P is a diagonal matrix of specific variances. See Smith et al. (2001) and Thompson et al. (2003) for examples of factor analytic models in multi-environment trials.

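FA model vs. multi-trait model
An FA model estimates a loading matrix (Γ) and a diagonal matrix of specific variances (Ψ); these are the L and P of the quoted passages. They are related to the G matrix by:
$$G = \Gamma\Gamma' + \Psi$$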

In FACV, S is modelled as S = LL' + P, where L is a matrix of loadings on the covariance scale and P is diagonal. The parameters in FACV are specified in the order: loadings (L) followed by specific variances (P). When k is greater than 1, constraints on the elements of L are supplied by ASReml. The FACV parameters are related to those in FA by L = DF and P = DED, which means the FA form is S = D(FF' + E)D.
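The relation L = DF, P = DED can be checked numerically. The sketch below is illustrative only. It assumes that D is the diagonal matrix of standard deviations (so that DD = diag(S)) and that F and E are the loadings and specific variances on the correlation scale. The numbers are made-up values for three traits and one factor, not estimates from any data set.

```python
import math

# Hypothetical covariance-scale parameters (w = 3 traits, k = 1 factor).
L = [0.9, 1.5, 2.0]   # loadings (single column)
P = [0.5, 1.0, 2.0]   # specific variances (diagonal of P)
w = len(L)

# S = LL' + P
S = [[L[i] * L[j] + (P[i] if i == j else 0.0) for j in range(w)]
     for i in range(w)]

# Assumption: D holds the standard deviations, D[i] = sqrt(S[i][i]).
D = [math.sqrt(S[i][i]) for i in range(w)]

# Invert L = DF and P = DED to get the correlation-scale parameters.
F = [L[i] / D[i] for i in range(w)]
E = [P[i] / D[i] ** 2 for i in range(w)]

# C = FF' + E should be a correlation matrix (unit diagonal),
# and D C D should give back S.
C = [[F[i] * F[j] + (E[i] if i == j else 0.0) for j in range(w)]
     for i in range(w)]

for row in C:
    print([round(x, 4) for x in row])
```

Under these assumptions the two parameterizations describe the same S; they differ only in the scale on which the loadings and specific variances are reported.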
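With many traits, the number of parameters to estimate drops considerably. Let w be the number of traits and k the number of factors. With 10 traits, a multi-trait model with a us (unstructured) matrix needs 10 variances and 45 covariances:
$$N_{us} = \frac{10 \times (10+1)}{2} = 55$$
With an FA1 model, each trait has one specific variance and one loading, so the number of parameters is:
$$N_{FA1} = w + wk = 10 + 10 \times 1 = 20$$
With an FA2 model, the number of parameters is:
$$N_{FA2} = w + wk = 10 + 10 \times 2 = 30$$
(29 once one loading is fixed at zero for identifiability). Both counts are well below that of the us matrix, so the model converges more easily. Most importantly, the FA estimates are easily converted to the G matrix, from which genetic and phenotypic correlations can be computed. This is the major advantage of the FA model over us.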

Limitations
The general limitations are that P may not include zeros except in the XFAk formulation, and that constraints are required in L for k > 1 for identifiability. Typically, one zero is placed in the second column, two zeros in the third column, etc. The total number of parameters fitted (kw + w - k(k-1)/2) may not exceed w(w+1)/2, where w is the dimension of S (here, the number of traits).
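The parameter counts can be checked with a few lines of code. This is a minimal sketch that simply evaluates the two formulas quoted above; the function names `us_params` and `fa_params` are hypothetical helpers, not part of ASReml.

```python
def us_params(w):
    """Parameters in an unstructured w x w covariance matrix: w(w+1)/2."""
    return w * (w + 1) // 2


def fa_params(w, k):
    """Free parameters in a k-factor model: kw + w - k(k-1)/2."""
    return k * w + w - k * (k - 1) // 2


w = 10
for k in (1, 2, 3):
    n_fa = fa_params(w, k)
    n_us = us_params(w)
    print(f"w={w}, k={k}: FA={n_fa}, us={n_us}, within limit: {n_fa <= n_us}")
```

By the same formulas, w = 2 and k = 1 gives 4 FA parameters against 3 for us, so the saving only appears as the number of traits grows.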

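Worked example:

This example uses the animalmodel.dat and animalmodel.ped data sets from the learnasreml package. learnasreml is a package I wrote for learning genetic evaluation software such as asreml and DMU; it contains example data and code.

If the package is not installed, it can be installed with the following commands: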

library(devtools)

install_github("dengfei2013/learnasreml")

Load the data:

library(learnasreml)
library(asreml)
data("animalmodel.dat")
data("animalmodel.ped")
head(animalmodel.dat)

Inspecting the data shows an animal breeding data set with two traits (BWT and TARSUS):

> head(animalmodel.dat)
  ANIMAL MOTHER BYEAR SEX   BWT TARSUS
1   1029   1145   968   1 10.77  24.77
2   1299    811   968   1  9.30  22.46
3    643    642   970   2  3.98  12.89
4   1183   1186   970   1  5.39  20.47
5   1238   1237   970   2 12.12   0.00
6    891    895   970   1  0.00   0.00

ASReml-W code for the us model:

!WORKSPACE 1 !RENAME !out !ARGS 1// !DOPART $1
Title: dat.
#`ANIMAL,MOTHER,BYEAR,SEX,BWT,TARSUS
#1029,1145,968,1,10.77,24.77
#1299,811,968,1,9.3,22.46
#643,642,970,2,3.98,12.89
#1183,1186,970,1,5.39,20.47
 ANIMAL  !P      # 1183
 MOTHER  !I      # 1186 
 BYEAR  !I      # 970 
 SEX  *       # 1 
 BWT  !M 0      # 5.39
 TARSUS !M 0       # 20.47
# Check/Correct these field definitions.
ped.csv !skip 1
dat.csv  !SKIP 1

!part 1
BWT TARSUS ~ Trait Trait.SEX ,         # Specify fixed model
      !r   us(Tr).ANIMAL        # Specify random model
residual units.us(Tr)

Results

 Model_Term                             Sigma         Sigma   Sigma/SE   % C
 units.us(Tr)                  1708 effects
 Tr                      US_V  1  1   2.99541       2.99541       7.16   0 P
 Tr                      US_C  2  1   3.59703       3.59703       4.30   0 P
 Tr                      US_V  2  2   17.7686       17.7686       6.67   0 P
 us(Tr).ANIMAL                 2618 effects
 Tr                      US_V  1  1   2.98698       2.98698       5.73   0 P
 Tr                      US_C  2  1   2.49405       2.49405       2.52   0 P
 Tr                      US_V  2  2   12.3960       12.3960       4.04   0 P
 ANIMAL                 NRM    1309
 Covariance/Variance/Correlation Matrix US Residual
   2.995      0.4930    
   3.597       17.77    
 Covariance/Variance/Correlation Matrix US us(Tr).ANIMAL
   2.987      0.4099    
   2.494       12.40  

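From the us(Tr).ANIMAL section of the output, the genetic (co)variances are:
G11 = 2.98698
G12 = 2.49405
G22 = 12.3960
The genetic correlation is:
r12 = 0.4099
(0.4930 is the residual correlation, from the units.us(Tr) matrix.)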
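As a cross-check, the correlations can be recomputed from the variance components printed in the output above. The sketch below is illustrative: it copies the us estimates by hand and assumes that, with only the ANIMAL and units terms in the model, the phenotypic (co)variance is the sum of the genetic and residual components.

```python
import math

# Genetic (us(Tr).ANIMAL) and residual (units.us(Tr)) estimates,
# copied from the ASReml output.
G11, G12, G22 = 2.98698, 2.49405, 12.3960
R11, R12, R22 = 2.99541, 3.59703, 17.7686


def corr(v1, cov, v2):
    """Correlation from two variances and their covariance."""
    return cov / math.sqrt(v1 * v2)


r_g = corr(G11, G12, G22)                    # genetic correlation
r_e = corr(R11, R12, R22)                    # residual correlation
# Assumption: phenotypic (co)variance = genetic + residual.
r_p = corr(G11 + R11, G12 + R12, G22 + R22)  # phenotypic correlation

print(f"genetic: {r_g:.4f}, residual: {r_e:.4f}, phenotypic: {r_p:.4f}")
```

The genetic and residual values reproduce the upper-triangle entries of the two correlation matrices that ASReml prints (0.4099 and 0.4930).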

ASReml-W code for the FA1 model (fitted here with the XFA1 parameterization):

!part 2
BWT TARSUS ~ Trait Trait.SEX ,         # Specify fixed model
      !r   xfa1(Tr).ANIMAL        # Specify random model
residual units.us(Tr)

Results:

 Model_Term                             Sigma         Sigma   Sigma/SE   % C
 units.us(Tr)                  1708 effects
 Tr                      US_V  1  1   2.99548       2.99548       7.23   0 P
 Tr                      US_C  2  1   3.59723       3.59723       4.41   0 P
 Tr                      US_V  2  2   17.7691       17.7691       6.73   0 P
 xfa1(Tr).ANIMAL               3927 effects
 Tr                     XFA_V  0  1   2.15127       2.15127       1.82   0 P
 Tr                     XFA_V  0  2   4.95302       4.95302       0.49   0 P
 Tr                     XFA_L  1  1  0.914111      0.914111       1.42   0 P
 Tr                     XFA_L  1  2   2.72806       2.72806       1.50   0 P
 ANIMAL                 NRM    1309
 Covariance/Variance/Correlation Matrix US Residual
   2.995      0.4931    
   3.597       17.77    
 Covariance/Variance/Correlation Matrix XFA xfa1(Tr).ANIMAL
   2.985      0.4093      0.5282    
   2.488       12.38      0.7748    
  0.9126       2.726       1.000    

ASReml-W automatically reports the result in G-matrix form. Here:
G11 = 2.985
G12 = 2.488
G22 = 12.38
Genetic correlation: 0.409
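The G matrix can also be rebuilt by hand from the reported loadings and specific variances using G = ΓΓ' + Ψ. The sketch below assumes my reading of the output labels: XFA_L 1 i is the loading of trait i on factor 1, and XFA_V 0 i is the specific variance of trait i.

```python
import math

# Values copied from the xfa1(Tr).ANIMAL section of the output.
loadings = [0.914111, 2.72806]   # XFA_L 1 1, XFA_L 1 2 (assumed: Gamma)
specific = [2.15127, 4.95302]    # XFA_V 0 1, XFA_V 0 2 (assumed: diag of Psi)
w = len(loadings)

# G = Gamma Gamma' + Psi, with a single factor.
G = [[loadings[i] * loadings[j] + (specific[i] if i == j else 0.0)
      for j in range(w)]
     for i in range(w)]

r_g = G[0][1] / math.sqrt(G[0][0] * G[1][1])

for row in G:
    print([round(x, 3) for x in row])
print(f"genetic correlation: {r_g:.3f}")
```

The rebuilt values agree closely with the matrix ASReml prints (2.985, 2.488, 12.38), with small differences in the later decimals.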

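The two models give essentially the same G matrix and genetic correlation.

With more traits the results are similar, but the FA model converges more easily.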


Reposted from blog.csdn.net/yijiaobani/article/details/84593282