Learning R - Basic Statistical Analysis (Part 2)

Frequency Tables and Contingency Tables

The data in this section come from the Arthritis dataset in the vcd package.

> library(vcd)
Loading required package: grid
> head(Arthritis)
  ID Treatment  Sex Age Improved
1 57   Treated Male  27     Some
2 46   Treated Male  29     None
3 77   Treated Male  30     None
4 17   Treated Male  32   Marked
5 36   Treated Male  46   Marked
6 23   Treated Male  58   Marked

One-Way Tables

The table() function generates a simple frequency table.

> mytable<-table(Arthritis$Improved)
> mytable

  None   Some Marked 
    42     14     28 

Use prop.table() to convert these frequencies to proportions.

> prop.table(mytable)

     None      Some    Marked 
0.5000000 0.1666667 0.3333333 

Or multiply the result of prop.table() by 100 to get percentages.

> prop.table(mytable)*100

    None     Some   Marked 
50.00000 16.66667 33.33333 
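
One detail worth knowing when counting a single variable is that table() drops missing values by default. The sketch below uses a small hypothetical vector (not taken from Arthritis, which has no missing values in Improved) to show the `useNA` argument and how rounding tidies up the percentages.

```r
# Hypothetical response vector with one missing value (illustration only)
improved <- factor(c("None", "Some", NA, "Marked", "None"),
                   levels = c("None", "Some", "Marked"))

# Default behaviour: the NA is silently left out of the counts
default_counts <- table(improved)

# useNA = "ifany" adds a separate cell for NA when any are present
full_counts <- table(improved, useNA = "ifany")

# Percentages rounded to one decimal place
pct <- round(prop.table(full_counts) * 100, 1)
print(default_counts)
print(full_counts)
print(pct)
```

Whether NA should count toward the denominator depends on the analysis, so it helps to make the choice explicit rather than rely on the default.
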

Two-Way Contingency Tables

For a two-way contingency table, table() is called as follows:
table(a,b), where a is the row variable and b is the column variable.

> table(Arthritis$Treatment,Arthritis$Improved)
         
          None Some Marked
  Placebo   29    7      7
  Treated   13    7     21

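You can also use the xtabs() function, which is called as follows:
xtabs(~A+B,data=mydata)
where mydata is a matrix or data frame, and the variables to be cross-classified appear on the right-hand side of ~.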

> mytable<-xtabs(~Treatment+Improved,data = Arthritis)
> mytable
         Improved
Treatment None Some Marked
  Placebo   29    7      7
  Treated   13    7     21
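
xtabs() also accepts a count variable on the left-hand side of ~, which is handy when the data are already aggregated. The following is a minimal sketch that rebuilds the table above from its cell counts using base R only; the object names (`tab`, `df`, `rebuilt`) are illustrative assumptions, not part of the original post.

```r
# Cell counts copied from the Treatment x Improved table shown above
tab <- as.table(matrix(
  c(29, 13, 7, 7, 7, 21), nrow = 2,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
))

# Long format: one row per cell, with the count in a column named Freq
df <- as.data.frame(tab)

# A count variable on the left of ~ is summed within each cell
rebuilt <- xtabs(Freq ~ Treatment + Improved, data = df)
print(rebuilt)
```

The round trip through as.data.frame() is a convenient way to move between the wide table form and a long form suitable for modelling functions.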

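In addition, margin.table() and prop.table() generate marginal frequencies and proportions, respectively. Here mytable is the two-way Treatment by Improved table.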

> margin.table(mytable,1) # 1 refers to the first variable (rows)
Treatment
Placebo Treated 
     43      41 
> margin.table(mytable,2) # 2 refers to the second variable (columns)
Improved
  None   Some Marked 
    42     14     28 
> prop.table(mytable) # cell proportions (each cell / grand total)
         Improved
Treatment       None       Some     Marked
  Placebo 0.34523810 0.08333333 0.08333333
  Treated 0.15476190 0.08333333 0.25000000
> prop.table(mytable,1) # row proportions (each row sums to 1)
         Improved
Treatment      None      Some    Marked
  Placebo 0.6744186 0.1627907 0.1627907
  Treated 0.3170732 0.1707317 0.5121951
> prop.table(mytable,2) # column proportions (each column sums to 1)
         Improved
Treatment      None      Some    Marked
  Placebo 0.6904762 0.5000000 0.2500000
  Treated 0.3095238 0.5000000 0.7500000
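
To make the meaning of the margin argument concrete, here is a small sketch that reproduces row and column proportions by hand with rowSums(), colSums() and sweep(). It rebuilds the table from the counts shown above so that it runs without the vcd package; the variable names are illustrative assumptions.

```r
# Treatment x Improved counts copied from the output above
tab <- as.table(matrix(
  c(29, 13, 7, 7, 7, 21), nrow = 2,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
))

# Row proportions: divide every row by its row total
row_prop <- sweep(tab, 1, rowSums(tab), "/")

# Column proportions: divide every column by its column total
col_prop <- sweep(tab, 2, colSums(tab), "/")

print(round(row_prop, 3))
print(round(col_prop, 3))

# Both agree with prop.table() up to floating-point error
same_rows <- isTRUE(all.equal(as.vector(row_prop), as.vector(prop.table(tab, 1))))
same_cols <- isTRUE(all.equal(as.vector(col_prop), as.vector(prop.table(tab, 2))))
```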

The addmargins() function adds marginal sums to a table.

> addmargins(mytable)
         Improved
Treatment None Some Marked Sum
  Placebo   29    7      7  43
  Treated   13    7     21  41
  Sum       42   14     28  84    
> addmargins(prop.table(mytable))
         Improved
Treatment       None       Some     Marked        Sum
  Placebo 0.34523810 0.08333333 0.08333333 0.51190476
  Treated 0.15476190 0.08333333 0.25000000 0.48809524
  Sum     0.50000000 0.16666667 0.33333333 1.00000000
> addmargins(prop.table(mytable,1),2)
         Improved
Treatment      None      Some    Marked       Sum
  Placebo 0.6744186 0.1627907 0.1627907 1.0000000
  Treated 0.3170732 0.1707317 0.5121951 1.0000000
> addmargins(prop.table(mytable,2),1)
         Improved
Treatment      None      Some    Marked
  Placebo 0.6904762 0.5000000 0.2500000
  Treated 0.3095238 0.5000000 0.7500000
  Sum     1.0000000 1.0000000 1.0000000

A third way to create a two-way contingency table is the CrossTable() function in the gmodels package.

> library(gmodels)
> library(vcd)
Loading required package: grid
> CrossTable(Arthritis$Treatment,Arthritis$Improved)

 
   Cell Contents
|-------------------------|
|                       N |
| Chi-square contribution |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|

 
Total Observations in Table:  84 

 
                    | Arthritis$Improved 
Arthritis$Treatment |      None |      Some |    Marked | Row Total | 
--------------------|-----------|-----------|-----------|-----------|
            Placebo |        29 |         7 |         7 |        43 | 
                    |     2.616 |     0.004 |     3.752 |           | 
                    |     0.674 |     0.163 |     0.163 |     0.512 | 
                    |     0.690 |     0.500 |     0.250 |           | 
                    |     0.345 |     0.083 |     0.083 |           | 
--------------------|-----------|-----------|-----------|-----------|
            Treated |        13 |         7 |        21 |        41 | 
                    |     2.744 |     0.004 |     3.935 |           | 
                    |     0.317 |     0.171 |     0.512 |     0.488 | 
                    |     0.310 |     0.500 |     0.750 |           | 
                    |     0.155 |     0.083 |     0.250 |           | 
--------------------|-----------|-----------|-----------|-----------|
       Column Total |        42 |        14 |        28 |        84 | 
                    |     0.500 |     0.167 |     0.333 |           | 
--------------------|-----------|-----------|-----------|-----------|
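
The "Chi-square contribution" row in each cell of the CrossTable() output can be reproduced in base R. The sketch below assumes the usual Pearson definition, (observed - expected)^2 / expected, with expected counts computed from the row and column totals; it rebuilds the table from the counts above, and the variable names are illustrative.

```r
# Observed counts copied from the CrossTable() output above
tab <- as.table(matrix(
  c(29, 13, 7, 7, 7, 21), nrow = 2,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
))

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Per-cell contribution to the Pearson chi-square statistic
contrib <- (tab - expected)^2 / expected
print(round(contrib, 3))

# The contributions add up to the statistic reported by chisq.test()
total <- sum(contrib)
pearson <- unname(chisq.test(tab)$statistic)
```

Cells with large contributions (here None and Marked) are the ones that depart most from what independence of Treatment and Improved would predict.
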

Multi-Way Contingency Tables

Multi-way contingency tables are created in much the same way as two-way tables. The code is as follows:

# The first variable is the row variable, the second is the column variable, and the third is the grouping (layer) variable
> mytable<-xtabs(~Treatment+Improved+Sex,data = Arthritis)
> mytable
, , Sex = Female

         Improved
Treatment None Some Marked
  Placebo   19    7      6
  Treated    6    5     16

, , Sex = Male

         Improved
Treatment None Some Marked
  Placebo   10    0      1
  Treated    7    2      5
> ftable(mytable)
                   Sex Female Male
Treatment Improved                
Placebo   None             19   10
          Some              7    0
          Marked            6    1
Treated   None              6    7
          Some              5    2
          Marked           16    5
# Marginal frequencies for each variable
> margin.table(mytable,1)
Treatment
Placebo Treated 
     43      41 
> margin.table(mytable,2)
Improved
  None   Some Marked 
    42     14     28 
> margin.table(mytable,3)
Sex
Female   Male 
    59     25 
# Marginal frequencies for combinations of two variables
> margin.table(mytable,c(1,3))
         Sex
Treatment Female Male
  Placebo     32   11
  Treated     27   14
> margin.table(mytable,c(1,2))
         Improved
Treatment None Some Marked
  Placebo   29    7      7
  Treated   13    7     21
> ftable(prop.table(mytable,c(1,2))) # proportions of Sex within each Treatment x Improved combination
                   Sex    Female      Male
Treatment Improved                        
Placebo   None         0.6551724 0.3448276
          Some         1.0000000 0.0000000
          Marked       0.8571429 0.1428571
Treated   None         0.4615385 0.5384615
          Some         0.7142857 0.2857143
          Marked       0.7619048 0.2380952
> ftable(addmargins(prop.table(mytable,c(1,2)),3))
                   Sex    Female      Male       Sum
Treatment Improved                                  
Placebo   None         0.6551724 0.3448276 1.0000000
          Some         1.0000000 0.0000000 1.0000000
          Marked       0.8571429 0.1428571 1.0000000
Treated   None         0.4615385 0.5384615 1.0000000
          Some         0.7142857 0.2857143 1.0000000
          Marked       0.7619048 0.2380952 1.0000000 
> ftable(addmargins(prop.table(mytable,c(1,2)),3))*100
                   Sex    Female      Male       Sum
Treatment Improved                                  
Placebo   None          65.51724  34.48276 100.00000
          Some         100.00000   0.00000 100.00000
          Marked        85.71429  14.28571 100.00000
Treated   None          46.15385  53.84615 100.00000
          Some          71.42857  28.57143 100.00000
          Marked        76.19048  23.80952 100.00000
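
For a three-way table, margin.table() with a vector of dimensions simply sums over the dimensions that are left out. The sketch below illustrates this with apply(), rebuilding the Treatment x Improved x Sex array from the counts printed above so that it runs without vcd; the object names are illustrative assumptions.

```r
# Counts copied from the three-way table above; the first index (Treatment)
# varies fastest, then Improved, then Sex
tab3 <- as.table(array(
  c(19, 6, 7, 5, 6, 16,    # Sex = Female
    10, 7, 0, 2, 1, 5),    # Sex = Male
  dim = c(2, 3, 2),
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"),
                  Sex       = c("Female", "Male"))
))

# Treatment x Sex margin: sum over Improved (dimension 2)
by_hand <- apply(tab3, c(1, 3), sum)
builtin <- margin.table(tab3, c(1, 3))
print(by_hand)

# Proportions within each Treatment x Improved cell sum to 1 across Sex
within_cell <- prop.table(tab3, c(1, 2))
cell_totals <- apply(within_cell, c(1, 2), sum)
print(ftable(round(within_cell, 3)))
```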

Reposted from blog.csdn.net/weixin_34240657/article/details/87789713