Principles of Information Science, Chapter 1 (Shannon Entropy, Conditional Entropy, Relative Entropy)

@(Principles of Information Science)

Introduction

Shannon Entropy

Self-information: $h(x) = -\log p(x)$

$$H(X) = -\sum_{x \in \mathcal{X}} P(x)\log P(x) = -\mathbb{E}_{x \sim P}\left[\log P(x)\right]$$

where $0 \log 0 = 0$ by convention; when the logarithm is taken base $e$ the unit is nats, and when it is taken base $2$ the unit is bits.
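As a quick check of the definition and the unit convention, here is a minimal Python sketch (the `entropy` helper is an illustration of mine, not from the text):

```python
import math

def entropy(probs, base=2):
    """H(X) = -sum_x p(x) log p(x), with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

p = [1/2, 1/4, 1/8, 1/8]
print(entropy(p, base=2))       # ≈ 1.75 bits
print(entropy(p, base=math.e))  # ≈ 1.213 nats (= 1.75 * ln 2)
```

The same distribution yields different numbers in bits and nats, differing exactly by a factor of $\ln 2$.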

Joint Entropy

$$H(X, Y) = -\sum_{x \in \mathcal{X},\, y \in \mathcal{Y}} P(x, y)\log P(x, y) = -\mathbb{E}_{(x, y) \sim P}\left[\log P(x, y)\right]$$

Mutual Information

$$I(X; Y) = \sum_{x \in \mathcal{X},\, y \in \mathcal{Y}} P(x, y)\log\frac{P(x, y)}{P(x)P(y)} = \mathbb{E}_{(x, y) \sim P}\left[\log\frac{P(x, y)}{P(x)P(y)}\right] = D_{KL}\big(P(x, y) \,\|\, P(x)P(y)\big)$$


Mutual information measures how strongly two random variables depend on each other.
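A small sketch of this dependence measure (the `mutual_information` helper is mine, assuming a joint distribution given as a nested list):

```python
import math

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} P(x,y) log2( P(x,y) / (P(x) P(y)) ), in bits."""
    px = [sum(row) for row in joint]            # marginal P(x)
    py = [sum(col) for col in zip(*joint)]      # marginal P(y)
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

indep = [[0.25, 0.25], [0.25, 0.25]]  # X, Y independent
corr = [[0.5, 0.0], [0.0, 0.5]]       # X determines Y
print(mutual_information(indep))  # 0.0 bits
print(mutual_information(corr))   # 1.0 bit
```

Independence gives zero mutual information; perfect dependence between two fair bits gives one full bit.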

Conditional Entropy

$$H(Y \mid X) = -\sum_{x, y} P(x, y)\log P(y \mid x) = -\sum_{x, y} P(x, y)\log\frac{P(x, y)}{P(x)} = \sum_{x, y} P(x, y)\log\frac{P(x)}{P(x, y)} = \mathbb{E}_{(x, y) \sim P}\left[\log\frac{P(x)}{P(x, y)}\right]$$

The more information we have, the less uncertainty remains in the random event.

Proof that $H(X, Y) = H(X) + H(Y \mid X)$:

$$\begin{aligned} H(X, Y) &= -\sum_{x, y} P(x, y)\log P(x, y) \\ &= -\sum_{x, y} P(x, y)\log\left[P(y \mid x)P(x)\right] \\ &= -\sum_{x, y} P(x, y)\left[\log P(y \mid x) + \log P(x)\right] \\ &= -\sum_{x, y} P(x, y)\log P(y \mid x) + \left[-\sum_{x} P(x)\log P(x)\right] \\ &= H(Y \mid X) + H(X) \end{aligned}$$

Proof that $H(X, Y \mid Z) = H(X \mid Z) + H(Y \mid X, Z)$:

$$\begin{aligned} H(X, Y \mid Z) &= -\sum_{x, y, z} P(x, y, z)\log P(x, y \mid z) \\ &= -\sum_{x, y, z} P(x, y, z)\log\frac{P(x, y, z)}{P(z)} \\ &= -\sum_{x, y, z} P(x, y, z)\log\left[\frac{P(x, y, z)}{P(x, z)} \cdot \frac{P(x, z)}{P(z)}\right] \\ &= \left[-\sum_{x, y, z} P(x, y, z)\log\frac{P(x, y, z)}{P(x, z)}\right] + \left[-\sum_{x, y, z} P(x, y, z)\log\frac{P(x, z)}{P(z)}\right] \\ &= \left[-\sum_{x, y, z} P(x, y, z)\log\frac{P(x, y, z)}{P(x, z)}\right] + \left[-\sum_{x, z} P(x, z)\log\frac{P(x, z)}{P(z)}\right] \\ &= H(Y \mid X, Z) + H(X \mid Z) \end{aligned}$$
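The chain rule can also be checked numerically. This sketch (helper names are mine) builds a random 3×4 joint distribution and compares H(X, Y) against H(X) + H(Y|X):

```python
import math
import random

def H(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

random.seed(0)
# Build a random 3x4 joint distribution P(x, y).
joint = [[random.random() for _ in range(4)] for _ in range(3)]
total = sum(v for row in joint for v in row)
joint = [[v / total for v in row] for row in joint]

px = [sum(row) for row in joint]
H_XY = H([v for row in joint for v in row])
H_X = H(px)
# H(Y|X) = -sum_{x,y} P(x,y) log2 P(y|x), with P(y|x) = P(x,y) / P(x)
H_Y_given_X = -sum(
    pxy * math.log2(pxy / px[i])
    for i, row in enumerate(joint)
    for pxy in row
)
print(abs(H_XY - (H_X + H_Y_given_X)))  # ~0, up to machine precision
```

The identity holds exactly for any joint distribution; floating point introduces only rounding-level error.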

Relative Entropy (KL Divergence)

$$D_{KL}(P \,\|\, Q) = \sum_{x \in \mathcal{X}} P(x)\log\frac{P(x)}{Q(x)} = \mathbb{E}_{x \sim P}\left[\log\frac{P(x)}{Q(x)}\right] = \mathbb{E}_{x \sim P}\left[\log P(x) - \log Q(x)\right]$$

Note: $D_{KL}(P \,\|\, Q) \ge 0$; it is used to measure how close two distributions are.
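A minimal sketch of the definition (the `kl_divergence` helper is mine; it assumes $Q(x) > 0$ wherever $P(x) > 0$):

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [1/2, 1/4, 1/4]
q = [1/3, 1/3, 1/3]
print(kl_divergence(p, q))  # positive: P differs from the uniform Q
print(kl_divergence(p, p))  # 0.0: zero divergence from itself
```

Note that the divergence is not symmetric ($D_{KL}(P \,\|\, Q) \ne D_{KL}(Q \,\|\, P)$ in general), so it is not a true distance metric.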

Cross Entropy

$$H(P, Q) = H(P) + D_{KL}(P \,\|\, Q), \qquad H(P, Q) = -\mathbb{E}_{x \sim P}\left[\log Q(x)\right]$$
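The identity H(P, Q) = H(P) + D_KL(P || Q) can be verified directly; this self-contained sketch (helper names are mine) checks it on a small example:

```python
import math

def H(p):
    """Entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def cross_entropy(p, q):
    """H(P, Q) = -E_{x~P}[ log2 Q(x) ]."""
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)

def kl(p, q):
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

p = [1/2, 1/4, 1/4]
q = [1/4, 1/4, 1/2]
print(abs(cross_entropy(p, q) - (H(p) + kl(p, q))))  # ~0, up to rounding
```

This is why minimizing cross entropy against a fixed target distribution P is equivalent to minimizing the KL divergence: the H(P) term is constant.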

Marginal, Conditional, and Joint Probability

(figure: joint probability table, omitted)
- Marginal probability sums the joint distribution along one margin of the table: $P(X = x) = \sum_{y} P(x, y)$
- Joint probability computes $P(X = x, Y = y) = P(y \mid x)P(x)$
- Conditional probability computes $P(y \mid x) = \frac{P(x, y)}{P(x)}$
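The three quantities above can be read off a joint table directly; this sketch uses a hypothetical 2×2 joint distribution of my own choosing:

```python
# A hypothetical 2x2 joint table P(x, y): rows index x, columns index y.
joint = [[0.3, 0.2],
         [0.1, 0.4]]

# Marginals: sum the joint over the other variable.
px = [sum(row) for row in joint]        # P(X = x)
py = [sum(col) for col in zip(*joint)]  # P(Y = y)

# Conditional: P(y | x) = P(x, y) / P(x), here for x = 0.
p_y_given_x0 = [pxy / px[0] for pxy in joint[0]]

print(px, py, p_y_given_x0)
```

Each marginal and each conditional distribution sums to 1, while the joint table sums to 1 over all cells.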

For discrete random variables: (figure omitted)

For continuous random variables: (figure omitted)

Example (the joint distribution $P(x, y)$ used below was given as a table image, omitted here):

$$H(X) = -\sum_{x \in \mathcal{X}} P(x)\log_2 P(x) = \frac{1}{2}\log_2 2 + \frac{1}{4}\log_2 4 + \frac{1}{8}\log_2 8 + \frac{1}{8}\log_2 8 = \frac{7}{4}\ \mathrm{bits}$$

$$H(X \mid Y) = -\sum_{x, y} P(x, y)\log_2\frac{P(x, y)}{P(y)} = \sum_{x, y} P(x, y)\log_2\frac{P(y)}{P(x, y)} = \frac{4}{32}\log_2\frac{1/4}{4/32} + \frac{2}{32}\log_2\frac{1/4}{2/32} + \frac{2}{32}\log_2\frac{1/4}{2/32} + \cdots = \frac{11}{8}\ \mathrm{bits}$$

$$H(X, Y) = -\sum_{x, y} P(x, y)\log_2 P(x, y) = \frac{27}{8}\ \mathrm{bits}$$
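The marginal of X given in the example, (1/2, 1/4, 1/8, 1/8), is enough to verify H(X) = 7/4 bits directly; and the quoted values imply H(Y) = 27/8 − 11/8 = 2 bits via the chain rule H(X, Y) = H(Y) + H(X | Y). A quick check (the `H_bits` helper is mine):

```python
import math
from fractions import Fraction

def H_bits(probs):
    """Entropy in bits, with 0 log 0 = 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Marginal of X from the example: (1/2, 1/4, 1/8, 1/8)
px = [1/2, 1/4, 1/8, 1/8]
print(H_bits(px))  # 1.75 = 7/4 bits

# Chain-rule consistency of the quoted values: H(Y) = H(X,Y) - H(X|Y)
print(Fraction(27, 8) - Fraction(11, 8))  # 2
```

An H(Y) of exactly 2 bits is what a uniform distribution over four outcomes would give, consistent with the P(y) = 1/4 terms appearing in the H(X | Y) computation above.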


Reprinted from blog.csdn.net/c654528593/article/details/81410284