[Pointwise Mutual Information (PMI)] in NLP - Measuring the correlation between two variables

Introduction

In natural language processing, I often want to explore whether two words are related. For example, some words are more likely to appear together, and when they do, the combination may carry information of its own.

For example, if the words New and York appear together in a news report, they can represent the place name New York, so when the word New appears, York is likely to appear as well. Pointwise Mutual Information (PMI) can be used to calculate how strongly New and York are associated when they appear together.

1. The basic concept of PMI

Pointwise Mutual Information (PMI): in data mining and information retrieval, PMI is often used as an index to measure the correlation between two things, such as two words.
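
In its standard form, PMI compares how often two events occur together with how often they would co-occur if they were independent: PMI(x, y) = log( p(x, y) / (p(x) * p(y)) ). A positive value means x and y appear together more often than chance would predict, zero means they look independent, and a negative value means they co-occur less often than chance. The sketch below is only an illustration under stated assumptions: the toy corpus `docs` is made up, the function name `pmi` is hypothetical, and probabilities are estimated as the fraction of documents that contain the word(s), which is just one of several possible ways to count co-occurrence.

```python
import math

# Toy corpus, invented for illustration: each inner list is one short document.
docs = [
    ["new", "york", "is", "big"],
    ["new", "york", "times"],
    ["the", "city", "is", "big"],
    ["a", "new", "idea"],
    ["the", "times", "change"],
    ["the", "idea", "is", "old"],
]


def pmi(x, y, docs, base=2):
    """Return PMI(x, y) = log(p(x, y) / (p(x) * p(y))).

    Assumption: each probability is the fraction of documents
    that contain the word (or both words, for the joint term).
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    p_x = sum(x in s for s in doc_sets) / n
    p_y = sum(y in s for s in doc_sets) / n
    p_xy = sum(x in s and y in s for s in doc_sets) / n
    if p_xy == 0:
        # The words never co-occur, so the log ratio tends to minus infinity.
        return float("-inf")
    return math.log(p_xy / (p_x * p_y), base)


print(pmi("new", "york", docs))  # positive: the pair co-occurs more than chance
print(pmi("new", "is", docs))    # negative: the pair co-occurs less than chance
print(pmi("new", "the", docs))   # -inf: the pair never co-occurs in this corpus
```

In this toy corpus, "new" appears in 3 of 6 documents, "york" in 2 of 6, and both together in 2 of 6, so the ratio is (2/6) / ((3/6) * (2/6)) = 2 and the base-2 PMI is 1. On real text the counts would more likely come from a sliding context window, and rare words tend to receive inflated PMI scores, so some smoothing or a minimum-count threshold is usually worth considering.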


Origin blog.csdn.net/weixin_42782150/article/details/127068069