Getting started with Python's jieba library

  For English text, extracting the individual words only requires the string's split() method, because the words are already separated by spaces — for example, "China is a great country".
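A minimal illustration of the point above, using the sentence from the text:

```python
# English words are delimited by spaces, so str.split() is all we need
text = "China is a great country"
words = text.split()
print(words)  # ['China', 'is', 'a', 'great', 'country']
```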

 For Chinese text, however, there is no separator between words, which gives Chinese (and similar languages) a unique "word segmentation problem."
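This is easy to see by running split() on a Chinese sentence (the sentence here is the example used later in this post):

```python
# Chinese text carries no spaces between words, so split() finds no
# boundaries and returns the whole sentence as a single token.
text = "中华人民共和国是伟大的国家"  # "The People's Republic of China is a great country"
print(text.split())  # ['中华人民共和国是伟大的国家'] -- one token, not a word list
```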

  jieba ("stutter") is an important third-party Chinese word-segmentation library for Python. Because it is a third-party library rather than part of the standard Python installation, it must be installed with the pip command.

To install on Windows: with a network connection, enter pip install jieba at the command line; a success message is printed when the installation completes.

  • The three segmentation modes of jieba

           Precise mode, full mode, and search-engine mode

           -  Precise mode: cuts the text into an exact segmentation, with no redundant words
           -  Full mode: scans out all the possible words in the text, with redundancy

           -  Search-engine mode: on the basis of precise mode, re-segments the long words

  •  Commonly used jieba functions

           -  jieba.lcut(s): precise mode, returns a list of words with no redundancy
           -  jieba.lcut(s, cut_all=True): full mode, returns all possible words, with redundancy
           -  jieba.lcut_for_search(s): search-engine mode, re-cuts the long words on top of precise mode
           -  jieba.add_word(w): adds the new word w to jieba's dictionary

  •  Examples (the string means "The People's Republic of China is a great country")

 jieba.lcut("中华人民共和国是伟大的国家")

jieba.lcut("中华人民共和国是伟大的国家", cut_all=True)

jieba.lcut_for_search("中华人民共和国是伟大的国家")

 Operation result: precise mode returns ['中华人民共和国', '是', '伟大', '的', '国家']; full mode additionally lists every word found inside the sentence, such as 中华, 人民, and 共和国, so the result contains redundancy; search-engine mode returns the precise-mode words plus the shorter words re-cut from 中华人民共和国. (The exact output may vary slightly with jieba's dictionary version.)



Origin www.cnblogs.com/DrcProgrammingCool/p/11700116.html