pyhanlp installation tutorial

1, HanLP overview

HanLP is a Java toolkit composed of a series of models and algorithms. It aims to popularize the use of natural language processing in production environments. HanLP is feature-complete, efficient, clearly structured, trained on up-to-date corpora, and customizable.

HanLP provides the following features:

  • Chinese word segmentation
  • Part-of-speech tagging
  • Named entity recognition
  • Dependency parsing
  • Keyword extraction and new word discovery
  • Phrase extraction
  • Automatic summarization
  • Text classification
  • Simplified/Traditional Chinese conversion and pinyin conversion

2, HanLP installation

Step one: HanLP provides a Python library module, pyhanlp. Open a command prompt (press Win + R, then enter cmd) and run the following command to install it:

pip install pyhanlp

Step two: the pyhanlp library depends on a data package, so to use it successfully you also need to download that package: data-for-1.7.7.zip (the latest version at the time of writing).

Data download: https://github.com/hankcs/HanLP/releases

Once downloaded, put the file into pyhanlp's static directory. Its location depends on where Python is installed; mine, for example, is E:\tool\python\Lib\site-packages\pyhanlp\static. If you cannot find the path, enter the installation command again in the command prompt; pip then reports where the already-installed package is located. Note: do not decompress the downloaded data package, just put the zip file directly into the directory. Then enter the following code:

from pyhanlp import *

Run it. The data package is extracted automatically, and once that succeeds you can start testing.
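If the static directory is hard to find by hand, it can also be located from Python. The following is a minimal sketch, not part of pyhanlp: the helper names `find_static_dir` and `copy_data_zip` are hypothetical, and the only assumption taken from the text above is that the data zip belongs in a `static` folder inside the installed package.

```python
import importlib.util
import shutil
from pathlib import Path
from typing import Optional


def find_static_dir(package: str = "pyhanlp") -> Optional[Path]:
    """Return <package dir>/static, or None if the package is not installed."""
    # find_spec locates a top-level package without importing it.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0]) / "static"


def copy_data_zip(zip_path: str, static_dir: Path) -> Path:
    """Copy the still-zipped data package into static_dir; return the new path."""
    static_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(zip_path, static_dir))


# Prints None when pyhanlp is not installed in the current interpreter.
print(find_static_dir())
```

For example, `copy_data_zip("data-for-1.7.7.zip", find_static_dir())` would copy the download into place, assuming the zip sits in the current working directory and pyhanlp is installed.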

3, HanLP function test

Enter the following simple test code to try out HanLP's word segmentation:

from pyhanlp import *

# Sentence to segment
sentence = "I like being a writer, to write the kind of book to your favorite writer, writing allows the writer of the book many readers seem unable to stop, write the kind of book writer hearty"

# Segment the sentence; each term is printed as word/part-of-speech tag
terms = HanLP.segment(sentence)
print(terms)

The result is:

Output: [I/rr, like/vi, when/p, a/q, writer/nnt, ,/w, sort/r, write/v, own/rr, watch/v, a/ude1, book/n, the/ude1, writer/nnt, ,/w, write/v, can/v, let/v, lot/m, the reader/n, it seems/v, unable to stop/vl, the/ude1, book/n, the/ude1, writer/nnt, ,/w, write/v, that/r, hearty/al, the/ude1, books/n, the/ude1, writer/nnt]
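Each item in the output is a word followed by its part-of-speech tag. As a rough illustration of that format, the sketch below splits a printed result of the form `word/tag, word/tag` into pairs. The function `parse_terms`, the sample string and the small `TAG_GLOSS` table are all illustrative assumptions; the glosses reflect my reading of HanLP's tag set and should be checked against the HanLP documentation.

```python
from typing import List, Tuple

# Glosses for a few tags, as I understand HanLP's tag set (assumption).
TAG_GLOSS = {"rr": "personal pronoun", "n": "noun", "v": "verb", "w": "punctuation"}


def parse_terms(printed: str) -> List[Tuple[str, str]]:
    """Split a printed '[word/tag, word/tag]' string into (word, tag) pairs."""
    body = printed.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    pairs = []
    for item in body.split(", "):
        # rpartition splits on the last slash, so the tag is always the final part.
        word, sep, tag = item.rpartition("/")
        if sep:
            pairs.append((word, tag))
    return pairs


# Hypothetical sample in the same word/tag format as the output above.
sample = "[I/rr, like/vi, writer/nnt, ,/w, book/n]"
for word, tag in parse_terms(sample):
    print(word, tag, TAG_GLOSS.get(tag, "?"))
```

When working with pyhanlp directly, this parsing should not be necessary: as far as I know, each term returned by `HanLP.segment` exposes `word` and `nature` attributes.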

4, HanLP reference documentation

pyhanlp reference documentation: https://github.com/hankcs/pyhanlp

HanLP reference documentation: https://github.com/hankcs/HanLP/blob/master/README.md

5, Notes

pyhanlp and HanLP both provide HanLP's word segmentation and part-of-speech tagging tools. HanLP is a Java toolkit, and pyhanlp is the Python toolkit built on top of it. If you write your Python code in PyCharm, installing pyhanlp is enough.
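Because HanLP itself is a Java toolkit, pyhanlp needs a Java runtime on the machine to call into it. A quick way to check for one from Python could look like the sketch below. The helper `java_available` is hypothetical, and looking on PATH is only a rough heuristic, since a Java installation may also be configured through an environment variable such as JAVA_HOME.

```python
import shutil


def java_available(executable: str = "java") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


print("java found on PATH:", java_available())
```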

Origin www.cnblogs.com/maxxu11/p/12594387.html