What is the dictionary of jieba - Code World

Language 2023-09-30 05:59:36

jieba uses a built-in dictionary for word segmentation and part-of-speech tagging. The dictionary is stored as a plain text file in which each line holds one entry together with related information such as its word frequency and part of speech.

Structure

Each line of a typical jieba dictionary has three space-separated fields, in this fixed order: word (词语), frequency (词频), and part of speech (词性):

词语 词频 词性

For example:

清华大学 2333 nt
自然语言处理 1012 n
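Lines like the two above can be split into their fields with a few lines of Python. The following is a minimal sketch, not jieba's own loader: the names DictEntry and parse_dict_line are hypothetical, and the sketch assumes that fields are separated by whitespace and that the word itself contains no spaces.

```python
from typing import NamedTuple, Optional


class DictEntry(NamedTuple):
    word: str
    freq: Optional[int]  # None when the line has no frequency field
    tag: Optional[str]   # None when the line has no part-of-speech field


def parse_dict_line(line: str) -> DictEntry:
    """Split one 'word freq tag' line into its fields (hypothetical helper)."""
    parts = line.split()
    if not parts:
        raise ValueError("empty dictionary line")
    word, rest = parts[0], parts[1:]
    # The frequency, when present, is the first purely numeric field after the word.
    freq = int(rest.pop(0)) if rest and rest[0].isdigit() else None
    # Whatever remains is treated as the part-of-speech tag.
    tag = rest[0] if rest else None
    return DictEntry(word, freq, tag)


entries = [parse_dict_line(line) for line in ["清华大学 2333 nt", "自然语言处理 1012 n"]]
for entry in entries:
    print(entry.word, entry.freq, entry.tag)
```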

词语 (word) is the entry to be recognized.
词频 (frequency) is a number indicating how often the word occurs in the corpus. When a sentence can be split in more than one way, jieba uses these frequencies to pick the most probable segmentation.
词性 (part of speech) is a tag, such as n for noun or v for verb, that identifies the word's part of speech. This field is used during part-of-speech tagging.
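To illustrate why the frequency field matters, here is a toy segmenter that picks the split with the highest total log-probability, where each word's probability is its frequency divided by the sum of all frequencies. It is a simplified sketch under stated assumptions, not jieba's implementation: the function name segment, the frequencies given for 清华 and 大学, and the rule that an unknown single character counts as frequency 1 are all made up for the example.

```python
import math


def segment(sentence, freqs):
    """Return the split of `sentence` with the highest total log-probability."""
    total = sum(freqs.values())
    n = len(sentence)
    # best[i] = (score of the best split of sentence[i:], end index of its first word)
    best = {n: (0.0, n)}
    for i in range(n - 1, -1, -1):
        candidates = []
        for j in range(i + 1, n + 1):
            piece = sentence[i:j]
            # Single characters are always allowed; longer pieces must be in the dictionary.
            if j == i + 1 or piece in freqs:
                score = math.log(freqs.get(piece, 1) / total) + best[j][0]
                candidates.append((score, j))
        best[i] = max(candidates)
    # Walk the stored end indices to recover the chosen words.
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(sentence[i:j])
        i = j
    return words


frequent = {"清华": 500, "大学": 800, "清华大学": 2333}
rare = {"清华": 500, "大学": 800, "清华大学": 1}

print(segment("清华大学", frequent))  # the long word wins
print(segment("清华大学", rare))      # the two shorter words win
```

With the same sentence and the same words, only the frequency of 清华大学 changes, and the chosen segmentation changes with it.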

Custom dictionary

In addition to the default built-in dictionary, jieba lets users load custom dictionaries:

import jieba
jieba.load_userdict("userdict.txt")  # path to a UTF-8 text file

Custom dictionaries use the same line format as the built-in dictionary, except that the frequency and part-of-speech fields may be omitted. By loading a custom dictionary, you can override entries in the built-in dictionary or add new ones, so that segmentation better reflects the vocabulary of a specific application or domain.
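The override-or-add behaviour described above can be pictured as a dictionary merge. The sketch below is a conceptual model only and does not call jieba: read_dict_file is a hypothetical helper, the builtin table stands in for the real built-in dictionary, and the sample entries and frequencies are invented.

```python
import os
import tempfile


def read_dict_file(path):
    """Read 'word freq tag' lines into {word: (freq, tag)} (hypothetical helper)."""
    entries = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) != 3:
                continue  # this sketch only handles full three-field lines
            word, freq, tag = parts
            entries[word] = (int(freq), tag)
    return entries


# Stand-in for the built-in dictionary (made-up subset).
builtin = {"清华大学": (2333, "nt"), "自然语言处理": (1012, "n")}

# A custom dictionary that overrides one entry and adds a new one.
user_lines = "自然语言处理 5000 n\n特定词 1000 n\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "userdict.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(user_lines)
    user = read_dict_file(path)

# Later entries win, so custom entries replace built-in ones with the same word.
merged = {**builtin, **user}
print(merged)
```

With jieba itself, the same file would simply be passed to jieba.load_userdict.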

Modify and extend

jieba's dictionary can also be modified dynamically, for example:

jieba.add_word("特定词", freq=1000, tag="n")
jieba.del_word("不需要的词")

This way, you can add or remove vocabulary as needed while the program is running.
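As a rough mental model of these runtime changes, the sketch below keeps a small in-memory word table and mirrors the two calls. ToyDictionary is a hypothetical stand-in written for this example; it is not how jieba stores its dictionary, and its default frequency of 1 is an arbitrary choice.

```python
class ToyDictionary:
    """Hypothetical in-memory word table; not jieba's implementation."""

    def __init__(self):
        self.entries = {}  # word -> (freq, tag)

    def add_word(self, word, freq=1, tag=None):
        # Adding an existing word replaces its frequency and tag.
        self.entries[word] = (freq, tag)

    def del_word(self, word):
        # Removing a word that is not present is silently ignored.
        self.entries.pop(word, None)

    def __contains__(self, word):
        return word in self.entries


d = ToyDictionary()
d.add_word("特定词", freq=1000, tag="n")
d.add_word("不需要的词")
d.del_word("不需要的词")

print("特定词" in d)
print("不需要的词" in d)
```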

Overall, the jieba dictionary is a flexible and extensible component that can support a variety of Chinese text processing needs.

Origin blog.csdn.net/m0_57236802/article/details/133393061
