Introduction:
Label Studio is an open source data labeling tool for machine learning and deep learning projects. Its main purpose is to help data scientists and machine learning engineers label data quickly and efficiently so they can build and train accurate models. It supports several data types, including images, text, audio and video, and provides features such as label management, annotation, team collaboration, data visualization and automation. Label Studio is developed and maintained by HumanSignal (formerly Heartex), with full documentation and source code available on GitHub.
1. Environment setup
I use PyCharm. Create an environment with the following packages in Anaconda and activate it:
- Python 3.8+
- label-studio == 1.7.1
- paddleocr >= 2.6.0.1
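To confirm that the environment matches the list above, a small check like the following can help. It is only a sketch: the helper name `installed_versions` is made up for this example, and it assumes the packages were installed under their PyPI names `label-studio` and `paddleocr`.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Return {package name: installed version, or None if it is missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print the interpreter version and the version of each required package.
    print("Python", sys.version.split()[0])
    for name, version in installed_versions(["label-studio", "paddleocr"]).items():
        print(name, version or "not installed")
```

If a package shows up as "not installed", the active interpreter is probably not the one from the Anaconda environment.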
Once the environment is configured, run the following in the console:
label-studio start
Label Studio then opens in the browser automatically, usually at http://localhost:8080/ . Sign up as a new user.
2. Operation
After logging in, click Create Project.
Give the project a name, choose Natural Language Processing as the category, and then choose the Named Entity Recognition template.
In the box on the left, edit the labels (entity types) you want to annotate with.
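The labels edited in that box are stored as an XML labeling configuration, which can also be viewed and pasted in the template's code view. As a rough sketch, the configuration for a named entity recognition project can be generated like this. The function name `build_ner_config` and the labels PER, ORG and LOC are assumptions made for illustration; the `name`/`toName` values follow the usual template layout.

```python
import xml.etree.ElementTree as ET


def build_ner_config(labels):
    """Build a Label Studio style NER labeling config as an XML string."""
    view = ET.Element("View")
    # The <Labels> block lists the entity types and points at the text object.
    labels_el = ET.SubElement(view, "Labels", name="label", toName="text")
    for value in labels:
        ET.SubElement(labels_el, "Label", value=value)
    # The <Text> object displays the "text" field of each imported task.
    ET.SubElement(view, "Text", name="text", value="$text")
    return ET.tostring(view, encoding="unicode")


# Hypothetical label set, replace with your own entity types.
config = build_ner_config(["PER", "ORG", "LOC"])
print(config)
```

Generating the configuration from a list is mostly useful when there are many labels to maintain; for a handful, typing them into the box is just as easy.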
Click Import to upload a text file. I made a small one just for this example.
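Besides plain text, Label Studio also accepts a JSON file containing a list of tasks. A minimal sketch of preparing such a file is shown here, assuming the template reads a field called `text`. The helper name `build_tasks`, the file name `tasks.json` and the sample sentences are all made up for this example.

```python
import json
from pathlib import Path


def build_tasks(sentences):
    """Turn raw sentences into a list of task dicts, skipping blank lines."""
    return [{"data": {"text": s.strip()}} for s in sentences if s.strip()]


if __name__ == "__main__":
    # Illustrative sample sentences, not taken from a real dataset.
    sentences = [
        "Alice works at Acme in Paris.",
        "Bob moved to Berlin last year.",
    ]
    tasks = build_tasks(sentences)
    # Write UTF-8 JSON so non-ASCII text is preserved as-is.
    Path("tasks.json").write_text(
        json.dumps(tasks, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    print(f"Wrote {len(tasks)} tasks to tasks.json")
```

The resulting file can be uploaded through the same Import dialog as a text file.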
After importing, click at the top of the page to start the labeling tasks.
Label the words according to your own requirements.
Work through the tasks one by one.
Finally, when you have finished labeling, export the results in a format such as JSON or CSV.
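Once exported, the JSON file can be read back with a few lines of Python. The sketch below assumes the usual JSON export layout, where each task has a `data` field and a list of `annotations` whose `result` items carry `start`, `end` and `labels`. The function name `extract_entities` and the inline sample are illustrative, not real project output.

```python
import json


def extract_entities(tasks):
    """Return (entity text, label, start, end) tuples from exported tasks."""
    rows = []
    for task in tasks:
        text = task.get("data", {}).get("text", "")
        for annotation in task.get("annotations", []):
            for item in annotation.get("result", []):
                value = item.get("value", {})
                start, end = value.get("start"), value.get("end")
                if start is None or end is None:
                    continue  # skip results that are not text spans
                for label in value.get("labels", []):
                    rows.append((text[start:end], label, start, end))
    return rows


# Hand-written sample in the assumed export layout.
sample_export = json.loads("""
[{"data": {"text": "Alice works at Acme in Paris."},
  "annotations": [{"result": [
    {"type": "labels", "from_name": "label", "to_name": "text",
     "value": {"start": 0, "end": 5, "labels": ["PER"]}},
    {"type": "labels", "from_name": "label", "to_name": "text",
     "value": {"start": 15, "end": 19, "labels": ["ORG"]}},
    {"type": "labels", "from_name": "label", "to_name": "text",
     "value": {"start": 23, "end": 28, "labels": ["LOC"]}}
  ]}]}]
""")

for row in extract_entities(sample_export):
    print(row)
```

For a real export, replace the inline sample with the contents of the downloaded file, for example by loading it with `json.load`.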