Data annotation engineering notes

Environment setup
Elf Annotation Assistant

Elf Annotation Assistant: an annotation tool for artificial intelligence data sets (jinglingbiaozhu.com)

labelimg

pip install labelimg
Data collection and annotation: done manually

Export

Export formats for collected data: xml, json, MongoDB, pascal-voc

The collected data is labeled data oriented toward network (web) use.

xml

Created at a time when web markup and network transmission were still underdeveloped

However, its convenient structure still suits data processing.

An XML document can be used like a struct or a class

You can define your own custom tags to fit your design needs
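
To make the "struct or class" idea concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The `Size` dataclass and the two helper functions are hypothetical names chosen for illustration; the tag names simply mirror the fields of the class.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Size:
    """Hypothetical class whose fields map one-to-one onto custom tags."""
    width: int
    height: int
    depth: int


def size_to_xml(size: Size) -> ET.Element:
    """Serialize the dataclass into a <size> element, one child tag per field."""
    root = ET.Element("size")
    for field_name in ("width", "height", "depth"):
        ET.SubElement(root, field_name).text = str(getattr(size, field_name))
    return root


def size_from_xml(root: ET.Element) -> Size:
    """Read the custom tags back into the dataclass."""
    return Size(
        width=int(root.findtext("width")),
        height=int(root.findtext("height")),
        depth=int(root.findtext("depth")),
    )


original = Size(width=1000, height=1506, depth=3)
xml_text = ET.tostring(size_to_xml(original), encoding="unicode")
print(xml_text)

# Round trip: the XML carries the same information as the object.
restored = size_from_xml(ET.fromstring(xml_text))
print(restored == original)
```

The round trip shows why the tag structure can stand in for a class definition: each tag plays the role of a field.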

Markup language: one form of expression among web languages such as java, html, xml

Introduction to XML - XML (Extensible Markup Language) | MDN (mozilla.org)

XML format data collected by Elf Annotation Assistant
<!-- Rectangular box annotation -->
<!-- Labeling failed -->
<?xml version="1.0" ?>
<doc>
	<path>D:\yyqh\DataSet\set1\8805d9c7c825a211eacec94f37b871e9.jpeg</path>
	<outputs></outputs>
	<time_labeled>0</time_labeled>
	<labeled>false</labeled>
</doc>


<!-- Labeled successfully, but recognition failed -->
<?xml version="1.0" ?>
<doc>
	<path>D:\yyqh\DataSet\set1\data (1).jpeg</path>
	<outputs>
		<object></object>
	</outputs>
	<time_labeled>1695689497928</time_labeled>
	<labeled>true</labeled>
	<size>
		<width>1000</width>
		<height>1506</height>
		<depth>3</depth>
	</size>
</doc>


<!-- Success -->
<?xml version="1.0" ?>
<doc>
	<path>D:\yyqh\DataSet\set1\data (2).jpeg</path>
	<outputs>
		<object>
			<item>
				<name>猫</name>
				<bndbox>
					<xmin>10</xmin>
					<ymin>-1</ymin>
					<xmax>974</xmax>
					<ymax>1761</ymax>
				</bndbox>
			</item>
		</object>
	</outputs>
	<time_labeled>1695689802263</time_labeled>
	<labeled>true</labeled>
	<size>
		<width>1000</width>
		<height>1778</height>
		<depth>3</depth>
	</size>
</doc>
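
The rectangle records above can be read back with the standard library. The following is only a sketch: `read_boxes` is a hypothetical helper, and clamping coordinates to the image size is my own assumption (the sample contains `ymin` of -1), not something the tool is known to require.

```python
import xml.etree.ElementTree as ET

# The successful sample from above, embedded so the sketch is self-contained.
SAMPLE = r"""<?xml version="1.0" ?>
<doc>
    <path>D:\yyqh\DataSet\set1\data (2).jpeg</path>
    <outputs>
        <object>
            <item>
                <name>猫</name>
                <bndbox>
                    <xmin>10</xmin>
                    <ymin>-1</ymin>
                    <xmax>974</xmax>
                    <ymax>1761</ymax>
                </bndbox>
            </item>
        </object>
    </outputs>
    <time_labeled>1695689802263</time_labeled>
    <labeled>true</labeled>
    <size>
        <width>1000</width>
        <height>1778</height>
        <depth>3</depth>
    </size>
</doc>"""


def read_boxes(xml_text):
    """Return the labeled rectangles of one record, clamped to the image."""
    root = ET.fromstring(xml_text)
    # Records that were never labeled carry <labeled>false</labeled> and no <size>.
    if root.findtext("labeled") != "true":
        return []
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    boxes = []
    # An empty <object></object> simply yields no items.
    for item in root.iterfind("outputs/object/item"):
        box = item.find("bndbox")
        if box is None:  # e.g. a cubic_bezier item rather than a rectangle
            continue
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append({
            "name": item.findtext("name"),
            # Assumption: keep coordinates inside the image bounds.
            "xmin": max(0, xmin),
            "ymin": max(0, ymin),
            "xmax": min(width, xmax),
            "ymax": min(height, ymax),
        })
    return boxes


boxes = read_boxes(SAMPLE)
print(boxes)
```

The same function returns an empty list for the two failure cases, since one is not labeled and the other has no `<item>` entries.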
<!-- Curved box / anchor point annotation -->
<?xml version="1.0" ?>
<doc>
	<path>D:\yyqh\DataSet\set1\data (4).jpeg</path>
	<outputs>
		<object>
			<item>
				<name>柠檬</name>
				<cubic_bezier>
					<x57>25</x57>
					<y57>505</y57>
					<x57_c1>25</x57_c1>
					<y57_c1>505</y57_c1>
					<x57_c2>25</x57_c2>
					<y57_c2>505</y57_c2>

					<!-- The coordinates in this part are fairly complex, so they are omitted -->

					<x1>25</x1>
					<y1>505</y1>
					<x1_c1>25</x1_c1>
					<y1_c1>505</y1_c1>
					<x1_c2>25</x1_c2>
					<y1_c2>505</y1_c2>
				</cubic_bezier>
			</item>
		</object>
	</outputs>
	<time_labeled>1695690362440</time_labeled>
	<labeled>true</labeled>
	<size>
		<width>1080</width>
		<height>757</height>
		<depth>3</depth>
	</size>
</doc>
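
The curved-box record uses numbered tags (`x57`, `y57`, `x57_c1`, ...). A rough sketch of grouping them by point index follows. I am assuming that `_c1` and `_c2` are the two control points of each cubic Bezier anchor; the source does not say so. `read_bezier_points` is a hypothetical helper, and the sample only includes the two points shown above, since the rest were omitted.

```python
import re
import xml.etree.ElementTree as ET

# Only the two points visible in the listing above; the omitted ones stay omitted.
SAMPLE = """<cubic_bezier>
    <x57>25</x57>
    <y57>505</y57>
    <x57_c1>25</x57_c1>
    <y57_c1>505</y57_c1>
    <x57_c2>25</x57_c2>
    <y57_c2>505</y57_c2>
    <x1>25</x1>
    <y1>505</y1>
    <x1_c1>25</x1_c1>
    <y1_c1>505</y1_c1>
    <x1_c2>25</x1_c2>
    <y1_c2>505</y1_c2>
</cubic_bezier>"""

# Matches x1, y1, x1_c1, y57_c2, ...: axis, point index, optional c1/c2 suffix.
TAG = re.compile(r"^([xy])(\d+)(?:_(c[12]))?$")


def read_bezier_points(bezier):
    """Group the flat tags into one entry per point index, sorted by index."""
    points = {}
    for child in bezier:
        match = TAG.match(child.tag)
        if match is None:
            continue  # ignore tags that do not follow the naming pattern
        axis, index, control = match.groups()
        # Assumption: a tag without a suffix is the anchor point itself.
        key = "anchor" if control is None else control
        slot = points.setdefault(int(index), {})
        slot.setdefault(key, {})[axis] = int(child.text)
    return [(index, points[index]) for index in sorted(points)]


points = read_bezier_points(ET.fromstring(SAMPLE))
for index, point in points:
    print(index, point)
```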

# Is it not possible to select the entire image directly?

pascal-voc

Object detection data set

csdn: Introduction to PASCAL VOC data set
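
For comparison with the tool's own XML, here is a hedged sketch that rewrites one Elf Annotation Assistant record into a PASCAL VOC style `<annotation>` element. It emits only a subset of the VOC fields (`filename`, `size`, and `object` with `name` and `bndbox`); `elf_to_voc` is a hypothetical helper, and coordinates are copied unchanged.

```python
import xml.etree.ElementTree as ET
from pathlib import PureWindowsPath

# The successful rectangle record from the XML section, embedded for self-containment.
ELF_XML = r"""<?xml version="1.0" ?>
<doc>
    <path>D:\yyqh\DataSet\set1\data (2).jpeg</path>
    <outputs>
        <object>
            <item>
                <name>猫</name>
                <bndbox>
                    <xmin>10</xmin>
                    <ymin>-1</ymin>
                    <xmax>974</xmax>
                    <ymax>1761</ymax>
                </bndbox>
            </item>
        </object>
    </outputs>
    <labeled>true</labeled>
    <size>
        <width>1000</width>
        <height>1778</height>
        <depth>3</depth>
    </size>
</doc>"""


def elf_to_voc(xml_text):
    """Build a minimal VOC-style <annotation> from one Elf record."""
    doc = ET.fromstring(xml_text)
    voc = ET.Element("annotation")
    # The source path is a Windows path, so parse it as one on any OS.
    ET.SubElement(voc, "filename").text = PureWindowsPath(doc.findtext("path")).name
    size = ET.SubElement(voc, "size")
    for tag in ("width", "height", "depth"):
        ET.SubElement(size, tag).text = doc.findtext(f"size/{tag}")
    # VOC uses one <object> per labeled box, directly under <annotation>.
    for item in doc.iterfind("outputs/object/item"):
        src_box = item.find("bndbox")
        if src_box is None:
            continue  # VOC bounding boxes cannot express curved outlines
        obj = ET.SubElement(voc, "object")
        ET.SubElement(obj, "name").text = item.findtext("name")
        box = ET.SubElement(obj, "bndbox")
        for tag in ("xmin", "ymin", "xmax", "ymax"):
            ET.SubElement(box, tag).text = src_box.findtext(tag)
    return voc


voc = elf_to_voc(ELF_XML)
print(ET.tostring(voc, encoding="unicode"))
```

The main structural difference is that VOC keeps each `<object>` flat under the root, while the tool nests `<item>` entries inside `<outputs>/<object>`.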

MongoDB

A database based on distributed file storage, written in C++.

csdn: Detailed explanation of MongoDB, just read this article carefully [Key Points]
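
As a rough idea of what an annotation might look like as a MongoDB document, the sketch below builds a plain dict and inserts it through `insert_one`, the method that `pymongo` collections expose. The field layout, the `InMemoryCollection` stand-in, and the database and collection names in the comment are all assumptions for illustration; the stand-in exists only so the example runs without a server.

```python
from types import SimpleNamespace


def annotation_document(path, width, height, boxes):
    """Build a plain dict; MongoDB stores JSON-like documents of this shape."""
    return {
        "path": path,
        "size": {"width": width, "height": height},
        "objects": list(boxes),
    }


def store_annotation(collection, document):
    """Insert one document and return its id.

    `collection` only needs an insert_one(document) method whose result has
    an inserted_id attribute, as pymongo.collection.Collection provides.
    """
    return collection.insert_one(document).inserted_id


class InMemoryCollection:
    """Hypothetical stand-in so the sketch runs without a MongoDB server."""

    def __init__(self):
        self.documents = []

    def insert_one(self, document):
        self.documents.append(document)
        return SimpleNamespace(inserted_id=len(self.documents))


collection = InMemoryCollection()
# With a real server and pymongo installed, this could instead be (names assumed):
#   collection = pymongo.MongoClient("mongodb://localhost:27017")["annotations"]["images"]
doc = annotation_document(
    r"D:\yyqh\DataSet\set1\data (2).jpeg", 1000, 1778,
    [{"name": "猫", "xmin": 10, "ymin": -1, "xmax": 974, "ymax": 1761}],
)
inserted_id = store_annotation(collection, doc)
print(inserted_id, len(collection.documents))
```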

Database interaction, network sharing

vb # Being eliminated? But practical in some respects

python # Not popular anymore

Virtual simulation # cannot be studied

data structure, stack

Data annotation crowdsourcing platform: Shujiajia

Shujiajia, a crowdsourcing platform under Datatang: large-scale data collection and annotation tasks (shujiajia.com)

github

GitHub: Let’s build from here · GitHub


Origin blog.csdn.net/qq_51943845/article/details/133293413