Collecting Statistics on labelme Annotations: Method and Usage

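[Original Content Notice]
This is an original article by the author. Do not repost it without the author's permission.
For more algorithm write-ups, follow my blog: https://blog.csdn.net/suiyingy.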

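In recent years, with the rapid development of computer vision, image annotation has become an indispensable part of many projects. labelme is a widely used annotation tool that is simple to use yet capable. This article uses a sample program to show how to compute statistics over labelme annotation results, either for checking annotations or for feeding downstream tasks.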

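1. About labelme

labelme is an open-source image annotation tool written in Python that runs on Windows, Linux, and macOS. It provides a graphical interface for annotating images with several kinds of shapes, such as rectangles, polygons, and lines. Annotations are saved in JSON format, which makes them easy to process and analyze afterwards.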
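
For orientation, a labelme annotation file typically looks like the dictionary below once it has been loaded with the `json` module. This is only an illustrative sketch: the version string, file name, labels, and coordinates are made up, and the exact set of top-level keys may differ between labelme versions.

```python
import json

# A minimal, hand-written example of a labelme-style annotation (all values are made up).
sample_annotation = {
    "version": "5.0.1",
    "flags": {},
    "shapes": [
        {
            "label": "cat",
            "points": [[10.0, 20.0], [110.0, 220.0]],
            "group_id": None,
            "shape_type": "rectangle",
            "flags": {},
        },
        {
            "label": "dog",
            "points": [[5.0, 5.0], [50.0, 5.0], [50.0, 60.0]],
            "group_id": None,
            "shape_type": "polygon",
            "flags": {},
        },
    ],
    "imagePath": "example.jpg",
    "imageData": None,
    "imageHeight": 480,
    "imageWidth": 640,
}

# Round-trip through JSON text, as if the file had been written to and read from disk.
loaded = json.loads(json.dumps(sample_annotation))

# Each entry in "shapes" carries the class label, the geometry type, and the points.
shape_summary = [(s["label"], s["shape_type"], len(s["points"])) for s in loaded["shapes"]]
print(shape_summary)
```

The statistics in this article only rely on the `label` and `shape_type` fields of each entry in `shapes`.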

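2. Statistics on labelme annotations

After annotating with labelme, we can summarize the annotation files to see how the annotations are distributed: how many annotations each class has and which shape types were used for it. This helps both with assessing dataset quality and with training downstream models.

The sample program defines a Python function, `get_ann_info()`, that computes these statistics. It first collects all JSON annotation files in the annotation folder, then reads and parses them one by one. By iterating over the shapes in each file, it obtains the label and the shape type of every shape. The results are accumulated in a dictionary, and `dict2table()` is called to print them as a table.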
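
The counting step can be sketched in isolation on an in-memory list of shapes, which also shows the structure of the resulting dictionary. The helper name `count_shapes` and the sample labels are hypothetical and are not part of the sample program.

```python
from collections import defaultdict


def count_shapes(shapes):
    """Aggregate labelme-style shape dicts into {label: {'count', 'shape_type'}}."""
    stats = defaultdict(lambda: {"count": 0, "shape_type": []})
    for shape in shapes:
        entry = stats[shape["label"]]
        entry["count"] += 1
        # Record each shape type only once per label.
        if shape["shape_type"] not in entry["shape_type"]:
            entry["shape_type"].append(shape["shape_type"])
    return dict(stats)


# Made-up shapes, reduced to the two fields the statistics need.
shapes = [
    {"label": "cat", "shape_type": "rectangle"},
    {"label": "cat", "shape_type": "polygon"},
    {"label": "dog", "shape_type": "rectangle"},
    {"label": "cat", "shape_type": "rectangle"},
]
stats = count_shapes(shapes)
print(stats)
# {'cat': {'count': 3, 'shape_type': ['rectangle', 'polygon']}, 'dog': {'count': 1, 'shape_type': ['rectangle']}}
```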

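3. Usage

Before using the sample code, prepare the environment as follows:

(1) Install Python 3.x.
(2) Install the dependencies. labelme itself is installed with `pip install labelme` and is needed for annotating. The sample script only imports one package from outside the standard library, `tqdm`, which is installed with `pip install tqdm`.
(3) Prepare the annotation folder: put the annotation JSON files to be analyzed into a single folder.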

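Then use the sample code as follows:

(1) Open the sample program and set `root_dir` in the `if __name__ == '__main__':` block at the end of the file to the path of your annotation folder.
(2) Run the sample program.
(3) When the program finishes, it prints a table with the annotation count and the shape types for each class.

With these steps, you can use labelme together with the sample code to collect annotation statistics.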
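
To try the workflow without real data, one option is to write a couple of tiny annotation files into a temporary folder and count the labels there. This is a sketch using only the standard library; the helpers `write_sample` and `count_labels` and the file names are hypothetical, and the generated files contain only the `shapes` key rather than a complete labelme file.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def write_sample(folder, name, labels):
    """Write a tiny labelme-style JSON file containing one rectangle per label."""
    shapes = [
        {"label": label, "points": [[0, 0], [10, 10]], "shape_type": "rectangle"}
        for label in labels
    ]
    path = Path(folder) / name
    path.write_text(json.dumps({"shapes": shapes}), encoding="utf-8")
    return path


def count_labels(folder):
    """Count shapes per label across every *.json file in the folder."""
    counts = Counter()
    for json_path in sorted(Path(folder).glob("*.json")):
        with open(json_path, "r", encoding="utf-8") as f:
            data = json.load(f)
        counts.update(shape["label"] for shape in data.get("shapes", []))
    return counts


# The temporary folder stands in for the annotation folder from step (3).
with tempfile.TemporaryDirectory() as tmp_dir:
    write_sample(tmp_dir, "img_001.json", ["cat", "dog"])
    write_sample(tmp_dir, "img_002.json", ["cat"])
    label_counts = count_labels(tmp_dir)

print(dict(label_counts))
```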

4. Sample program

# -*- coding: utf-8 -*-
'''
labelme annotation statistics: the number of annotations per class and the
shape types used for each class.
For more algorithm write-ups, see the blog: https://blog.csdn.net/suiyingy.
'''

import json
from pathlib import Path

from tqdm import tqdm


# Format one table cell: a leading '|' followed by the text centered in cell_len characters.
# Text longer than cell_len is not truncated, so that cell will be wider than the others.
def format_cell(text, cell_len):
    text = str(text)
    left = max(0, (cell_len - len(text)) // 2)
    right = max(0, cell_len - (cell_len - len(text)) // 2 - len(text))
    return '|' + left * ' ' + text + right * ' '


# Render a nested dictionary as a text table.
# indict: input dictionary, {column name: {row name: value}}
# cell_len: width of each cell in characters
# catename: text for the top-left header cell
def dict2table(indict, cell_len=15, catename='info/class'):
    # Nothing to tabulate (for example, no annotation files were found)
    if not indict:
        return ''
    # Horizontal border between rows
    split_line = '-' * cell_len * (len(indict) + 1) + '-' * (len(indict) + 2) + '\n'
    row_names = list(list(indict.values())[0].keys())
    # First cell of the header row and of each data row
    table_info = {catename: format_cell(catename, cell_len)}
    for row in row_names:
        table_info[row] = format_cell(row, cell_len)
    # Append one column per class
    for k in indict:
        table_info[catename] += format_cell(k, cell_len)
        for row in row_names:
            table_info[row] += format_cell(indict[k][row], cell_len)
    # Close each row with '|' and a border line
    for name in table_info:
        table_info[name] += '|\n' + split_line
    return split_line + ''.join(table_info.values())


# Count labelme annotations per label and collect the shape types used for each label.
# root_dir: path to the folder containing the labelme JSON files
def get_ann_info(root_dir):
    label_files = sorted(Path(root_dir).glob('*.json'))
    ann_infos = {}
    for label_path in tqdm(label_files):
        # Use a context manager so every file is closed after it is read
        with open(label_path, 'r', encoding='utf-8') as f:
            anns = json.load(f)
        for shape in anns['shapes']:
            slabel = shape['label']
            stype = shape['shape_type']
            if slabel not in ann_infos:
                ann_infos[slabel] = {'count': 1, 'shape_type': [stype]}
            else:
                ann_infos[slabel]['count'] += 1
                if stype not in ann_infos[slabel]['shape_type']:
                    ann_infos[slabel]['shape_type'].append(stype)

    print(ann_infos)
    outputs = dict2table(ann_infos)
    print(outputs)
    return ann_infos


if __name__ == '__main__':
    # Placeholder path: replace it with your own annotation folder
    root_dir = r'aaa/bbb/'
    anns = get_ann_info(root_dir)
    print(list(anns.keys()))
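
For annotation checking, the dictionary returned by `get_ann_info()` can be scanned for labels that look suspicious. The following is a minimal sketch, not part of the sample program: the helper `find_suspicious_labels`, the `min_count` threshold, and the sample statistics are all assumptions chosen for illustration.

```python
def find_suspicious_labels(ann_infos, min_count=5):
    """Return labels that are rare or were annotated with more than one shape type."""
    report = {}
    for label, info in ann_infos.items():
        reasons = []
        # Very rare labels are often typos of another label.
        if info["count"] < min_count:
            reasons.append("rare")
        # Mixed shape types may indicate inconsistent annotation rules.
        if len(info["shape_type"]) > 1:
            reasons.append("mixed shape types")
        if reasons:
            report[label] = reasons
    return report


# Made-up statistics in the same structure that get_ann_info() returns.
ann_infos = {
    "cat": {"count": 120, "shape_type": ["rectangle"]},
    "cta": {"count": 1, "shape_type": ["rectangle"]},
    "dog": {"count": 80, "shape_type": ["rectangle", "polygon"]},
}
report = find_suspicious_labels(ann_infos)
print(report)
# {'cta': ['rare'], 'dog': ['mixed shape types']}
```

Whether a flagged label is really an error depends on the dataset; the check only narrows down what to review by hand.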

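5. Summary

This article introduced the labelme annotation tool and used a sample program to show how to compute statistics over labelme annotations. Such statistics matter for both dataset management and model training: they give a clearer picture of how the data is distributed and provide a reference for later work.

labelme can also be extended and customized as needed, for example by adding new annotation tools or exporting annotations in other formats. After reading this article, you should have a basic picture of labelme and know how to compute annotation statistics. I hope it helps with the problems you run into in image annotation tasks!

For more algorithm write-ups, follow my blog: https://blog.csdn.net/suiyingy. If you have any questions or suggestions, feel free to leave a comment below. Thanks for reading!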