Object detection with Deformable DETR: fixing the size mismatch error

Problem description:

strict=False is set, but loading still fails with: size mismatch for []: copying a param with shape [] from checkpoint, the shape in cur []

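This post continues my earlier one, "Deformable DETR环境配置和应用" (Deformable DETR environment setup and usage) on Alaso_soso's CSDN blog.

Many people have run into the same problem I did. I found a fix and am recording it here; it may also help with size mismatch errors that show up when training your own model.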

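Background:

At first I only ran predictions with the complete Deformable DETR model released officially, so no size mismatch appeared. Later I got a long list of errors like the one below. Some people online fixed this by popping the offending keys from the checkpoint, but that did not work for me. The pretrained model used this time is r50_deformable_detr_single_scale-checkpoint.pth: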

size mismatch for transformer.level_embed: copying a param with shape torch.Size([1, 256]) from checkpoint torch.Size([64, 256])
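Before changing anything, it can help to list every parameter whose shape differs between the checkpoint and the freshly built model. The following is a minimal sketch, not code from the original detect.py: find_shape_mismatches is a hypothetical helper, and the stand-in objects and their shapes are illustrative only. In practice you would pass torch.load(model_path)["model"] and model.state_dict(), whose tensors also expose .shape.

```python
from types import SimpleNamespace


def find_shape_mismatches(checkpoint_state, model_state):
    """Return {name: (checkpoint_shape, model_shape)} for shared keys whose shapes differ."""
    mismatches = {}
    for name, ckpt_value in checkpoint_state.items():
        if name not in model_state:
            continue  # missing/unexpected keys are a separate issue
        ckpt_shape = tuple(ckpt_value.shape)
        model_shape = tuple(model_state[name].shape)
        if ckpt_shape != model_shape:
            mismatches[name] = (ckpt_shape, model_shape)
    return mismatches


# Stand-ins for tensors (anything with a .shape works); values are illustrative.
ckpt = {
    "transformer.level_embed": SimpleNamespace(shape=(1, 256)),
    "query_embed.weight": SimpleNamespace(shape=(300, 512)),
}
model = {
    "transformer.level_embed": SimpleNamespace(shape=(4, 256)),
    "query_embed.weight": SimpleNamespace(shape=(300, 512)),
}
mismatches = find_shape_mismatches(ckpt, model)
for name, (ckpt_shape, model_shape) in mismatches.items():
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

A mismatch that appears on a structural parameter such as transformer.level_embed usually suggests the model was built with different arguments than the checkpoint was trained with, rather than a different number of classes.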

Solution:

Start with detect.py (see the detect.py reference link below). I changed model_path and expected it to run directly, but it failed:

model_path = './r50_deformable_detr_single_scale-checkpoint.pth'

Reference for detect.py: "Deformable-DETR部署和体验" on 简书 (jianshu.com)

Next, go to the load_model function:

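def load_model(model_path, args):
    # device was undefined in the original snippet; take it from args as build() does
    device = torch.device(args.device)
    model, _, _ = build_model(args)
    # model_path is the checkpoint path; change it to point at your own file
    state_dict = torch.load(model_path, map_location="cpu")
    # strict=False ignores missing/unexpected keys, but shape mismatches still raise
    msg = model.load_state_dict(state_dict["model"], strict=False)
    model.to(device)
    model.eval()
    print("load model success")
    return model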
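The "pop" workaround mentioned in the background can be sketched as follows. This is an illustration under assumptions, not the author's fix: drop_mismatched is a hypothetical helper, and the stand-in objects replace real tensors. With PyTorch you would filter torch.load(model_path)["model"] against model.state_dict() and pass the result to load_state_dict(..., strict=False).

```python
from types import SimpleNamespace


def drop_mismatched(checkpoint_state, model_state):
    """Return a copy of checkpoint_state without entries whose shape differs from the model's."""
    kept = {}
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(value.shape) != tuple(target.shape):
            continue  # this entry would raise "size mismatch" even with strict=False
        kept[name] = value
    return kept


# Stand-ins for tensors; the shapes are illustrative only.
ckpt = {
    "class_embed.0.weight": SimpleNamespace(shape=(91, 256)),
    "input_proj.0.0.weight": SimpleNamespace(shape=(256, 2048, 1, 1)),
}
model = {
    "class_embed.0.weight": SimpleNamespace(shape=(20, 256)),
    "input_proj.0.0.weight": SimpleNamespace(shape=(256, 2048, 1, 1)),
}
filtered = drop_mismatched(ckpt, model)
print(sorted(filtered))
```

Dropped parameters keep their random initialization, so this approach only suits layers that will be retrained anyway, such as a classification head for a new set of classes. It is not appropriate when the mismatch comes from building the model with the wrong architecture arguments, which is the case described in this post.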

Next, go to model, _, _ = build_model(args), the line that builds the model:

def build_model(args):
    return build(args)

Then go to build(args):

def build(args):
    # Number of classes
    # num_class = 20
    # num_classes = 20 if args.dataset_file != 'coco' else (num_class + 1)
    num_classes = 20 if args.dataset_file != 'coco' else 91
    if args.dataset_file == "coco_panoptic":
        num_classes = 250
    device = torch.device(args.device)

    backbone = build_backbone(args)

    transformer = build_deforamble_transformer(args)
    model = DeformableDETR(
        backbone,
        transformer,
        num_classes=num_classes,
        num_queries=args.num_queries,
        num_feature_levels=args.num_feature_levels,
        aux_loss=args.aux_loss,
        with_box_refine=args.with_box_refine,
        two_stage=args.two_stage,
    )
    if args.masks:
        model = DETRsegm(model, freeze_detr=(args.frozen_weights is not None))
    matcher = build_matcher(args)
    weight_dict = {'loss_ce': args.cls_loss_coef, 'loss_bbox': args.bbox_loss_coef}
    weight_dict['loss_giou'] = args.giou_loss_coef
    if args.masks:
        weight_dict["loss_mask"] = args.mask_loss_coef
        weight_dict["loss_dice"] = args.dice_loss_coef
    # TODO this is a hack
    if args.aux_loss:
        aux_weight_dict = {}
        for i in range(args.dec_layers - 1):
            aux_weight_dict.update({k + f'_{i}': v for k, v in weight_dict.items()})
        aux_weight_dict.update({k + f'_enc': v for k, v in weight_dict.items()})
        weight_dict.update(aux_weight_dict)

    losses = ['labels', 'boxes', 'cardinality']
    if args.masks:
        losses += ["masks"]
    # num_classes, matcher, weight_dict, losses, focal_alpha=0.25
    criterion = SetCriterion(num_classes, matcher, weight_dict, losses, focal_alpha=args.focal_alpha)
    criterion.to(device)
    postprocessors = {'bbox': PostProcess()}
    if args.masks:
        postprocessors['segm'] = PostProcessSegm()
        if args.dataset_file == "coco_panoptic":
            is_thing_map = {i: i <= 90 for i in range(201)}
            postprocessors["panoptic"] = PostProcessPanoptic(is_thing_map, threshold=0.85)

    return model, criterion, postprocessors

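Set num_classes to match your own classes. args holds the configuration parameters, which leads to the corresponding .sh file in the configs folder.

Comparing the two config files, the difference is --num_feature_levels 1, and that is where the problem lies. transformer.level_embed has one row per feature level, so a checkpoint shape of [1, 256] means the model was trained with a single feature level. This argument must therefore be added when running detect.py: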

python detect.py --num_feature_levels 1
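To see why this flag changes the parameter shape, here is a small sketch using argparse. It assumes, as in the upstream Deformable-DETR main.py, that --num_feature_levels defaults to 4 and --hidden_dim to 256, and that level_embed has shape (num_feature_levels, hidden_dim); check these against your own copy of the code. build_parser and level_embed_shape are hypothetical names used only for illustration.

```python
import argparse


def build_parser():
    """Minimal parser with only the two arguments relevant here (defaults are assumptions)."""
    parser = argparse.ArgumentParser("detect sketch")
    parser.add_argument("--num_feature_levels", default=4, type=int)
    parser.add_argument("--hidden_dim", default=256, type=int)
    return parser


def level_embed_shape(args):
    """Expected shape of transformer.level_embed for the given arguments."""
    return (args.num_feature_levels, args.hidden_dim)


# Without the flag, the model is built with the default number of feature levels.
default_shape = level_embed_shape(build_parser().parse_args([]))
# With the flag, it matches the single-scale checkpoint.
single_scale_shape = level_embed_shape(
    build_parser().parse_args(["--num_feature_levels", "1"])
)
print(default_shape, single_scale_shape)
```

Under these assumptions, only the second shape agrees with the [1, 256] entry stored in r50_deformable_detr_single_scale-checkpoint.pth, which is why passing the flag resolves the error.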

That's it: the script now runs normally and can predict on images and videos.

Thanks to a certain someone for the help.

Reposted from blog.csdn.net/qq_44808827/article/details/126794548