YOLOv8: fixing "No labels found in /path/train.cache"

When training YOLOv8 with ultralytics/datasets/coco.yaml, the error "No labels found in train2017.cache" occurred.

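from ultralytics import YOLO

# Assumption: the post does not say which model was used; "yolov8n.pt" is only a placeholder.
model = YOLO("yolov8n.pt")
model.train(
    data="ultralytics/datasets/coco.yaml",
    epochs=100,
    imgsz=640,
    batch=16,
    save_period=10,  # save a checkpoint every 10 epochs
)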

The steps below track down the cause of this problem.

The COCO dataset was not downloaded in advance here. It is downloaded automatically during training, and the download is relatively slow.
A coco folder then appears under datasets (if it does not, check the path settings in ~/.config/Ultralytics/settings.yaml).
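
To see where the datasets are expected, the settings file can be read directly. The following is a minimal sketch, not part of Ultralytics: it assumes the file is a flat list of `key: value` lines and that the relevant key is named `datasets_dir`. The helper name `read_datasets_dir` is made up for illustration.

```python
from pathlib import Path
from typing import Optional


def read_datasets_dir(settings_file: Path) -> Optional[str]:
    """Return the value of the 'datasets_dir' key from a flat YAML file, or None."""
    if not settings_file.is_file():
        return None
    for line in settings_file.read_text(encoding="utf-8").splitlines():
        # Split on the first colon only, so values containing ':' stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "datasets_dir":  # assumed key name
            return value.strip().strip("'\"")
    return None


if __name__ == "__main__":
    settings = Path.home() / ".config" / "Ultralytics" / "settings.yaml"
    print(read_datasets_dir(settings))
```

If this prints None, either the file is not at that location or the key has a different name in the installed version.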

train.cache is in the labels folder. It is not downloaded; it is generated by looking up, for each train and val picture in images, the corresponding file in labels.
So we need to see what went wrong during this generation process.

First, check whether the paths to images and labels are correct.
Print label_files in ultralytics/yolo/data/dataset.py to see whether the files exist.
(According to many articles online, this step alone often solves the problem.)

    def get_labels(self):
        """Returns dictionary of labels for YOLO training."""
        self.label_files = img2label_paths(self.im_files)
        #print(self.label_files)
        cache_path = Path(self.label_files[0]).parent.with_suffix('.cache')
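
The same check can be done outside the trainer. The sketch below assumes the usual YOLO layout, where the label for `.../images/<split>/name.jpg` lives at `.../labels/<split>/name.txt`. `image_to_label_path` is an approximation of that mapping, not the library's own `img2label_paths`, and `find_missing_labels` is a hypothetical helper.

```python
import os
from pathlib import Path
from typing import List, Tuple

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: extend if other formats are used


def image_to_label_path(image_path: str) -> str:
    """Swap the last 'images' directory for 'labels' and the extension for .txt."""
    sa, sb = f"{os.sep}images{os.sep}", f"{os.sep}labels{os.sep}"
    return sb.join(image_path.rsplit(sa, 1)).rsplit(".", 1)[0] + ".txt"


def find_missing_labels(image_dir: str) -> Tuple[int, List[str]]:
    """Return (number of images found, expected label paths that do not exist)."""
    images = sorted(
        str(p) for p in Path(image_dir).rglob("*") if p.suffix.lower() in IMAGE_SUFFIXES
    )
    missing = [lb for lb in map(image_to_label_path, images) if not os.path.isfile(lb)]
    return len(images), missing
```

For example, `find_missing_labels("ultralytics/datasets/coco/images/train2017")` returns an image count of 0 when the image folder is empty, which is a different failure from labels being absent.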

At this step I confirmed that these files exist in the folder, so the search for the cause has to continue.
The error is raised here:

        # Display cache
        nf, nm, ne, nc, n = cache.pop('results')  # found, missing, empty, corrupt, total
        if exists and LOCAL_RANK in (-1, 0):
            d = f'Scanning {cache_path}... {nf} images, {nm + ne} backgrounds, {nc} corrupt'
            tqdm(None, desc=self.prefix + d, total=n, initial=n, bar_format=TQDM_BAR_FORMAT)  # display cache results
            if cache['msgs']:
                LOGGER.info('\n'.join(cache['msgs']))  # display warnings
        if nf == 0:  # number of labels found
            raise FileNotFoundError(f'{self.prefix}No labels found in {cache_path}, can not start training. {HELP_URL}')

Evidently nf is 0 here. Why? Most likely something went wrong while the cache was being generated, so the next step is to find the code that generates it.
It is also in dataset.py.
This is where nf is incremented (nf += nf_f). Something must have gone wrong so that the increment never happened.

    def cache_labels(self, path=Path('./labels.cache')):
        """Cache dataset labels, check images and read shapes.
        Args:
            path (Path): path where to save the cache file (default: Path('./labels.cache')).
        Returns:
            (dict): labels.
        """
        x = {'labels': []}
        nm, nf, ne, nc, msgs = 0, 0, 0, 0, []  # number missing, found, empty, corrupt, messages
        desc = f'{self.prefix}Scanning {path.parent / path.stem}...'
        total = len(self.im_files)
        nkpt, ndim = self.data.get('kpt_shape', (0, 0))
        if self.use_keypoints and (nkpt <= 0 or ndim not in (2, 3)):
            raise ValueError("'kpt_shape' in data.yaml missing or incorrect. Should be a list with [number of "
                             "keypoints, number of dims (2 for x,y or 3 for x,y,visible)], i.e. 'kpt_shape: [17, 3]'")
        with ThreadPool(NUM_THREADS) as pool:
            results = pool.imap(func=verify_image_label,
                                iterable=zip(self.im_files, self.label_files, repeat(self.prefix),
                                             repeat(self.use_keypoints), repeat(len(self.data['names'])), repeat(nkpt),
                                             repeat(ndim)))
            pbar = tqdm(results, desc=desc, total=total, bar_format=TQDM_BAR_FORMAT)
            for im_file, lb, shape, segments, keypoint, nm_f, nf_f, ne_f, nc_f, msg in pbar:
                nm += nm_f
                nf += nf_f
                ne += ne_f
                nc += nc_f
                if im_file:
                    x['labels'].append(
                        dict(
                            im_file=im_file,
                            shape=shape,
                            cls=lb[:, 0:1],  # n, 1
                            bboxes=lb[:, 1:],  # n, 4
                            segments=segments,
                            keypoints=keypoint,
                            normalized=True,
                            bbox_format='xywh'))
                if msg:
                    msgs.append(msg)
                pbar.desc = f'{desc} {nf} images, {nm + ne} backgrounds, {nc} corrupt'
            pbar.close()

At this point I found that the pictures in the train folder could not be found.
I checked the ultralytics/datasets/coco/images/train2017 folder and found that it was empty.
On a second look, train2017.zip is only 2.2 kB! The download had failed.
The root cause turned out to be that nothing verified whether the downloaded file was damaged.
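
A downloaded archive can be sanity-checked before training. This is a minimal sketch using only Python's standard `zipfile` module; the `check_archive` helper and its `min_bytes` threshold are assumptions chosen for illustration, not part of Ultralytics.

```python
import zipfile
from pathlib import Path


def check_archive(zip_path: str, min_bytes: int = 1_000_000) -> str:
    """Return 'ok', or a short description of what is wrong with the archive."""
    path = Path(zip_path)
    if not path.is_file():
        return "missing"
    # A failed download often leaves a tiny file (e.g. an error page of a few kB).
    if path.stat().st_size < min_bytes:
        return "too small"
    if not zipfile.is_zipfile(str(path)):
        return "not a zip file"
    with zipfile.ZipFile(str(path)) as zf:
        # testzip() reads every member and checks its CRC;
        # it returns the first bad member name, or None if all are fine.
        bad = zf.testzip()
    return "ok" if bad is None else f"corrupt member: {bad}"
```

The CRC pass reads the whole archive, so it can be slow on something the size of train2017.zip. The size check alone would already have caught the 2.2 kB file described above. If the check fails, delete the archive and the generated .cache file and download again.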

Origin blog.csdn.net/level_code/article/details/132538501