[Source code] Traffic light recognition system: improving YOLO and OpenCV

1. Research background and significance

With the continuous development of urban traffic and the increasing number of vehicles, traffic light recognition systems have become increasingly important. The accurate identification of traffic lights is of great significance to fields such as traffic management, intelligent transportation systems, and autonomous driving. However, accurate recognition of traffic lights has been a challenging problem due to changes in the shape, color, and lighting conditions of traffic lights, as well as complex traffic scenes.

Currently, the target detection algorithm YOLO (You Only Look Once) based on deep learning has achieved remarkable results in the field of image recognition. However, for the recognition of traffic lights, the YOLO algorithm still has certain limitations in terms of accuracy and real-time performance. In addition, OpenCV, as an open source computer vision library, provides a wealth of image processing and computer vision algorithms, but its application in traffic light recognition is still relatively limited.

Therefore, this research aims to improve a traffic light recognition system built on YOLO and OpenCV, raising its accuracy and real-time performance, and to provide deployment tutorials and source code so that more researchers and engineers can use and build on the system.

The significance of this study is mainly reflected in the following aspects:

  1. Improve the accuracy of traffic light recognition: By improving the YOLO algorithm, introducing more training data and optimizing the network structure, the accuracy of traffic light recognition can be improved. Accurate traffic light recognition can provide more reliable data support for traffic management and intelligent transportation systems, thereby improving traffic efficiency and safety.

  2. Improve the real-time performance of traffic light recognition: By optimizing algorithms and using high-performance hardware equipment, the real-time performance of the traffic light recognition system can be improved. Real-time traffic light recognition can provide timely decision-making basis for the autonomous driving system, thereby improving the safety and reliability of autonomous driving.

  3. Provide deployment tutorials and source code: By providing deployment tutorials and source code, we can help more researchers and engineers quickly understand and use the traffic light recognition system. This will promote the dissemination and application of traffic light recognition technology and contribute to the development of traffic management and intelligent transportation systems.

In short, improving the traffic light recognition system of YOLO and OpenCV has important practical significance and application value. By improving recognition accuracy and real-time performance, and providing deployment tutorials and source code, we can promote the development of traffic light recognition technology and provide better solutions for traffic management, intelligent transportation systems, autonomous driving and other fields.

2. Picture demonstration

2.png

3.png

4.png

3. Video demonstration

Improving the traffic light recognition system of YOLO and OpenCV (deployment tutorial & source code) on Bilibili

4. Introduction to BiFPN

In this section, we first formulate the multi-scale feature fusion problem, and then introduce the two main ideas behind the proposed BiFPN: efficient bidirectional cross-scale connections and weighted feature fusion.

4.1 Problem statement

Multi-scale feature fusion aims to aggregate features at different resolutions. Formally, given a list of multi-scale features $\vec{P}^{in} = (P^{in}_{l_1}, P^{in}_{l_2}, \ldots)$, where $P^{in}_{l_i}$ represents the feature at level $l_i$, our goal is to find a transformation $f$ that can effectively aggregate the different features and output a new list of features: $\vec{P}^{out} = f(\vec{P}^{in})$. As shown below:
(Figure: the feature network designs referenced in this section: (a) FPN, (b) PANet, (c) NAS-FPN, (d) BiFPN)
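As a concrete example (following the formulation in the EfficientDet paper, which this section summarizes), the conventional top-down FPN aggregates features at levels 3 to 7 as

$P_7^{out} = \mathrm{Conv}(P_7^{in})$
$P_6^{out} = \mathrm{Conv}(P_6^{in} + \mathrm{Resize}(P_7^{out}))$
$\cdots$
$P_3^{out} = \mathrm{Conv}(P_3^{in} + \mathrm{Resize}(P_4^{out}))$

where $\mathrm{Resize}$ is usually an upsampling operation that matches resolutions and $\mathrm{Conv}$ is a convolutional feature-processing operation.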

4.2 Cross-scale connections

FPN (CVPR 2017) pointed out the importance of feature fusion between different layers and used a relatively simple heuristic: upsample the deeper, lower-resolution features by a factor of two and add them to the shallower layers to blend them. Afterwards, various other fusion methods were tried. For example, PANet adds an extra bottom-up path after FPN's top-down path ((b) above); NAS-FPN uses neural architecture search to find a better cross-scale feature network topology, but the search costs thousands of GPU hours, and the discovered network is irregular and difficult to interpret or modify, as shown in (c) above. In short, all of the above are connection designs built from candidate operations such as Conv, Sum, Concatenate, Resize, and Skip Connection; which operations are used, and in what order, can be searched with NAS.
By studying the performance and efficiency of these three networks (shown in the table below), we observe that PANet achieves better accuracy than FPN and NAS-FPN, but at a higher cost in parameters and computation.
(Table: accuracy and efficiency comparison of FPN, PANet, and NAS-FPN)
To improve the efficiency of the model, the paper proposes several optimizations for the cross-scale connections:
First, we delete those nodes with only one input edge. The reasoning: if a node has only one input edge and no feature fusion takes place there, it contributes little to a feature network whose purpose is fusing different features. We therefore removed the intermediate nodes of P3 and P7 in PANet, which yields a simplified bidirectional network;
(Figure: the simplified bidirectional network)
Secondly, we add a skip connection between the original input node and the output node at the same scale; since they are at the same level, this fuses more features without adding much computational cost, giving figure (d) above;
Finally, unlike PANet, which has only one top-down and one bottom-up path, we treat each bidirectional (top-down plus bottom-up) path as a single feature network layer (a repeated block) and repeat the same layer multiple times to achieve higher-level feature fusion.
After these optimizations, we named the new feature network the Bidirectional Feature Pyramid Network (BiFPN). Section 7 will discuss how to introduce BiFPN into YOLOv5.

4.3 Weighted Feature Fusion

When fusing features with different resolutions, a common approach is to first resize them to the same resolution and then fuse them. The Pyramid Attention Network [Hanchao Li, Pengfei Xiong, Jie An, and Lingxue Wang. Pyramid attention networks. BMVC, 2018] introduces global self-attention upsampling to recover pixel localization, and this was studied further in NAS-FPN. All previous methods treat input features equally. However, we observe that since different input features have different resolutions, they usually contribute unequally to the output feature. To account for this, the authors propose adding an extra weight to each input and letting the network learn the importance of each input feature. Based on this idea, three weighted fusion methods were considered.
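Concretely, the three weighted fusion variants considered in the EfficientDet paper are:

$O = \sum_i w_i \cdot I_i$ (unbounded fusion: the scalar weights $w_i$ are unconstrained, which can make training unstable);
$O = \sum_i \frac{e^{w_i}}{\sum_j e^{w_j}} \cdot I_i$ (softmax-based fusion: each weight is bounded to $[0, 1]$, but the softmax noticeably slows GPU inference);
$O = \sum_i \frac{w_i}{\epsilon + \sum_j w_j} \cdot I_i$ (fast normalized fusion: each $w_i \geq 0$ is enforced by a ReLU, and a small $\epsilon$ such as 0.0001 keeps the normalization numerically stable).

BiFPN adopts the fast normalized variant; the BiFPN_Concat modules in Section 7 implement exactly this normalization.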

5. Core code explanation

5.1 Interface.py

import os
import sys
from pathlib import Path

import cv2
import numpy as np
import torch


class YOLOv5Detector:
    def __init__(self, weights, data, device='', half=False, dnn=False):
        self.weights = weights
        self.data = data
        self.device = device
        self.half = half
        self.dnn = dnn

    def load_model(self):
        FILE = Path(__file__).resolve()
        ROOT = FILE.parents[0]  # YOLOv5 root directory
        if str(ROOT) not in sys.path:
            sys.path.append(str(ROOT))  # add ROOT to PATH
        ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

        # Load model
        device = self.select_device(self.device)
        model = self.DetectMultiBackend(self.weights, device=device, dnn=self.dnn, data=self.data)
        stride, names, pt, jit, onnx, engine = model.stride, model.names, model.pt, model.jit, model.onnx, model.engine

        # Half
        half = self.half and (pt or jit or onnx or engine) and device.type != 'cpu'  # FP16 supported on limited backends with CUDA
        if pt or jit:
            model.model.half() if half else model.model.float()
        return model, stride, names, pt, jit, onnx, engine

    def run(self, model, img, stride, pt, imgsz=(640, 640), conf_thres=0.25, iou_thres=0.45, max_det=1000,
            device='', classes=None, agnostic_nms=False, augment=False, half=False):
        cal_detect = []

        device = self.select_device(device)
        names = model.module.names if hasattr(model, 'module') else model.names  # get class names

        # Set Dataloader
        im = self.letterbox(img, imgsz, stride, pt)[0]

        # Convert
        im = im.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
        im = np.ascontiguousarray(im)

        im = torch.from_numpy(im).to(device)
        im = im.half() if half else im.float()  # uint8 to fp16/32
        im /= 255  # 0 - 255 to 0.0 - 1.0
        if len(im.shape) == 3:
            im = im[None]  # expand for batch dim

        pred = model(im, augment=augment)

        pred = self.non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)
        # Process detections
        for i, det in enumerate(pred):  # detections per image
            if len(det):
                # Rescale boxes from img_size to im0 size
                det[:, :4] = self.scale_coords(im.shape[2:], det[:, :4], img.shape).round()

                # Write results
                for *xyxy, conf, cls in reversed(det):
                    c = int(cls)  # integer class
                    label = f'{names[c]}'
                    cal_detect.append([label, xyxy])
        return cal_detect

    def detect(self):
        model, stride, names, pt, jit, onnx, engine = self.load_model()   # load the model
        image = cv2.imread("./images/1.jpg")   # read the image to recognize
        results = self.run(model, image, stride, pt)   # run detection; each result is [label, box coordinates]
        for i in results:
            box = i[1]
            p1, p2 = (int(box[0]), int(box[1])), (int(box[2]), int(box[3]))
            print(i[0])
            cv2.rectangle(image, p1, p2, (0, 255, 0), thickness=3, lineType=cv2.LINE_AA)
        cv2.imshow('image', image)
        cv2.waitKey(0)

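    # The helper methods below are stubs in this excerpt; in the full project they
    # delegate to the corresponding YOLOv5 utilities (device selection, multi-backend
    # model loading, letterbox resizing, non-max suppression, and box rescaling).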
    @staticmethod
    def select_device(device):
        # ...
        pass

    @staticmethod
    def DetectMultiBackend(weights, device, dnn, data):
        # ...
        pass

    @staticmethod
    def letterbox(img, imgsz, stride, pt):
        # ...
        pass

    @staticmethod
    def non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det):
        # ...
        pass

    @staticmethod
    def scale_coords(shape, coords, img_shape):
        # ...
        pass

The program file is named Interface.py, and its main function is to load the YOLOv5 model and perform target detection.

The program first imports the necessary libraries and modules, including os, sys, pathlib, cv2, torch, etc. Then some constants and global variables are defined.

The load_model function loads the model. The weights parameter specifies the path to the model weight file, data specifies the dataset configuration file, device specifies the device type, half specifies whether to use FP16 half-precision inference, and dnn specifies whether to use OpenCV DNN for ONNX inference. The function returns the loaded model and related information.

The run function runs the model for target detection. The model parameter is the loaded model, img is the input image, stride is the model's stride, pt indicates whether a PyTorch model is used, imgsz sets the inference image size, conf_thres sets the confidence threshold, iou_thres sets the NMS IoU threshold, max_det limits the number of detections per image, device selects the device, classes filters detections by category, agnostic_nms enables class-agnostic NMS, augment enables test-time augmentation, and half enables FP16 half-precision inference. The function returns the detected targets.

The detect function ties everything together: it calls load_model to load the model, reads the image to be detected, calls run to perform detection, and finally draws the results on the image for display.

Finally, call the detect function for target detection.
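A minimal usage sketch (the weight and data paths below are placeholders, not files shipped with this article):

if __name__ == '__main__':
    # hypothetical paths; substitute the project's trained weights and dataset yaml
    detector = YOLOv5Detector(weights='./best.pt', data='./data.yaml', device='', half=False)
    detector.detect()  # reads ./images/1.jpg, prints each label, and shows the boxed image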

5.2 main.py

import cv2
import numpy as np
import torch


class ObjectDetector:
    def __init__(self, weights_path, data_path):
        self.weights_path = weights_path
        self.data_path = data_path
        self.model = None
        self.stride = None
        self.names = None
        self.pt = None
        self.jit = None
        self.onnx = None
        self.engine = None

    def load_model(self):
        device = self.select_device('')
        model = self.DetectMultiBackend(self.weights_path, device=device, dnn=False, data=self.data_path)
        self.model = model
        self.stride, self.names, self.pt, self.jit, self.onnx, self.engine = model.stride, model.names, model.pt, model.jit, model.onnx, model.engine

    def run(self, img_path):
        img = cv2.imread(img_path)
        cal_detect = []

        device = self.select_device('')
        names = self.model.module.names if hasattr(self.model, 'module') else self.model.names

        im = self.letterbox(img, (640, 640), self.stride, self.pt)[0]
        im = im.transpose((2, 0, 1))[::-1]
        im = np.ascontiguousarray(im)
        im = torch.from_numpy(im).to(device)
        im = im.float()  # half-precision is disabled here; use im.half() to enable FP16
        im /= 255
        if len(im.shape) == 3:
            im = im[None]

        pred = self.model(im, augment=False)
        pred = self.non_max_suppression(pred, 0.35, 0.05, classes=None, agnostic_nms=False, max_det=1000)

        for i, det in enumerate(pred):
            if len(det):
                det[:, :4] = self.scale_coords(im.shape[2:], det[:, :4], img.shape).round()

                for *xyxy, conf, cls in reversed(det):
                    c = int(cls)
                    label = f'{names[c]}'
                    cal_detect.append([label, xyxy, float(conf)])
        return cal_detect

    def select_device(self, device=''):
        pass

    def DetectMultiBackend(self, weights, device='', dnn=False, data=None):
        pass

    def letterbox(self, img, new_shape=(640, 640), stride=32, auto=True, scaleFill=False, scaleup=True, color=(114, 114, 114)):
        pass

    def non_max_suppression(self, pred, conf_thres=0.35, iou_thres=0.05, classes=None, agnostic_nms=False, max_det=1000):
        pass

    def scale_coords(self, img1_shape, coords, img0_shape, ratio_pad=None):
        pass


This program file is a program that uses the YOLOv5 model for target detection. It first imports some necessary libraries and modules, and then defines some functions and variables.

In the load_model function, it loads the YOLOv5 model and returns the model, stride, class names, and other information.

In the run function, it accepts an image path as input and uses the loaded model for object detection. It preprocesses the image (letterbox resizing, HWC to CHW, BGR to RGB, normalization), passes it to the model for inference, post-processes the detections with non-max suppression, and returns each object's category, coordinates, and confidence.

In the main program, it first loads the model, then reads an image, and calls the run function for target detection. Finally, it visualizes the detection results on the image and displays them.

This program file uses many libraries and modules, including argparse, platform, shutil, time, numpy, cv2, torch, PyQt5, etc. It also uses the YOLOv5 model and some auxiliary functions and tool classes to implement the target detection function.
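A minimal usage sketch (the paths here are placeholders for the project's actual files):

detector = ObjectDetector(weights_path='./best.pt', data_path='./data.yaml')  # hypothetical paths
detector.load_model()
for label, box, conf in detector.run('./images/1.jpg'):
    print(label, round(conf, 2), [int(v) for v in box])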

5.3 torch_utils.py

import datetime
import time
from contextlib import contextmanager
from pathlib import Path

import torch
import torch.distributed as dist


@contextmanager
def torch_distributed_zero_first(local_rank: int):
    """
    Decorator to make all processes in distributed training wait for each local_master to do something.
    """
    if local_rank not in [-1, 0]:
        dist.barrier(device_ids=[local_rank])
    yield
    if local_rank == 0:
        dist.barrier(device_ids=[0])


def date_modified(path=__file__):
    # return human-readable file modification date, i.e. '2021-3-26'
    t = datetime.datetime.fromtimestamp(Path(path).stat().st_mtime)
    return f'{t.year}-{t.month}-{t.day}'


def git_describe(path=Path(__file__).parent):
    # return human-readable git description, i.e. 'v5.0-5-g3e25f1e'; body truncated in this excerpt
    ...

This program file is a PyTorch utility module named torch_utils.py that contains commonly used functions and classes.

The functions and classes in this file do the following:

1. `torch_distributed_zero_first(local_rank: int)`: in distributed training, makes all processes wait until the local master process has finished an operation.

2. `date_modified(path=__file__)`: returns the file's human-readable modification date.

3. `git_describe(path=Path(__file__).parent)`: returns a human-readable git description.

4. `select_device(device='', batch_size=None)`: selects the device (CPU or GPU) for training.

5. `time_sync()`: returns a PyTorch-accurate time.

6. `profile(input, ops, n=10, device=None)`: profiles the speed, memory, and FLOPs of YOLOv5 model operations.
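Since CUDA kernels execute asynchronously, a plain time.time() call can return before queued GPU work finishes; the idea behind time_sync() can be sketched as follows (this mirrors the YOLOv5 utility):

def time_sync():
    # wait for pending CUDA kernels to finish so the wall clock is accurate
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return time.time()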
5.4 train.py

class YOLOTrainer:
    def __init__(self, hyp, opt, device, tb_writer=None):
        self.hyp = hyp
        self.opt = opt
        self.device = device
        self.tb_writer = tb_writer
        self.logger = logging.getLogger(__name__)
        
    def train(self):
        logger = self.logger
        logger.info(colorstr('hyperparameters: ') + ', '.join(f'{k}={v}' for k, v in self.hyp.items()))
        save_dir, epochs, batch_size, total_batch_size, weights, rank, freeze = \
            Path(self.opt.save_dir), self.opt.epochs, self.opt.batch_size, self.opt.total_batch_size, \
            self.opt.weights, self.opt.global_rank, self.opt.freeze

        # Directories
        wdir = save_dir / 'weights'
        wdir.mkdir(parents=True, exist_ok=True)  # make dir
        last = wdir / 'last.pt'
        best = wdir / 'best.pt'
        results_file = save_dir / 'results.txt'

        # Save run settings
        with open(save_dir / 'hyp.yaml', 'w') as f:
            yaml.dump(self.hyp, f, sort_keys=False)
        with open(save_dir / 'opt.yaml', 'w') as f:
            yaml.dump(vars(self.opt), f, sort_keys=False)

        # Configure
        plots = not self.opt.evolve  # create plots
        cuda = self.device.type != 'cpu'
        init_seeds(2 + rank)
        with open(self.opt.data) as f:
            data_dict = yaml.load(f, Loader=yaml.SafeLoader)  # data dict
        is_coco = self.opt.data.endswith('coco.yaml')

        # Logging- Doing this before checking the dataset. Might update data_dict
        loggers = {'wandb': None}  # loggers dict
        if rank in [-1, 0]:
            self.opt.hyp = self.hyp  # add hyperparameters
            run_id = torch.load(weights, map_location=self.device).get('wandb_id') if weights.endswith('.pt') and os.path.isfile(weights) else None
            wandb_logger = WandbLogger(self.opt, Path(self.opt.save_dir).stem, run_id, data_dict)
            loggers['wandb'] = wandb_logger.wandb
            data_dict = wandb_logger.data_dict
            if wandb_logger.wandb:
                weights, epochs, self.hyp = self.opt.weights, self.opt.epochs, self.opt.hyp  # WandbLogger might update weights, epochs if resuming

        nc = 1 if self.opt.single_cls else int(data_dict['nc'])  # number of classes
        names = ['item'] if self.opt.single_cls and len(data_dict['names']) != 1 else data_dict['names']  # class names
        assert len(names) == nc, '%g names found for nc=%g dataset in %s' % (len(names), nc, self.opt.data)  # check

        # Model
        pretrained = weights.endswith('.pt')
        if pretrained:
            with torch_distributed_zero_first(rank):
                attempt_download(weights)  # download if not found locally
            ckpt = torch.load(weights, map_location=self.device)  # load checkpoint
            model = Model(self.opt.cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=self.hyp.get('anchors')).to(self.device)  # create
            exclude = ['anchor'] if (self.opt.cfg or self.hyp.get('anchors')) and not self.opt.resume else []  # exclude keys
            state_dict = ckpt['model'].float().state_dict()  # to FP32
            state_dict = intersect_dicts(state_dict, model.state_dict(), exclude=exclude)  # intersect
            model.load_state_dict(state_dict, strict=False)  # load
            logger.info('Transferred %g/%g items from %s' % (len(state_dict), len(model.state_dict()), weights))  # report
        else:
            model = Model(self.opt.cfg, ch=3, nc=nc, anchors=self.hyp.get('anchors')).to(self.device)  # create
        with torch_distributed_zero_first(rank):
            check_dataset(data_dict)  # check
        train_path = data_dict['train']
        test_path = data_dict['val']

        # Freeze
        freeze = [f'model.{x}.' for x in (freeze if len(freeze) > 1 else range(freeze[0]))]  # parameter names to freeze (full or partial)
        for k, v in model.named_parameters():
            v.requires_grad = True  # train all layers
            if any(x in k for x in freeze):
                print('freezing %s' % k)
                v.requires_grad = False

        # Optimizer
        nbs = 64  # nominal batch size
        accumulate = max(round(nbs / total_batch_size), 1)  # accumulate loss before optimizing
        self.hyp['weight_decay'] *= total_batch_size * accumulate / nbs  # scale weight_decay
        logger.info(f"Scaled weight_decay = {self.hyp['weight_decay']}")

        pg0, pg1, pg2 = [], [], []  # optimizer parameter groups
        for k, v in model.named_modules():
            if hasattr(v, 'bias') and isinstance(v.bias, nn.Parameter):
                pg2.append(v.bias)  # biases
            if isinstance(v, nn.BatchNorm2d):
                pg0.append(v.weight)  # no decay
            elif hasattr(v, 'weight') and isinstance(v.weight, nn.Parameter):
                pg1.append(v.weight)  # apply decay
            if hasattr(v, 'im'):
                if hasattr(v.im, 'implicit'):           
                    pg0.append(v.im.implicit)
                else:
                    for iv in v.im:
                        pg0.append(iv.implicit)
            if hasattr(v, 'imc'):
                if hasattr(v.imc, 'implicit'):           
                    pg0.append(v.imc.implicit)
                else:
                    for iv in v.imc:
                        pg0.append(iv.implicit)
            if hasattr(v, 'imb'):
                if hasattr(v.imb, 'implicit'):
                    pg0.append(v.imb.implicit)
                else:
                    for iv in v.imb:
                        pg0.append(iv.implicit)
        # ... (optimizer construction and the training loop continue in the full file)

This program file is a script used to train the model. It imports the necessary libraries and modules, defines the training function train(), and provides some auxiliary and utility functions.

In the train() function, configuration parameters such as the hyperparameters, save path, and number of training epochs are read first. Then the model is created and the pre-trained weights are loaded. Next, the optimizer and learning-rate scheduler are configured and the loss function is defined. Finally, the training loop begins: each epoch performs forward propagation, loss computation, back-propagation, and parameter updates.

The entire training process also includes some auxiliary functions, such as logging, model saving, model evaluation, etc.

In general, this program file implements a complete model training process, including model initialization, data loading, optimizer configuration, loss function definition, and the training loop.
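As a worked example of the gradient-accumulation scaling in the optimizer section above (nbs = 64 is the nominal batch size; the batch size and weight decay values here are illustrative):

nbs, total_batch_size = 64, 16
accumulate = max(round(nbs / total_batch_size), 1)            # -> 4: step the optimizer every 4 batches
weight_decay = 0.0005 * total_batch_size * accumulate / nbs   # -> 0.0005: unchanged when bs * accumulate == nbs

In other words, when the effective batch size (total_batch_size * accumulate) matches the nominal one, the configured weight decay is kept; smaller effective batches scale it down proportionally.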

6. Overall structure of the system

Based on the above analysis, overall, this program is a project that uses the YOLOv5 model for target detection. It contains multiple files, each of which has different functions and is used to implement different modules and functions.

The following is a summary of the functions of each file:

File path | Functional overview
F:\project6\200deng(f13lw)h\code\Interface.py Load the YOLOv5 model and perform target detection
F:\project6\200deng(f13lw)h\code\main.py The main program calls the model for target detection
F:\project6\200deng(f13lw)h\code\torch_utils.py Contains some PyTorch tool functions and classes
F:\project6\200deng(f13lw)h\code\train.py Script to train the model
F:\project6\200deng(f13lw)h\code\models\common.py Contains some commonly used modules and functions
F:\project6\200deng(f13lw)h\code\models\experimental.py Contains some experimental models and features
F:\project6\200deng(f13lw)h\code\models\tf.py Contains some TensorFlow-related models and functions
F:\project6\200deng(f13lw)h\code\models\yolo.py Contains the definition and related functions of the YOLOv5 model
F:\project6\200deng(f13lw)h\code\models\__init__.py Model initialization file
F:\project6\200deng(f13lw)h\code\tools\activations.py Contains definitions of some activation functions
F:\project6\200deng(f13lw)h\code\tools\augmentations.py Contains some data augmentation functions
F:\project6\200deng(f13lw)h\code\tools\autoanchor.py Contains functions for automatic anchor box generation
F:\project6\200deng(f13lw)h\code\tools\autobatch.py Contains functions for automatic batch resizing
F:\project6\200deng(f13lw)h\code\tools\callbacks.py Contains definitions of some callback functions
F:\project6\200deng(f13lw)h\code\tools\datasets.py Contains functions and classes for data set processing
F:\project6\200deng(f13lw)h\code\tools\downloads.py Contains functions for downloading datasets and weights
F:\project6\200deng(f13lw)h\code\tools\general.py Contains some common helper functions
F:\project6\200deng(f13lw)h\code\tools\loss.py Contains definitions of some loss functions
F:\project6\200deng(f13lw)h\code\tools\metrics.py Contains definitions of some evaluation metrics
F:\project6\200deng(f13lw)h\code\tools\plots.py Contains definitions of drawing functions
F:\project6\200deng(f13lw)h\code\tools\torch_utils.py Contains some PyTorch tool functions and classes
F:\project6\200deng(f13lw)h\code\tools\__init__.py Tool initialization file
F:\project6\200deng(f13lw)h\code\tools\aws\resume.py Contains functions for AWS training recovery
F:\project6\200deng(f13lw)h\code\tools\aws\__init__.py AWS tool initialization file
F:\project6\200deng(f13lw)h\code\tools\flask_rest_api\example_request.py Contains functions for Flask REST API sample requests
F:\project6\200deng(f13lw)h\code\tools\flask_rest_api\restapi.py Contains implementation of Flask REST API
F:\project6\200deng(f13lw)h\code\tools\loggers\__init__.py Logger initialization file
F:\project6\200deng(f13lw)h\code\tools\loggers\wandb\log_dataset.py Contains functions for recording data sets using WandB
F:\project6\200deng(f13lw)h\code\tools\loggers\wandb\sweep.py Contains functions for hyperparameter search using WandB
F:\project6\200deng(f13lw)h\code\tools\loggers\wandb\wandb_utils.py Contains helper functions related to WandB
F:\project6\200deng(f13lw)h\code\tools\loggers\wandb\__init__.py WandB logger initialization file
F:\project6\200deng(f13lw)h\code\utils\activations.py Contains definitions of some activation functions
F:\project6\200deng(f13lw)h\code\utils\augmentations.py Contains some data augmentation functions
F:\project6\200deng(f13lw)h\code\utils\autoanchor.py Contains functions for automatic anchor box generation
F:\project6\200deng(f13lw)h\code\utils\autobatch.py Contains functions for automatic batch resizing
F:\project6\200deng(f13lw)h\code\utils\callbacks.py Contains definitions of some callback functions
F:\project6\200deng(f13lw)h\code\utils\datasets.py Contains functions and classes for data set processing
F:\project6\200deng(f13lw)h\code\utils\downloads.py Contains functions for downloading datasets and weights
F:\project6\200deng(f13lw)h\code\utils\general.py Contains some common helper functions
F:\project6\200deng(f13lw)h\code\utils\loss.py Contains definitions of some loss functions
F:\project6\200deng(f13lw)h\code\utils\metrics.py Contains definitions of some evaluation metrics
F:\project6\200deng(f13lw)h\code\utils\plots.py Contains definitions of drawing functions
F:\project6\200deng(f13lw)h\code\utils\torch_utils.py Contains some PyTorch tool functions and classes
F:\project6\200deng(f13lw)h\code\utils\__init__.py Tool initialization file
F:\project6\200deng(f13lw)h\code\utils\aws\resume.py Contains functions for AWS training recovery
F:\project6\200deng(f13lw)h\code\utils\aws\__init__.py AWS tool initialization file
F:\project6\200deng(f13lw)h\code\utils\flask_rest_api\example_request.py Contains functions for Flask REST API sample requests

7. Improve YOLO-BIFPN

(Figure: BiFPN structure, with the top-down path in blue, the bottom-up path in red, and the added same-level edges in purple)
The blue part in the figure is the top-down path, which conveys the semantic information of the high-level features; the red part is the bottom-up path, which conveys the positional information of the low-level features; the purple part is a new edge added between the input node and the output node on the same level.
· We delete nodes that have only one input edge. The idea is simple: if a node has only one input edge and performs no feature fusion, its contribution to a network designed to fuse different features will be small, so deleting it has little impact while simplifying the bidirectional network; see the first node to the right of P7 in (d) above.
· If the original input node and the output node are on the same level, we add an extra edge between them, so that more features are fused without adding much cost.
· Unlike PANet, which has only one top-down and one bottom-up path, we treat each bidirectional (top-down plus bottom-up) path as one feature network layer and repeat the same layer multiple times to achieve higher-level feature fusion. As in the EfficientNet-based structure shown in the figure below, BiFPN is reused multiple times; this repetition count is not set by intuition but is added as a parameter of the network design and determined using NAS techniques.

# Combined with BiFPN: learnable parameters that weight the different branches
# Concat operation for two branches
......
class BiFPN_Concat2(nn.Module):
    def __init__(self, dimension=1):
        super(BiFPN_Concat2, self).__init__()
        self.d = dimension
        # nn.Parameter turns a plain (non-trainable) tensor into a trainable one and
        # registers it with the host model, so model.parameters() includes it and the
        # optimizer updates it together with the other parameters
        self.w = nn.Parameter(torch.ones(2, dtype=torch.float32), requires_grad=True)
        self.epsilon = 0.0001

    def forward(self, x):
        w = self.w
        weight = w / (torch.sum(w, dim=0) + self.epsilon)  # normalize the weights
        # Fast normalized fusion
        x = [weight[0] * x[0], weight[1] * x[1]]
        return torch.cat(x, self.d)


# Concat operation for three branches
class BiFPN_Concat3(nn.Module):
    def __init__(self, dimension=1):
        super(BiFPN_Concat3, self).__init__()
        self.d = dimension
        self.w = nn.Parameter(torch.ones(3, dtype=torch.float32), requires_grad=True)
        self.epsilon = 0.0001

    def forward(self, x):
        w = self.w
        weight = w / (torch.sum(w, dim=0) + self.epsilon)  # normalize the weights
        # Fast normalized fusion
        x = [weight[0] * x[0], weight[1] * x[1], weight[2] * x[2]]
        return torch.cat(x, self.d)
......
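A quick self-contained check of the two-branch module (the shapes here are illustrative; it assumes the class definition above): two feature maps of the same spatial size are weighted, normalized, and concatenated along the channel dimension.

import torch
import torch.nn as nn

p4 = torch.randn(1, 128, 40, 40)
p5 = torch.randn(1, 128, 40, 40)
fuse = BiFPN_Concat2(dimension=1)
out = fuse([p4, p5])
print(out.shape)  # torch.Size([1, 256, 40, 40])

To use these modules from a model yaml, they also need to be registered in YOLOv5's parse_model (in models/yolo.py) alongside the stock Concat module, so that the output channel count is computed as the sum of the input channels; the exact edit depends on the YOLOv5 version.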

8. System integration

Below is the complete source code & environment deployment video tutorial & data set & custom UI interface
1.png

Reference Blog "Improved YOLO and OpenCV Traffic Light Recognition System (Deployment Tutorial & Source Code)"

Origin blog.csdn.net/cheng2333333/article/details/135011513