Automatic license plate recognition based on deep learning (detailed steps + source code)


Source | Learn OpenCV

Author | Sanyam

Translation | OpenCV and AI Deep Learning

Overview

This article will focus on the end-to-end implementation of ALPR, covering two processes: license plate detection and OCR of the detected license plates.

 Background introduction

    Deep learning has been one of the fastest growing technologies in the modern world. Deep learning has become a part of our daily lives, and it is everywhere from voice assistants to autonomous cars. One such application is Automatic License Plate Recognition (ALPR). As the name suggests, ALPR is a technology that uses the power of artificial intelligence and deep learning to automatically detect and recognize vehicle license plate characters.

    This article will focus on the end-to-end implementation of ALPR. It will focus on two processes, [1] license plate detection, [2] OCR of detected license plates. 

 Introduction to ALPR

    Imagine a beautiful summer day: you're driving on the highway, your favorite song is playing on the radio, and you cross the speed limit, passing through a 70 km/h zone at 90 km/h in front of a few cameras. You realize your mistake, but it's too late. A few weeks later, you receive a ticket with an image of your car as evidence. You must be wondering: do they manually check every picture and send a ticket?

    Of course not, that was sent by the ALPR system. From captured images or footage, ALPR detects and extracts your license plate number and sends you a ticket. It's all based on a simple ALPR system and a few lines of code.

    Automatic License Plate Recognition (ALPR) or ANPR is the technology responsible for reading vehicle license plates in images or video sequences using optical character recognition. With recent advances in deep learning and computer vision, these tasks can be accomplished in milliseconds.

 How ALPR Works

    ALPR is one of the widely used computer vision applications. It utilizes various methods such as object detection, OCR, image segmentation, etc. For hardware, an ALPR system only needs a camera and a good GPU. For simplicity, this blog post will focus on a two-step process.

[1] Detection: First, an image or frame of a video sequence is passed from a camera or a stored file to a detection algorithm, which detects a license plate and returns the bounding box location for that license plate.

[2] Recognition: Apply OCR to the detected license plate, recognize the characters of the license plate, and return the characters in the same order in text format. The output can be stored in a database or plotted on an image for visualization. 

Let's go through each step in detail.
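Before diving in, here is a minimal conceptual sketch of how the two stages fit together. The function names detect_plates() and recognize_text() are placeholders for illustration only; the actual detector (YOLOv4) and recognizer (PaddleOCR) are built step by step in the rest of this post.

import cv2

def alpr(image_path):
  # Hypothetical two-stage flow: detect plates, then read each one.
  image = cv2.imread(image_path)
  plates = detect_plates(image)              # step 1: license plate bounding boxes
  results = []
  for (x1, y1, x2, y2) in plates:
    plate_crop = image[y1:y2, x1:x2]         # crop the detected plate
    text = recognize_text(plate_crop)        # step 2: OCR on the cropped plate
    results.append(((x1, y1, x2, y2), text))
  return results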

 Detecting License Plates with YOLOv4

    This pipeline module is responsible for detecting license plates from images or frames of a video sequence. 

    The detection process can be done with any detector, whether a region-based detector or a single-shot detector. This blog post will focus on YOLOv4, a single-shot detector, mainly because of its good speed/accuracy trade-off and its ability to detect small objects well. YOLOv4 will be implemented using the Darknet framework.

Darknet

    Darknet is an open-source neural network framework written in C and CUDA. YOLOv4 uses the CSPDarknet53 CNN, meaning its object detection backbone is based on Darknet53, which has 53 convolutional layers. Darknet is very easy to install and use, and the setup takes just a few lines of code.

git clone https://github.com/AlexeyAB/darknet

    Next, a few build parameters are set in the Makefile according to the environment, and Darknet is compiled (for example, by running make after the edits below).

%cd darknet
sed -i 's/OPENCV=0/OPENCV=1/' Makefile
sed -i 's/GPU=0/GPU=1/' Makefile
sed -i 's/CUDNN=0/CUDNN=1/' Makefile
sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile
sed -i 's/LIBSO=0/LIBSO=1/' Makefile

    Congratulations! Darknet is now installed.

    Here, some parameters (like OpenCV, GPU, CUDA, etc.) are set to 1, i.e. set to True, because they are necessary to improve code efficiency and run computations faster.

Dataset

    Data is at the heart of any AI application, and preparing it is one of the first and most important steps. To train the YOLOv4 detector, a vehicle dataset built from Google's Open Images will be used. Google's "Open Images" is an open-source dataset containing thousands of annotated object images for object detection, segmentation, and more. The dataset contains 1500 training images and 300 validation images in YOLO format. The dataset can be downloaded from here and placed under a folder named data. Let's take a look at the dataset.

import os
import math
import matplotlib.image as img
import matplotlib.pyplot as plt

# Creating a list of image files of the dataset.
data_path = './data/obj/train/'
files = os.listdir(data_path)
img_arr = []

# Displaying 4 images only.
num = 4

# Appending the array of images to a list.
for fimg in files:
  if fimg.endswith('.jpg'):
    demo = img.imread(data_path + fimg)
    img_arr.append(demo)
    if len(img_arr) == num:
      break

# Plotting the images using matplotlib.
_, axs = plt.subplots(math.floor(num/2), math.ceil(num/2), figsize=(50, 28))
axs = axs.flatten()
for cent, ax in zip(img_arr, axs):
  ax.imshow(cent)
plt.show()

Training

    In order for the model to learn, it needs to be trained on the dataset. Before starting the training process, the configuration file (.cfg) needs to be modified. The parameters that need to be modified are the batch size, subdivisions, classes, and so on, as sketched below. Download the configuration file from here.
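The exact values depend on your GPU memory and dataset, but for a single-class license plate detector the edits typically set batch, subdivisions, max_batches, classes, and the filters of the convolutional layers just before each [yolo] block (filters = (classes + 5) * 3 = 18 for one class). The snippet below is only a sketch of applying those edits programmatically, assuming the standard YOLOv4 config layout; editing the downloaded .cfg by hand works just as well.

import re

# Assumed single-class settings; adjust batch/subdivisions to your GPU memory.
edits = {
  r'\nbatch=\d+': '\nbatch=64',
  r'subdivisions=\d+': 'subdivisions=16',
  r'max_batches *= *\d+': 'max_batches = 6000',
  r'classes=\d+': 'classes=1',
  r'filters=255': 'filters=18',  # (classes + 5) * 3 for the conv layers before each [yolo] block
}

with open('cfg/yolov4-obj.cfg') as f:
  cfg = f.read()
for pattern, value in edits.items():
  cfg = re.sub(pattern, value, cfg)
with open('cfg/yolov4-obj.cfg', 'w') as f:
  f.write(cfg)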

    Now that the data is in place and the configuration is complete, how will the model access the data? Two files are created: obj.data, which points to the training data, test data, and class information (it can be downloaded from here), and obj.names, which lists the names of all the classes (it can also be downloaded from here).
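For reference, both files follow Darknet's standard plain-text format. The sketch below writes minimal versions of them; the paths and the class name (assumed here to be license_plate) are placeholders and must match your downloaded dataset and obj.names.

# Sketch of the two Darknet data files; paths and class name are assumptions.
obj_data = """classes = 1
train = data/train.txt
valid = data/test.txt
names = data/obj.names
backup = backup/
"""

with open('data/obj.data', 'w') as f:
  f.write(obj_data)

# obj.names holds one class name per line; this detector has a single class.
with open('data/obj.names', 'w') as f:
  f.write('license_plate\n')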

    The next step is to download the pretrained weights for YOLOv4.

wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137

Now comes the big part: training!

./darknet detector train data/obj.data cfg/yolov4-obj.cfg yolov4.conv.137 -dont_show -map

    Parameters include the obj.data file, configuration file, and yolov4 pretrained weights, as described earlier.

  • -dont_show is passed when we don't want to show the output. You also need to pass it when running the code in a Google Colab notebook, since Colab does not support GUI output; not passing it will cause an error.

  • -map is passed to compute the predicted mAP after every few iterations.

    Let's wait a few hours... hooray! The model is now trained. If you want to skip the training process, you can also download our trained and fine-tuned model from here.

Evaluation

    It is important to judge how well a trained model performs on unseen data. This is a great way to know whether the model is performing well or is overfitting. For object detection tasks, one of the standard metrics is mean Average Precision, or mAP for short. At a high level, each predicted bounding box is compared to the ground-truth bounding box, and a score called mAP is computed over the whole dataset.
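Under the hood, mAP is built on Intersection over Union (IoU), the overlap ratio between a predicted box and a ground-truth box: a prediction counts as correct only if its IoU with a ground-truth plate exceeds a chosen threshold. A minimal, self-contained sketch of that building block (boxes assumed to be in (x1, y1, x2, y2) format):

def iou(box_a, box_b):
  # Boxes are (x1, y1, x2, y2); returns intersection-over-union in [0, 1].
  x1 = max(box_a[0], box_b[0])
  y1 = max(box_a[1], box_b[1])
  x2 = min(box_a[2], box_b[2])
  y2 = min(box_a[3], box_b[3])
  inter = max(0, x2 - x1) * max(0, y2 - y1)
  area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
  area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
  union = area_a + area_b - inter
  return inter / union if union > 0 else 0.0

print(iou((0, 0, 100, 50), (50, 0, 150, 50)))  # ~0.33: half of each box overlaps the other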

    The training command automatically saves a training-progress graph, which shows how the model performed: it achieved 90% mAP after 3000 epochs, in about 5.3 hours.

Inference

    Now the license plate detector is fully trained. Time to use it. To do this, we will create a function called yolo_det(). This function is responsible for detecting the bounding box of the license plate from the input vehicle image.

def yolo_det(frame, config_file, data_file, batch_size, weights, threshold, output, network, class_names, class_colors, save=False, out_path=''):
  prev_time = time.time()

  # Preprocessing the input image.
  width = darknet.network_width(network)
  height = darknet.network_height(network)
  darknet_image = darknet.make_image(width, height, 3)
  image_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
  image_resized = cv2.resize(image_rgb, (width, height))

  # Passing the image to the detector and storing the detections.
  darknet.copy_image_from_bytes(darknet_image, image_resized.tobytes())
  detections = darknet.detect_image(network, class_names, darknet_image, thresh=threshold)
  darknet.free_image(darknet_image)

  # Plotting the detections using darknet's built-in functions.
  image = darknet.draw_boxes(detections, image_resized, class_colors)
  print(detections)
  if save:
    im = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    file_name = out_path + '-det.jpg'
    cv2.imwrite(os.path.join(output, file_name), im)

  # Calculating time taken and FPS for detection.
  det_time = time.time() - prev_time
  fps = int(1/(time.time() - prev_time))
  print("Detection time: {}".format(det_time))

  # Resizing predicted bounding boxes from 416x416 to the input image resolution.
  out_size = frame.shape[:2]
  in_size = image_resized.shape[:2]
  coord, scores = resize_bbox(detections, out_size, in_size)
  return coord, scores, det_time
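Note that yolo_det() relies on a helper called resize_bbox() that is not shown above. The version below is a hedged reconstruction of what it has to do: darknet.detect_image() returns (label, confidence, (cx, cy, w, h)) tuples at the network's 416x416 resolution, and the helper rescales them to the original frame size and converts them to (x1, y1, x2, y2) corners. The exact implementation in the accompanying repository may differ.

def resize_bbox(detections, out_size, in_size):
  # out_size and in_size are (height, width) pairs; detections are assumed to be
  # (label, confidence, (cx, cy, w, h)) tuples as returned by darknet.detect_image().
  coord = []
  scores = []
  y_scale = out_size[0] / in_size[0]
  x_scale = out_size[1] / in_size[1]
  for _, conf, (cx, cy, w, h) in detections:
    x1 = int((cx - w / 2) * x_scale)
    y1 = int((cy - h / 2) * y_scale)
    x2 = int((cx + w / 2) * x_scale)
    y2 = int((cy + h / 2) * y_scale)
    coord.append([max(x1, 0), max(y1, 0), x2, y2])
    scores.append(float(conf))
  return coord, scores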

 License plate text recognition

    Now that we've trained our custom license plate detector, it's time to move on to the second step of ALPR, text recognition.

Text recognition is the process of identifying text from a scene by understanding and analyzing its underlying patterns. It is also known as Optical Character Recognition or OCR. It can also be used for various applications such as document reading, information retrieval, shelf product identification, and more. OCR can be trained or used as a pretrained model. In this article, a pretrained OCR model will be used.

PaddleOCR

    PaddleOCR is one such framework or toolkit for OCR. It provides users with a multilingual, practical OCR tool that helps them apply and train different models in a few lines of code. PaddleOCR offers many models in its toolkit, including PP-OCR, a series of high-quality pre-trained OCR models, state-of-the-art algorithms such as SRN, and popular OCR algorithms such as CRNN.

    PaddleOCR also provides different types of models, either lightweight (models that take up less memory) or heavyweight (models that take a lot of memory), as well as freely available pretrained weights. 

OCR comparison

    As mentioned in the previous section, PaddleOCR provides a variety of models, and it is always a good practice to compare which model performs well in terms of accuracy and speed.

    The models were tested on the IC15 dataset, an incidental scene text dataset containing only English words. It contains 1000 training images, but here the models are tested on a random 500 of them. The models are evaluated with a string similarity measure called Levenshtein distance: the number of single-character edits required to transform one string into another. The smaller the distance, the better the model. Three models were tested on the IC15 dataset using Levenshtein distance on a Tesla K80 GPU.
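As a concrete example, the Levenshtein distance between "MH12AB1234" and "MH12A81234" is 1 (a single substitution of 'B' with '8'). The comparison code in the repository may use a library for this, but a minimal implementation of the metric looks like the sketch below.

def levenshtein(a, b):
  # Classic dynamic-programming edit distance: the number of insertions,
  # deletions and substitutions needed to turn string a into string b.
  prev = list(range(len(b) + 1))
  for i, ca in enumerate(a, 1):
    curr = [i]
    for j, cb in enumerate(b, 1):
      cost = 0 if ca == cb else 1
      curr.append(min(prev[j] + 1,         # deletion
                      curr[j - 1] + 1,     # insertion
                      prev[j - 1] + cost)) # substitution
    prev = curr
  return prev[-1]

print(levenshtein("MH12AB1234", "MH12A81234"))  # 1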

The focus will be on the lightweight PPOCRv2 (11.6M). It strikes a good balance between speed and accuracy and is very lightweight (i.e., it takes up very little memory). It also provides support for English and Chinese. See here for the OCR comparison code.

OCR implementation

    Now, it's time to implement the selected OCR model. PaddleOCR will be implemented in a few lines of code and will work wonders for our ALPR system.

    First, let's install the required toolkits and dependencies. These dependencies and tools will help us access all the files and scripts needed for the OCR implementation.

pip install paddlepaddle-gpu
pip install "paddleocr>=2.0.1"

    After installation, the OCR needs to be initialized according to our requirements.

from paddleocr import PaddleOCR

ocr = PaddleOCR(lang='en', rec_algorithm='CRNN')

    Using PaddleOCR, we initialize the OCR. It takes several parameters:

    • lang – specifies the language to recognize

    • det_algorithm – specifies the text detection algorithm to use 

    • rec_algorithm – specifies the recognition algorithm to use

    For ALPR, only two parameters are passed, the language and the recognition algorithm. Here, we use lang for English and the CRNN recognition algorithm, also known as PPOCRv2 in this toolkit.

    This OCR can be used with just one line of code.

result = ocr.ocr(cr_img, cls=False, det=False)

    Here, cr_img is the image passed to the OCR; cls and det are parameters set to False, since the text detector and text angle classifier are not needed in our ALPR pipeline.
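Because det is False, the recognizer returns one result per input crop rather than per detected text region, which is why the code later in this post reads result[0][0] and result[0][1]. A short usage sketch (the plate string in the comment is just an example value):

# Hedged sketch of unpacking the recognition output for a single cropped plate.
result = ocr.ocr(cr_img, cls=False, det=False)
plate_text = result[0][0]   # recognized characters, e.g. 'MH12AB1234'
plate_conf = result[0][1]   # recognition confidence score
print(plate_text, plate_conf)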

Inference

    Now that the license plate detector is fully trained and the OCR is ready, it's time to put it all together and put it to work. To do this, we will create some helper functions so that all the functionality can be accessed in one place.

    First, we'll create a function that takes care of cropping the image, taking the image and coordinates as arguments; we'll call it crop().

def crop(image, coord):
  # Cropping is done by -> image[y1:y2, x1:x2].
  cr_img = image[coord[1]:coord[3], coord[0]:coord[2]]
  return cr_img

Image test

    To perform ANPR on an image, we will create a final function called test_img() that performs detection, cropping, OCR, and output plotting in one place.

    Before that, we'll initialize some variables that will be helpful throughout this blog post.

# Variables storing colors and fonts.
font = cv2.FONT_HERSHEY_SIMPLEX
blue_color = (255,0,0)
white_color = (255,255,255)
black_color = (0,0,0)
green_color = (0,255,0)
yellow_color = (178, 247, 218)

def test_img(input, config_file, weights, out_path):
  # Loading darknet network and classes along with the bbox colors.
  network, class_names, class_colors = darknet.load_network(
            config_file,
            data_file,
            weights,
            batch_size=batch_size
        )

  # Reading the image and performing YOLOv4 detection.
  img = cv2.imread(input)
  bboxes, scores, det_time = yolo_det(img, config_file, data_file, batch_size, weights, thresh, out_path, network, class_names, class_colors)

  # Extracting or cropping the license plate and applying the OCR.
  for bbox in bboxes:
    cr_img = crop(img, bbox)
    result = ocr.ocr(cr_img, cls=False, det=False)
    ocr_res = result[0][0]
    rec_conf = result[0][1]

    # Plotting the predictions using OpenCV.
    (label_width, label_height), baseline = cv2.getTextSize(ocr_res, font, 2, 3)
    top_left = tuple(map(int, [int(bbox[0]), int(bbox[1]) - (label_height + baseline)]))
    top_right = tuple(map(int, [int(bbox[0]) + label_width, int(bbox[1])]))
    org = tuple(map(int, [int(bbox[0]), int(bbox[1]) - baseline]))

    cv2.rectangle(img, (int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])), blue_color, 2)
    cv2.rectangle(img, top_left, top_right, blue_color, -1)
    cv2.putText(img, ocr_res, org, font, 2, white_color, 3)

  # Writing the output image.
  file_name = os.path.join(out_path, 'out_' + input.split('/')[-1])
  cv2.imwrite(file_name, img)

Congratulations! The pipeline to run ALPR on an image has been successfully created. Let's try it on a random image.

First, we'll import some libraries and the functions and methods needed to apply ANPR.

# Importing libraries and required functionalities.

# DeepSORT imports.
%cd ./deep_sort
from application_util import preprocessing
from deep_sort import nn_matching
from deep_sort.detection import Detection
from deep_sort.tracker import Tracker
from tools_deepsort import generate_detections as gdet
import uuid

# Required libraries.
import os
import glob
import random
import time
import cv2
import numpy as np
import darknet
import subprocess
import sys
from PIL import Image
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

# Darknet object detector imports.
%cd ./darknet
from darknet_images import load_images
from darknet_images import image_detection

# Declaring important variables.
# Path of the YOLOv4 configuration file.
config_file = './darknet/cfg/yolov4-obj.cfg'
# Path of the obj.data file.
data_file = './darknet/data/obj.data'
# Batch size of data passed to the detector.
batch_size = 1
# Path to trained YOLOv4 weights.
weights = './checkpoint/yolov4-obj_best.weights'
# Confidence threshold.
thresh = 0.6

# Calling the function.
input_dir = 'car-img.jpg'
out_path = '/content/'
test_img(input_dir, config_file, weights, out_path)

    We will now display the final output.

# Reading and displaying the output image. cv2.imshow does not work inside
# notebooks, so matplotlib is used here instead.
out_img = cv2.imread('./out_car-img.jpg')
plt.imshow(cv2.cvtColor(out_img, cv2.COLOR_BGR2RGB))
plt.show()


Video test

    After testing our ALPR on images, we can apply it to videos in a similar way: the ALPR pipeline is simply run frame by frame. Let's dive into it.

def test_vid(vid_dir, config_file, weights, out_path):
  # Declaring variables for video processing.
  cap = cv2.VideoCapture(vid_dir)
  codec = cv2.VideoWriter_fourcc(*'XVID')
  width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
  height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
  fps = int(cap.get(cv2.CAP_PROP_FPS))
  file_name = os.path.join(out_path, 'out_' + vid_dir.split('/')[-1])
  out = cv2.VideoWriter(file_name, codec, fps, (width, height))

  # Frame count variable.
  ct = 0

  # Loading darknet network and classes along with the bbox colors.
  network, class_names, class_colors = darknet.load_network(
          config_file,
          data_file,
          weights,
          batch_size=batch_size
      )

  # Reading video frame by frame.
  while(cap.isOpened()):
    ret, img = cap.read()
    if ret == True:
        print(ct)

        # Noting time for calculating FPS.
        prev_time = time.time()

        # Performing the YOLOv4 detection.
        bboxes, scores, det_time = yolo_det(img, config_file, data_file, batch_size, weights, thresh, out_path, network, class_names, class_colors)

        # Extracting or cropping the license plate and applying the OCR.
        if list(bboxes):
          for bbox in bboxes:
            cr_img = crop(img, bbox)

            result = ocr.ocr(cr_img, cls=False, det=False)
            ocr_res = result[0][0]
            rec_conf = result[0][1]

            # Plotting the predictions using OpenCV.
            txt = ocr_res
            (label_width, label_height), baseline = cv2.getTextSize(ocr_res, font, 2, 3)
            top_left = tuple(map(int, [int(bbox[0]), int(bbox[1]) - (label_height + baseline)]))
            top_right = tuple(map(int, [int(bbox[0]) + label_width, int(bbox[1])]))
            org = tuple(map(int, [int(bbox[0]), int(bbox[1]) - baseline]))

            cv2.rectangle(img, (int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])), blue_color, 2)
            cv2.rectangle(img, top_left, top_right, blue_color, -1)
            cv2.putText(img, txt, org, font, 2, white_color, 3)
            #cv2.imwrite('/content/{}.jpg'.format(ct), img)

          # Calculating time taken and FPS for the whole process.
          tot_time = time.time() - prev_time
          fps = 1/tot_time

          # Writing information onto the frame and saving it to be processed into a video.
          cv2.putText(img, 'frame: %d fps: %s' % (ct, fps),
                  (0, int(100 * 1)), cv2.FONT_HERSHEY_PLAIN, 5, (0, 0, 255), thickness=2)
          out.write(img)

        ct = ct + 1
    else:
      break

Time to try it out on a random video. You can download one from here.

# Calling the function.
input_dir = './Pexels Videos 2103099.mp4'
out_path = '/content/'
test_vid(input_dir, config_file, weights, out_path)

Display the output (for Jupyter notebooks or Colab). The output can be seen here.

from IPython.display import HTML
from base64 import b64encode

# Input video path.
save_path = './out_Pexels Videos 2103099.mp4'
# Compressed video path.
compressed_path = "./compressed.mp4"

# Compressing the video to avoid crashing the notebook (the path is quoted because it contains spaces).
os.system(f"ffmpeg -i '{save_path}' -vcodec libx264 {compressed_path}")

# Show the video.
mp4 = open(compressed_path, 'rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML("""
<video width=400 controls>
      <source src="%s" type="video/mp4">
</video>
""" % data_url)

Tracker integration

As you must have seen in the previous section, the video output is not very accurate and has several problems:

    • Jitter

    • Fluctuating OCR output

    • Lost detections

To address these problems, this section integrates a tracker with the ALPR system. But how does using a tracker solve them? Let's see.

The role of trackers in ALPR 

    As mentioned earlier, when running ALPR on video, there are a few issues that cause ALPR to be less accurate. But these problems can be corrected if trackers are used. Trackers are generally used for the following reasons:

    • Working when object detection fails

    • Assigning IDs to objects

    • Tracing object paths

    Here, the tracker is used precisely because of the problems ALPR faces: it will be used to obtain the best OCR result for each detected license plate.

    Once the tracker is implemented, it returns the coordinates and ID of each bounding box. OCR is applied to each bounding box, and the output is stored together with the ID. To reduce fluctuations in the OCR output, all bounding boxes with the same ID up to the current frame are collected, and the output with the highest OCR confidence is retained and displayed for that ID. The process will become clearer once it is implemented.

Implementation of the tracker

For this, let's create a new helper function, get_best_ocr(), to implement the logic discussed in the previous section.

def get_best_ocr(preds, rec_conf, ocr_res, track_id):
  for info in preds:
    # Check if it is the current track id.
    if info['track_id'] == track_id:
      # Check whether the ocr confidence is the highest seen so far.
      if info['ocr_conf'] < rec_conf:
        info['ocr_conf'] = rec_conf
        info['ocr_txt'] = ocr_res
      else:
        rec_conf = info['ocr_conf']
        ocr_res = info['ocr_txt']
      break
  return preds, rec_conf, ocr_res
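As a quick illustration of this logic, assume track 1 was previously read with low confidence; a higher-confidence read in the current frame overwrites the stored entry, while a later, worse read is replaced by the stored best. The values below are made up purely for the example.

# Hypothetical values, only to illustrate get_best_ocr().
preds = [{"track_id": 1, "ocr_txt": "MH12AB123", "ocr_conf": 0.71}]

# The current frame reads the same plate with higher confidence -> the stored entry is updated.
preds, conf, txt = get_best_ocr(preds, 0.93, "MH12AB1234", 1)
print(txt, conf)  # MH12AB1234 0.93

# A later, lower-confidence read is discarded in favour of the stored best.
preds, conf, txt = get_best_ocr(preds, 0.40, "MHI2AB1234", 1)
print(txt, conf)  # MH12AB1234 0.93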

Finally, we'll look at the function that runs ALPR on video with the tracker, called tracker_test_vid(). It will be just like test_vid(), but with the tracker integrated. This blog post will focus on using DeepSORT as the tracker because it is lightweight, easy to use, and provides appearance descriptors with just a few lines of code. We will use a pretrained deep association metric model called mars-small128.pb, which can be downloaded from here.

def tracker_test_vid(vid_dir, config_file, weights, out_path):
  # Declaring variables for video processing.
  cap = cv2.VideoCapture(vid_dir)
  codec = cv2.VideoWriter_fourcc(*'XVID')
  width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
  height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
  fps = int(cap.get(cv2.CAP_PROP_FPS))
  file_name = os.path.join(out_path, 'out_' + vid_dir.split('/')[-1])
  out = cv2.VideoWriter(file_name, codec, fps, (width, height))

  # Declaring variables for the tracker.
  max_cosine_distance = 0.4
  nn_budget = None

  # Initializing the tracker.
  model_filename = './model_data/mars-small128.pb'
  encoder = gdet.create_box_encoder(model_filename, batch_size=1)
  metric = nn_matching.NearestNeighborDistanceMetric("cosine", max_cosine_distance, nn_budget)
  tracker = Tracker(metric)

  # Initializing some helper variables.
  ct = 0
  preds = []
  total_obj = 0
  rec_tot_time = 1
  alpha = 0.5
  # Text size placeholders (updated by cv2.getTextSize below).
  label_width, label_height = 0, 0

  # Loading darknet network and classes along with the bbox colors.
  network, class_names, class_colors = darknet.load_network(
          config_file,
          data_file,
          weights,
          batch_size=batch_size
      )

  # Reading video frame by frame.
  while(cap.isOpened()):
    ret, img = cap.read()
    if ret == True:
        h, w = img.shape[:2]
        print(ct)

        w_scale = w/1.55
        h_scale = h/17
        top_left = (int(w_scale) + 10 + label_width, int(h_scale))

        # Blending two images, here used to make the information box transparent.
        overlay_img = img.copy()
        cv2.rectangle(img, (int(w_scale), 0), (w, int(h_scale*3.4)), black_color, -1)
        cv2.addWeighted(img, alpha, overlay_img, 1 - alpha, 0, overlay_img)

        # Noting time for calculating FPS.
        prev_time = time.time()

        # Performing the YOLOv4 detection.
        bboxes, scores, det_time = yolo_det(img, config_file, data_file, batch_size, weights, thresh, out_path, network, class_names, class_colors)

        if list(bboxes):
          # Getting appearance features of the objects.
          features = encoder(img, bboxes)
          # Storing all the required info in a list.
          detections = [Detection(bbox, score, feature) for bbox, score, feature in zip(bboxes, scores, features)]

          # Applying the tracker.
          # The tracker code flow: Kalman filter -> target association (using the Hungarian algorithm) and appearance descriptor.
          tracker.predict()
          tracker.update(detections)
          track_time = time.time() - prev_time

          # Checking if tracks exist.
          for track in tracker.tracks:
            if not track.is_confirmed() or track.time_since_update > 1:
                continue

            # Changing the track bbox to top-left, bottom-right coordinates.
            bbox = list(track.to_tlbr())

            for i in range(len(bbox)):
              if bbox[i] < 0:
                bbox[i] = 0

            # Extracting or cropping the license plate and applying the OCR.
            cr_img = crop(img, bbox)

            rec_pre_time = time.time()
            result = ocr.ocr(cr_img, cls=False, det=False)
            rec_tot_time = time.time() - rec_pre_time

            ocr_res = result[0][0]
            rec_conf = result[0][1]

            if rec_conf == 'nan':
              rec_conf = 0

            # Storing the ocr output for the corresponding track id.
            output_frame = {"track_id": track.track_id, "ocr_txt": ocr_res, "ocr_conf": rec_conf}

            # Appending the track_id to the list only if it does not exist in the list.
            if track.track_id not in list(set(ele['track_id'] for ele in preds)):
              total_obj = total_obj + 1
              preds.append(output_frame)
            # Looking for the current track in the list and updating its highest-confidence OCR output.
            else:
              preds, rec_conf, ocr_res = get_best_ocr(preds, rec_conf, ocr_res, track.track_id)

            # Plotting the predictions using OpenCV.
            txt = str(track.track_id) + '. ' + ocr_res
            (label_width, label_height), baseline = cv2.getTextSize(ocr_res, font, 2, 3)
            top_left = tuple(map(int, [int(bbox[0]), int(bbox[1]) - (label_height + baseline)]))
            top_right = tuple(map(int, [int(bbox[0]) + label_width, int(bbox[1])]))
            org = tuple(map(int, [int(bbox[0]), int(bbox[1]) - baseline]))

            cv2.rectangle(img, (int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])), blue_color, 2)
            cv2.rectangle(img, top_left, top_right, blue_color, -1)
            cv2.putText(overlay_img, txt, org, font, 2, white_color, 3)
            #cv2.imwrite('/content/{}.jpg'.format(ct), img)

          # Calculating time taken and FPS for the whole process.
          tot_time = time.time() - prev_time
          fps = 1/tot_time

          # Writing information onto the frame, with titles and values in different colors.
          if w < 2000:
            size = 1
          else:
            size = 2

          # Plotting frame count information on the frame.
          top_left = (int(w_scale) + 10, int(h_scale))
          (label_width, label_height), baseline = cv2.getTextSize('Frame count:', font, size, 2)
          cv2.putText(overlay_img, 'Frame count:', top_left, font, size, green_color, thickness=2)
          cv2.putText(overlay_img, '%d ' % (ct), (top_left[0] + label_width, top_left[1]), font, size, yellow_color, thickness=2)

          # Plotting the total FPS of the ANPR pipeline on the frame.
          (label_width, label_height), baseline = cv2.getTextSize('Frame count: ' + str(ct), font, size, 2)
          cv2.putText(overlay_img, 'Total FPS:', (top_left[0] + label_width, top_left[1]), font, size, green_color, thickness=2)
          (label_width, label_height), baseline = cv2.getTextSize('Frame count: ' + str(ct) + ' Total FPS:', font, size, 2)
          cv2.putText(overlay_img, '%s' % (int(fps)), (top_left[0] + label_width, top_left[1]), font, size, yellow_color, thickness=2)

          # Plotting the detection FPS on the frame.
          cv2.putText(overlay_img, 'Detection FPS:', (top_left[0], int(h_scale*1.7)), font, size, green_color, thickness=2)
          (label_width, label_height), baseline = cv2.getTextSize('Detection FPS:', font, size, 2)
          cv2.putText(overlay_img, '%d' % ((int(1/det_time))), (top_left[0] + label_width, int(h_scale*1.7)), font, size, yellow_color, thickness=2)

          # Plotting the recognition/OCR FPS on the frame.
          cv2.putText(overlay_img, 'Recognition FPS:', (top_left[0], int(h_scale*2.42)), font, size, green_color, thickness=2)
          (label_width, label_height), baseline = cv2.getTextSize('Recognition FPS:', font, size, 2)
          cv2.putText(overlay_img, '%s' % ((int(1/rec_tot_time))), (top_left[0] + label_width, int(h_scale*2.42)), font, size, yellow_color, thickness=2)

          cv2.imwrite('/content/{}.jpg'.format(ct), overlay_img)
          out.write(overlay_img)

        # Increasing the frame count.
        ct = ct + 1
    else:
      break

Run it similarly to the previous section.

# Calling the function.
input_dir = './Pexels Videos 2103099.mp4'
out_path = '/content/'
tracker_test_vid(input_dir, config_file, weights, out_path)

The output can be displayed as shown before. This is the final output; it can clearly be seen that all the issues discussed are greatly reduced. The ALPR now seems fairly accurate and runs at 14-15 FPS.

Conclusion

    In this blog post, we built an ALPR/ANPR system that runs at 14 to 15 FPS. We focused on a two-step process: i) license plate detection, ii) extraction of the detected plate and OCR.

    During this process, many questions may come to mind: how can it be sped up? How can accuracy be improved? How will the tracker respond to occlusion? One way to answer them is to experiment and find out for yourself.

Here, the license plate detector was trained to 90% mAP. If speed is the main goal, YOLO-tiny is preferred: it provides better speed than YOLOv4, with a trade-off in accuracy.

    Also, PaddleOCR's PP-OCR works flawlessly; it is lightweight and very accurate, with a good balance between accuracy and speed. PaddleOCR provides various models, such as SRN and the heavyweight PP-OCR, which can be used as-is or even trained from scratch to achieve the desired results.

    But the ideal approach for our ALPR is to use a tracker, which maintains the best OCR result per plate. Various other trackers, such as the OpenCV trackers, CenterTrack, and Tracktor, address different higher-level problems such as occlusion and re-identification.

    Feel free to explore references, adjust your input, and find more ways to make tasks more challenging.

The code address of this article:

https://github.com/spmallick/learnopencv/tree/master/ALPR

References

YOLOv4: https://github.com/AlexeyAB/darknet

License plate dataset: 

https://storage.googleapis.com/openimages/web/index.html

PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR

Test video: 

https://www.pexels.com/video/traffic-flow-in-the-highway-2103099/

DeepSORT: https://github.com/nwojke/deep_sort



Origin blog.csdn.net/stq054188/article/details/123720323