Streaming Video with Python and Flask

Foreword

I recently had a task: transmit the results of object detection on video to the front end. The task itself is straightforward. In practice, each frame of the video is run through detection and then returned as an image stream for display in the browser. The request, however, was not to return the video stream but only the detection results, which puzzled me at first: in theory only the data stream needs to be returned, yet something about it felt off. So I wrote this article to sort out the whole process of streaming video back to the client. This post draws mainly on Video Streaming with Flask and Flask Video Streaming Revisited; for the code, see flask-video-streaming.

Streaming

There are two main application scenarios for using streams in Flask:

  • Large responses
    When returning a large block of data, generating and returning it with a stream is the better solution. Alternatively, you can write the response to disk and return the file with flask.send_file(), but that adds extra I/O overhead.
  • Real-time data transmission
    Real-time data, such as video or audio, can be transmitted as a stream.
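
To make the first scenario concrete, here is a minimal sketch (the route name and data are made up for illustration) that streams a large CSV response row by row instead of building it all in memory or writing it to disk first:

```python
from flask import Flask, Response

app = Flask(__name__)

def generate_csv(rows):
    # Yield the response piece by piece; Flask sends each piece to the
    # client as it is produced, so the full CSV never sits in memory.
    yield 'id,value\n'
    for i, value in enumerate(rows):
        yield f'{i},{value}\n'

@app.route('/report.csv')
def report():
    rows = (n * n for n in range(100000))  # stand-in for a real query
    return Response(generate_csv(rows), mimetype='text/csv')
```

Because `rows` is itself a generator, neither the query results nor the rendered CSV are ever fully materialized.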

Flask implements streaming

Flask supports streaming responses through the use of generator functions. A generator function looks like this:

def gen():
    yield 1
    yield 2
    yield 3

With this simple understanding of generators, the following example shows how to use a stream to generate and return a large data report:

from flask import Response, render_template
from app.models import Stock

def generate_stock_table():
    yield render_template('stock_header.html')
    for stock in Stock.query.all():
        yield render_template('stock_row.html', stock=stock)
    yield render_template('stock_footer.html')

@app.route('/stock-table')
def stock_table():
    return Response(generate_stock_table())

In this example, the route that returns the stream initializes a Response object with the generator function, and Flask then takes care of invoking the generator and sending the result to the client in chunks. The advantage is that even when the program must produce a large block of data, streaming keeps the response from growing in memory as the data block grows.
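
One caveat worth noting (not covered in the example above): the generator body runs after the view function returns, so by default the request context is no longer available inside it. Flask provides stream_with_context to keep the context alive; the route below is a hypothetical sketch of the pattern:

```python
from flask import Flask, Response, request, stream_with_context

app = Flask(__name__)

@app.route('/echo')
def echo():
    def generate():
        # Without stream_with_context, accessing `request` here would
        # fail, because the request context is torn down once the view
        # returns and streaming begins.
        yield 'You asked for: '
        yield request.args.get('q', 'nothing')
    return Response(stream_with_context(generate()))
```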

In addition to splitting large chunks of data into pieces, streams can also provide multipart responses. The most important use case here is returning video or audio streams for playback. An interesting property of such streams is that each part can replace the previous one in the page, which lets the stream "play" in a browser window. A multipart response consists of a header declaring one of the multipart content types, followed by parts separated by a boundary marker, each part with its own content type. Here is the structure of a multipart video stream:

HTTP/1.1 200 OK
Content-Type: multipart/x-mixed-replace; boundary=frame

--frame
Content-Type: image/jpeg

<jpeg data here>
--frame
Content-Type: image/jpeg

<jpeg data here>
...

As shown above, the Content-Type header is set to multipart/x-mixed-replace and defines a boundary. Each frame is then preceded by the boundary string, prefixed with -- and placed on its own line, followed by the part's own Content-Type header; each part may optionally also include a Content-Length header stating the length of the payload in bytes.
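
As a sketch of the framing just described, a small helper (hypothetical, for illustration only) can assemble one part of such a body, including the optional Content-Length header:

```python
def multipart_chunk(jpeg_bytes: bytes, boundary: bytes = b'frame') -> bytes:
    # One part of a multipart/x-mixed-replace body: the boundary line,
    # the part headers, a blank line, then the payload.
    return (b'--' + boundary + b'\r\n'
            + b'Content-Type: image/jpeg\r\n'
            + b'Content-Length: ' + str(len(jpeg_bytes)).encode() + b'\r\n'
            + b'\r\n' + jpeg_bytes + b'\r\n')
```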

With this background in place, the next step is to build a real-time video streaming server. The principle is simple: each frame of the video is sent back to the client as one part of a multipart response.

Build a live video stream

Below is a simple Flask web application that serves a Motion JPEG stream. Motion JPEG is widely used; it has low latency, but the quality is not the best, since JPEG compression is not very efficient for motion video.
Get video frames from the camera:

from time import time

class Camera(object):
    def __init__(self):
        self.frames = [open(f + '.jpg', 'rb').read() for f in ['1', '2', '3']]

    def get_frame(self):
        return self.frames[int(time()) % 3]

The code above is a stand-in for debugging without a camera device: it builds the image stream by reading images stored in the project.

#!/usr/bin/env python
from flask import Flask, render_template, Response
from camera import Camera

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

def gen(camera):
    while True:
        frame = camera.get_frame()
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')

@app.route('/video_feed')
def video_feed():
    return Response(gen(Camera()),
                    mimetype='multipart/x-mixed-replace; boundary=frame')

if __name__ == '__main__':
    app.run(host='0.0.0.0', debug=True)

This application defines a Camera class responsible for providing the sequence of frames. The front-end HTML content:

<html>
  <head>
    <title>Video Streaming Demonstration</title>
  </head>
  <body>
    <h1>Video Streaming Demonstration</h1>
    <img src="{{ url_for('video_feed') }}">
  </body>
</html>

The video_feed route calls the generator function gen(), which in turn calls the Camera class to obtain video frames. The whole flow is fairly simple. However, streaming has some limitations. When a Flask application handles a regular request, the request cycle is short: a web worker receives the request, invokes a handler function, and returns a response to the client. With streaming, the client must stay connected for the duration of the stream. Moreover, when the client disconnects, the server may keep trying to serve it, making it hard to shut the stream down; and the server can only serve as many streaming clients as it has web workers. There are ways around these problems, namely coroutines or multithreading. Next, let's see how to optimize the program above.
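
On the disconnect problem specifically: when the client goes away, Flask closes the response generator, which raises GeneratorExit inside it. Wrapping the loop in try/finally therefore gives a place to release per-client resources. The sketch below (with a stand-in camera, for illustration) shows the idea:

```python
def gen(camera):
    try:
        while True:
            frame = camera.get_frame()
            yield (b'--frame\r\n'
                   b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')
    finally:
        # Runs when the client disconnects and Flask closes the generator.
        print('Client disconnected, stream closed.')
```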

Video streaming optimization

The video streaming program above has two main problems: how to end the data stream, and how to serve multiple clients from a single service.
For the first problem, the idea is to record the timestamp of the last client access. If the difference between that timestamp and the current time exceeds a threshold, the background thread stops. (Ten seconds works; the threshold should not be too small, or normal requests will be cut off.) Here is the optimized code:

  1. Define the BaseCamera base class:
class BaseCamera(object):
    thread = None  # background thread that reads frames from camera
    frame = None  # current frame is stored here by background thread
    last_access = 0  # time of last client access to the camera
    # ...

    @staticmethod
    def frames():
        """Generator that returns frames from the camera."""
        raise RuntimeError('Must be implemented by subclasses.')

    @classmethod
    def _thread(cls):
        """Camera background thread."""
        print('Starting camera thread.')
        frames_iterator = cls.frames()
        for frame in frames_iterator:
            BaseCamera.frame = frame

            # if no client has asked for a frame in the last
            # 10 seconds, stop the thread
            if time.time() - BaseCamera.last_access > 10:
                frames_iterator.close()
                print('Stopping camera thread due to inactivity.')
                break
        BaseCamera.thread = None
  2. Derive Camera from BaseCamera:
class Camera(BaseCamera):
    """An emulated camera implementation that streams a repeated sequence of
    files 1.jpg, 2.jpg and 3.jpg at a rate of one frame per second."""
    imgs = [open(f + '.jpg', 'rb').read() for f in ['1', '2', '3']]

    @staticmethod
    def frames():
        while True:
            time.sleep(1)
            yield Camera.imgs[int(time.time()) % 3]

For the second problem, multithreading can improve performance with multiple client requests. During testing, however, the server was found to consume a lot of CPU. The reason is that there is no synchronization between the background thread capturing frames and the generators serving those frames to clients: both run as fast as possible, regardless of the other's speed.
So there needs to be a mechanism by which the generator delivers only fresh frames to the client. If the generator's delivery loop runs faster than the camera thread's frame rate, the generator should wait until a new frame is available, matching its speed to the camera's. If the delivery loop runs slower than the camera thread, it should never fall behind: instead of processing every frame, it should skip frames and always deliver the latest one. The solution is to have the camera thread signal the running generators when a new frame is available; each generator then blocks waiting for that signal before sending its next frame.
To avoid putting event-handling logic in the generator itself, we implement a custom event class that uses the caller's thread id to automatically create and manage a separate event for each client thread.

class CameraEvent(object):
    """An Event-like class that signals all active clients when a new frame is available.
    """
    def __init__(self):
        self.events = {}

    def wait(self):
        """Invoked from each client's thread to wait for the next frame."""
        ident = get_ident()
        if ident not in self.events:
            # this is a new client
            # add an entry for it in the self.events dict
            # each entry has two elements, a threading.Event() and a timestamp
            self.events[ident] = [threading.Event(), time.time()]
        return self.events[ident][0].wait()

    def set(self):
        """Invoked by the camera thread when a new frame is available."""
        now = time.time()
        remove = None
        for ident, event in self.events.items():
            if not event[0].is_set():
                # if this client's event is not set, then set it
                # also update the last set timestamp to now
                event[0].set()
                event[1] = now
            else:
                # if the client's event is already set, it means the client
                # did not process a previous frame
                # if the event stays set for more than 5 seconds, then assume
                # the client is gone and remove it
                if now - event[1] > 5:
                    remove = ident
        if remove:
            del self.events[remove]

    def clear(self):
        """Invoked from each client's thread after a frame was processed."""
        self.events[get_ident()][0].clear()
class BaseCamera(object):
    # ...
    event = CameraEvent()

    # ...

    def get_frame(self):
        """Return the current camera frame."""
        BaseCamera.last_access = time.time()

        # wait for a signal from the camera thread
        BaseCamera.event.wait()
        BaseCamera.event.clear()

        return BaseCamera.frame

    @classmethod
    def _thread(cls):
        # ...
        for frame in frames_iterator:
            BaseCamera.frame = frame
            BaseCamera.event.set()  # send signal to clients

            # ...

Overall code:
base_camera.py

import time
import threading
try:
    from greenlet import getcurrent as get_ident
except ImportError:
    try:
        from thread import get_ident
    except ImportError:
        from _thread import get_ident


class CameraEvent(object):
    """An Event-like class that signals all active clients when a new frame is
    available.
    """
    def __init__(self):
        self.events = {}

    def wait(self):
        """Invoked from each client's thread to wait for the next frame."""
        ident = get_ident()
        if ident not in self.events:
            # this is a new client
            # add an entry for it in the self.events dict
            # each entry has two elements, a threading.Event() and a timestamp
            self.events[ident] = [threading.Event(), time.time()]
        return self.events[ident][0].wait()

    def set(self):
        """Invoked by the camera thread when a new frame is available."""
        now = time.time()
        remove = None
        for ident, event in self.events.items():
            if not event[0].is_set():
                # if this client's event is not set, then set it
                # also update the last set timestamp to now
                event[0].set()
                event[1] = now
            else:
                # if the client's event is already set, it means the client
                # did not process a previous frame
                # if the event stays set for more than 5 seconds, then assume
                # the client is gone and remove it
                if now - event[1] > 5:
                    remove = ident
        if remove:
            del self.events[remove]

    def clear(self):
        """Invoked from each client's thread after a frame was processed."""
        self.events[get_ident()][0].clear()


class BaseCamera(object):
    thread = None  # background thread that reads frames from camera
    frame = None  # current frame is stored here by background thread
    last_access = 0  # time of last client access to the camera
    event = CameraEvent()

    def __init__(self):
        """Start the background camera thread if it isn't running yet."""
        if BaseCamera.thread is None:
            BaseCamera.last_access = time.time()

            # start background frame thread
            BaseCamera.thread = threading.Thread(target=self._thread)
            BaseCamera.thread.start()

            # wait until first frame is available
            BaseCamera.event.wait()

    def get_frame(self):
        """Return the current camera frame."""
        BaseCamera.last_access = time.time()

        # wait for a signal from the camera thread
        BaseCamera.event.wait()
        BaseCamera.event.clear()

        return BaseCamera.frame

    @staticmethod
    def frames():
        """Generator that returns frames from the camera."""
        raise RuntimeError('Must be implemented by subclasses.')

    @classmethod
    def _thread(cls):
        """Camera background thread."""
        print('Starting camera thread.')
        frames_iterator = cls.frames()
        for frame in frames_iterator:
            BaseCamera.frame = frame
            BaseCamera.event.set()  # send signal to clients
            time.sleep(0)

            # if no client has asked for a frame in the last
            # 10 seconds, stop the thread
            if time.time() - BaseCamera.last_access > 10:
                frames_iterator.close()
                print('Stopping camera thread due to inactivity.')
                break
        BaseCamera.thread = None

camera.py

import os
import cv2
from base_camera import BaseCamera


class Camera(BaseCamera):
    video_source = 0

    def __init__(self):
        if os.environ.get('OPENCV_CAMERA_SOURCE'):
            Camera.set_video_source(int(os.environ['OPENCV_CAMERA_SOURCE']))
        super(Camera, self).__init__()

    @staticmethod
    def set_video_source(source):
        Camera.video_source = source

    @staticmethod
    def frames():
        camera = cv2.VideoCapture(Camera.video_source)
        if not camera.isOpened():
            raise RuntimeError('Could not start camera.')

        while True:
            # read current frame; stop if the camera returns no image
            success, img = camera.read()
            if not success:
                break

            # encode as a jpeg image and return it
            yield cv2.imencode('.jpg', img)[1].tobytes()

app.py

#!/usr/bin/env python
from importlib import import_module
import os
from flask import Flask, render_template, Response

# import camera driver
if os.environ.get('CAMERA'):
    Camera = import_module('camera_' + os.environ['CAMERA']).Camera
else:
    from camera import Camera

# Raspberry Pi camera module (requires picamera package)
# from camera_pi import Camera

app = Flask(__name__)


@app.route('/')
def index():
    """Video streaming home page."""
    return render_template('index.html')


def gen(camera):
    """Video streaming generator function."""
    yield b'--frame\r\n'
    while True:
        frame = camera.get_frame()
        yield b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n--frame\r\n'


@app.route('/video_feed')
def video_feed():
    """Video streaming route. Put this in the src attribute of an img tag."""
    return Response(gen(Camera()),
                    mimetype='multipart/x-mixed-replace; boundary=frame')


if __name__ == '__main__':
    app.run(host='0.0.0.0', threaded=True)

The full code is available on my GitHub: Flask-video-Stream

Origin: blog.csdn.net/u012655441/article/details/124798348