A hands-on guide to a video frame-capture architecture based on serverless technology

Preface

Live video streaming is an innovative form of online entertainment. With its multi-person, real-time interaction, it is widely used in many industries such as e-commerce, gaming, online education, and entertainment. As network infrastructure keeps improving and the demand for social entertainment grows, live streaming continues to penetrate everyday life and occupy users' fragmented leisure time. The technical capabilities supporting live streaming have also kept improving, driving the live video market from 21.25 billion yuan in 2014 to 54.85 billion yuan in 2020, and it is expected to keep growing rapidly at a rate of about 12.8% over the next five years.

Overview of video frame capture requirements

The live streaming industry is subject to a growing body of laws, regulations, and policies. Under general industry standards and operating procedures, every live streaming platform is obliged to deal with illegal live content and improper interaction between streamers and viewers, and to take measures that help the industry develop in a more standardized way. Detecting illegal content in a live stream as early as possible is a common challenge that every platform has to face. Video frame capture is the routine operation that supports this kind of content review: frames can be captured at different frequencies according to the risk level of the live stream, and the saved pictures can be uploaded to a self-built or third-party content review platform for the recognition of scenes such as pornography, politically sensitive content, and advertising. In addition, some specific business requirements are also realized through video screenshots, for example online classroom applications that intelligently analyze how attentively students are listening.

Analysis of the video frame capture technical architecture

The frame capture operation on a video stream can be implemented with an FFmpeg command, and FFmpeg's frame capture command is very simple to use. Every time a picture is captured, the image can be uploaded to Object Storage Service (OSS), and the corresponding frame information can be sent to the message queue Kafka. In this way, the review service (which can be a third-party service or a self-built one) can obtain the frame information from Kafka and pull the corresponding image from OSS for processing. In this architecture, Kafka is introduced to ease the load on the review service during peak business periods through an asynchronous processing mechanism.
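
To make the downstream side of this architecture concrete, the following is a minimal sketch of what the review service's consumer loop might look like, using the kafka-python and oss2 SDKs. It assumes, as in the function code later in this article, that each Kafka message carries the file name of a picture uploaded under the snapshot/ prefix of a bucket named snapshot; the endpoint, AccessKey pair, and broker address are placeholders.

import oss2
from kafka import KafkaConsumer

# Placeholders: use your own AccessKey, OSS endpoint, bucket and Kafka brokers
auth = oss2.Auth('<access_key_id>', '<access_key_secret>')
bucket = oss2.Bucket(auth, 'http://oss-cn-beijing.aliyuncs.com', 'snapshot')
consumer = KafkaConsumer('snapshot', bootstrap_servers='XX.XX.XX.XX:9092')

for message in consumer:
    # The message body is assumed to carry the picture file name
    filename = message.value.decode('utf-8')
    local_path = '/tmp/' + filename
    # Pull the captured picture from OSS and hand it to the review logic
    bucket.get_object_to_file('snapshot/' + filename, local_path)
    # review(local_path)  # e.g. pornography / politics / advertising detection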

Although FFmpeg is simple to use, frame capture consumes a lot of CPU. If frames are captured from each video stream at a fixed rate of one frame per second, a 16-core ECS instance can roughly handle the frame capture of 100 video streams at the same time. To guarantee service stability during peak business hours, a large number of ECS instances would need to be prepared to deploy the frame capture service. However, most Internet applications show obvious peaks and valleys: prime time every evening is the business peak, and the business volume drops significantly after midnight. Such fluctuations make overall resource planning very challenging. If the frame capture service is deployed on an ECS cluster of fixed size, there are two very obvious drawbacks:

  1. To support the business peak, the cluster size must be estimated from the number of users during peak periods, which causes huge waste during off-peak periods.
  2. In some scenarios, such as when a celebrity goes live, the business volume surges suddenly and the cluster may need to be expanded temporarily. In this case, the expansion speed often lags behind the growth of business traffic, resulting in degraded processing for part of the business.

To improve resource utilization, applications can also be deployed in containers on elastic ECS instances so that the cluster size dynamically adapts to changes in real business volume. In practice, however, the elastic scaling strategy of such a scheme is complicated to implement, the scaling capability lags behind demand, and the effect may not be very good. The fundamental reason is that in a traditional service architecture, an application keeps running for a long time after it is started and concurrently handles multiple business requests while running; no matter how the business volume changes, the computing power occupied by the application does not change substantially.

Is there a more straightforward way: after a live stream starts, pull up the corresponding computing power to take over the frame capture task, and automatically release that computing power once the stream is closed? Such an approach would not require permanently resident application instances, would achieve true on-demand allocation of computing resources, and would not need any extra mechanism to dynamically adjust the cluster size of the frame capture service. It would be the ideal solution.

As a representative of cloud-native serverless technology, Alibaba Cloud Function Compute FC realizes exactly this idea.

Serverless architecture based on Function Compute FC

Function Compute FC is an event-driven, fully managed computing service. With Function Compute, users do not need to purchase and manage infrastructure such as servers; they only need to write and upload code. Function Compute automatically prepares computing resources, runs tasks elastically and reliably, and provides features such as log query, performance monitoring, and alerting. With Function Compute FC, you can quickly build any type of application or service, and you only pay for the resources actually consumed by your tasks.

Function Compute FC provides an event-driven computing model: the execution of a function is driven by events. A function can be triggered by its user or by some other event source. You can create a trigger on a specified function; the trigger describes a set of rules, and when an event meets these rules, the event source triggers the corresponding function. For example, for an HTTP trigger, a user's HTTP request can trigger the function; for an OSS trigger, a newly created or modified file on OSS can trigger the function. In the video frame capture scenario, the business program only needs to actively trigger a frame capture function before each live stream starts to be pushed. Therefore, the original frame capture architecture can be migrated to the Function Compute platform with only minor adjustments and enjoy the value of serverless.

Implementing the video frame capture technology on a serverless architecture

Now we use a few simple steps to build a serverless architecture based on Function Compute FC that meets the video frame capture requirements. Function Compute FC provides native runtimes for multiple languages such as Node.js, Python, PHP, and Java. Scripting languages like Python are especially convenient, because the code can be modified directly on the Function Compute platform, which makes them very simple to use. The sample code in this article is implemented in Python.

Of course, Function Compute FC has no restriction on development languages; any mainstream language is well supported. The Custom Runtime provided by Function Compute FC can build a customized runtime environment for any language. A Custom Runtime is essentially an HTTP server: this HTTP server takes over all requests from the Function Compute system, including event invocations and HTTP function invocations.
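
As a rough illustration of this idea (not used in the rest of this article), a Custom Runtime could be as small as the following Python sketch; it assumes Flask as the HTTP framework and takes the runtime's default listening port to be 9000:

from flask import Flask, request

app = Flask(__name__)

# A Custom Runtime is just an HTTP server: every invocation from Function
# Compute, whether an event call or an HTTP function call, arrives here as
# an ordinary HTTP request
@app.route('/', defaults={'path': ''}, methods=['GET', 'POST'])
@app.route('/<path:path>', methods=['GET', 'POST'])
def handle(path):
    body = request.get_data()
    # ... place the actual business logic (e.g. frame capture) here ...
    return 'Snapshot OK!\n'

if __name__ == '__main__':
    # The listening port must match the port configured for the custom
    # runtime (assumed here to be the default 9000)
    app.run(host='0.0.0.0', port=9000)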

Output video stream

We could develop against a third-party video streaming service, but to make local debugging easier, we can produce the video stream with a self-built RTMP service. The simplest way is to buy an ECS instance and deploy Nginx with the nginx-rtmp-module module to provide the RTMP service. Many related tutorials can be found on the Internet, so this article will not repeat them.

With the RTMP service in place, we can download a compiled FFmpeg package from http://ffmpeg.org/ and push a local video file to the RTMP service with an FFmpeg command, for example:

ffmpeg -re -i test.flv -vcodec copy -acodec aac -ar 44100 -f flv rtmp://xxx.xxx.xxx.xxx:1935/stream/test 
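
In this command, -re reads the input at its native frame rate to simulate a live push, -vcodec copy forwards the video track without re-encoding, -acodec aac -ar 44100 re-encodes the audio to AAC at 44.1 kHz, and -f flv muxes the output into the FLV format expected by RTMP.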

Next, we can open the corresponding RTMP live address, rtmp://xxx.xxx.xxx.xxx:1935/stream/test, in a player that supports RTMP to watch the live stream.

Install Funcraft

Funcraft is a tool that supports serverless application deployment and helps users conveniently manage resources such as Function Compute, API Gateway, and Log Service. Funcraft performs development, build, and deployment operations through a resource configuration file, template.yml, which greatly reduces the configuration and deployment workload when we implement the serverless architecture on Function Compute FC.

There are three ways to install Funcraft: via the npm package manager, by downloading a binary, or via the Homebrew package manager. For environments where npm is not installed, the easiest way is to download the binary. We can download the Funcraft package for the corresponding platform from https://github.com/alibaba/funcraft/releases and use it directly after decompression. We can check whether Funcraft has been installed successfully with the following command:

fun --version

If the command returns the Funcraft version number, such as 3.6.20, the installation was successful.

Before using fun for the first time, you need to run the fun config command for initial configuration. This step asks for general information such as the Alibaba Cloud Account ID, Access Key Id, Secret Access Key, and Default Region Name, which can be obtained from the upper right corner of the Function Compute console homepage. Other settings, such as the timeout, can simply use the default values.

Configure OSS

Since the pictures saved after frame capture are uploaded to Object Storage Service (OSS) for storage, we need to activate the Alibaba Cloud OSS service and create the corresponding bucket. For specific steps, refer to https://www.aliyun.com/product/oss .
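
Before wiring the bucket into the function, it can be handy to verify locally that it is reachable. A minimal check with the oss2 SDK might look like the sketch below; unlike the function code later in this article, it uses a long-term AccessKey and the public endpoint of the bucket's region, and all values are placeholders:

import oss2

# Placeholders: your own AccessKey pair, the public endpoint of the bucket's
# region, and the bucket created for the captured pictures
auth = oss2.Auth('<access_key_id>', '<access_key_secret>')
bucket = oss2.Bucket(auth, 'http://oss-cn-beijing.aliyuncs.com', 'snapshot')

# Upload a small test object and read it back
bucket.put_object('example/hello.txt', b'hello oss')
print(bucket.get_object('example/hello.txt').read())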

Configure Log Service SLS

Log Service (SLS) is a one-stop log data service provided by Alibaba Cloud. To store function logs in Log Service, you need to configure a log project and a Logstore for the service that the function belongs to, and grant the service permission to access Log Service. Function logs are then written to the configured Logstore, and all functions under the same service write to the same Logstore. With the execution logs stored in Log Service, you can later perform code debugging, fault analysis, data analysis, and other operations based on them.

We can follow the documentation on creating log projects and Logstores to configure Log Service SLS. Make sure the log project and Logstore have been created successfully, because their names will be needed when the function is deployed.

Write the function

Now we use the simplest possible piece of Python code to experience how Function Compute FC can perform the frame capture operation. To make it easier to understand, we temporarily simplify the business logic to just two actions:

  1. Capture one picture with an FFmpeg command
  2. Save it to OSS

import json, logging, os, oss2, subprocess

HELLO_WORLD = b'Snapshot OK!\n'
OSS_BUCKET_NAME = 'snapshot'

def handler(environ, start_response):
    logger = logging.getLogger()
    context = environ['fc.context']
    request_uri = environ['fc.request_uri']
    for k, v in environ.items():
        if k.startswith('HTTP_'):
            # HTTP headers are not needed in this example
            pass
    try:
        request_body_size = int(environ.get('CONTENT_LENGTH', 0))
    except ValueError:
        request_body_size = 0
    # Read the live stream address from the HTTP request body
    request_body = environ['wsgi.input'].read(request_body_size)
    rtmp_url = request_body.decode("UTF-8")
    # Capture a single picture with the FFmpeg command
    cmd = ['/code/ffmpeg', '-i', rtmp_url, '-frames:v', '1', '/tmp/snapshot.png']
    try:
        subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
    except subprocess.CalledProcessError as exc:
        err_ret = {'returncode': exc.returncode, 'cmd': exc.cmd,
                   'output': exc.output.decode(), 'stderr': exc.stderr.decode()}
        logger.error(json.dumps(err_ret))
        raise Exception(context.request_id + ' transcode failure')
    # Upload the picture to OSS using the temporary credentials of the function
    creds = context.credentials
    auth = oss2.StsAuth(creds.access_key_id, creds.access_key_secret, creds.security_token)
    bucket = oss2.Bucket(auth, 'http://oss-{}-internal.aliyuncs.com'.format(context.region), OSS_BUCKET_NAME)
    logger.info('upload pictures to OSS ...')
    for filename in os.listdir("/tmp"):
        bucket.put_object_from_file("example/" + filename, "/tmp/" + filename)
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [HELLO_WORLD]

Let's analyze this code. In addition to the Python standard library, the Python runtime of Function Compute FC also ships some commonly used modules, including oss2, which is used to operate Alibaba Cloud OSS inside functions. Therefore we can import the oss2 module directly in the code.

Function Compute FC integrates several types of triggers; this example function uses an HTTP trigger, so every HTTP request triggers one execution of the function. For Python code behind an HTTP trigger, the entry point is the handler function, whose environ parameter carries the information of the client that calls the function as well as context information. We parse the address of the RTMP live stream out of the HTTP request body and capture one picture with an FFmpeg command.

In this code, the ffmpeg executable is located in the /code directory and is invoked through the path /code/ffmpeg. This is because when the function is deployed, the ffmpeg executable is packaged into that directory together with the code. The function deployment section below explains how the function code and the executable are packaged together.

When uploading the image files saved under the /tmp directory to OSS, we can obtain the credentials for accessing OSS directly from the function context, so there is no need to read an accessKey or accessSecret from a configuration file, which reduces the workload.

Deploy the function

First, we create a working directory locally and a subdirectory named code under it, and copy the Linux build of the ffmpeg executable into the code directory, so that ffmpeg can be invoked in the code through the path /code/ffmpeg.

Next comes the most important step: create a template.yml file in the current working directory that describes all the deployment information.

ROSTemplateFormatVersion: '2015-09-01'
Transform: 'Aliyun::Serverless-2018-04-03'
Resources:
  # Service
  snapshotService:
    Type: 'Aliyun::Serverless::Service'
    Properties:
      Description: 'Snapshot Demo'
      Policies:
        - AliyunOSSFullAccess
      # The log project and Logstore created earlier
      LogConfig:
        Project: fc-bj-pro
        Logstore: fc-log
    # Function
    snapshot:
      Type: 'Aliyun::Serverless::Function'
      Properties:
        Handler: index.handler
        Runtime: python3
        MemorySize: 128
        Timeout: 600
        CodeUri: './code'
      # HTTP trigger
      Events:
        http-test:
          Type: HTTP
          Properties:
            AuthType: ANONYMOUS
            Methods: ['POST']

The configuration is relatively simple. We first need to define a service. A service is the unit of resource management in Function Compute. From the business point of view, an application can be split into multiple services; from the resource-usage point of view, a service can consist of multiple functions. For example, a data processing service may be split into data preparation and data processing: the data preparation function needs few resources and can use a small instance type, while the data processing function needs more resources and can use a large instance type. A service must be created before a function, and all functions under the same service share settings such as service authorization and log configuration. In this template, the service we create is named snapshotService; it has full permissions on OSS and references the log project and Logstore created earlier.

In the configuration of function instance specifications, since each computing instance only needs to process one video stream, we choose the lowest specification, which is an instance with 128M memory.

Next, we define the function itself and configure its runtime, entry point, code directory, timeout, and other information, and define an HTTP trigger for it. In this template, the function is named snapshot, the runtime is Python 3, and an HTTP trigger named http-test is defined.

In the working directory, run fun deploy. If you see a prompt such as service snapshotService deploy success, the code and the ffmpeg program have been packaged and deployed to the cloud.

In the service and function menu of the console, we can see the uploaded service and function information, and even view and modify the function code online.

Invoke the function

Since this is an HTTP function, we can use curl or another HTTP tool such as Postman to send an HTTP request to Function Compute FC and verify the result of the frame capture operation. The Function Compute FC console also provides a visual interface for verifying the function, in which an HTTP request can be issued quickly.
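
The request can also be sent from Python with the requests library. The sketch below posts the RTMP stream address as the request body to the function's HTTP trigger URL; the URL shown is only a placeholder in the usual format, so use the address displayed for the trigger in the console:

import requests

# Placeholder: replace with the HTTP trigger URL shown in the console
FUNC_URL = 'https://<account-id>.<region>.fc.aliyuncs.com/2016-08-15/proxy/snapshotService/snapshot/'

# The request body carries the address of the live stream to capture
resp = requests.post(FUNC_URL, data='rtmp://xxx.xxx.xxx.xxx:1935/stream/test')
print(resp.status_code, resp.text)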

If the function executes successfully, we can go to the OSS console and check whether the captured image has been uploaded. At this point we have built the most basic serverless video frame capture architecture: an HTTP request triggers Function Compute to capture one picture from the video stream and upload it to OSS.

Continuous frame capture

Capturing a single picture is very simple: after the FFmpeg command finishes, the picture in the temporary folder is uploaded to OSS and the function's life cycle ends. Single-picture capture already satisfies many business scenarios, but if frames need to be captured continuously at a fixed frequency and the saved pictures uploaded to OSS in real time, the code needs some changes.

Configure message queue Kafka

To reduce the workload of the content review service during peak business periods, we introduce the message queue Kafka between the frame capture service and the content review service, so that the review service can process the saved pictures asynchronously by consuming messages from Kafka. In the video frame capture architecture, Kafka plays a very important role in passing information: the higher the concurrency of live streams and the higher the capture frequency, the greater the pressure on Kafka, and during business peaks Kafka must stay stable under high load. Using the Message Queue for Apache Kafka service provided by Alibaba Cloud greatly reduces the maintenance workload of the Kafka cluster and gives us a dynamically scalable, highly available Kafka service in the simplest way.

We can open the Kafka activation page and purchase a Kafka instance of the appropriate specification according to the actual scenario. In the basic information page of the Kafka console, we can see the default access point of the Kafka instance.

Next, we enter the Topic management page and create a Topic for the frame capture service.

The default access point of the Kafka instance and the Topic name are the pieces of information we will need in the next steps.

Install Kafka client SDK

Before that, we need a few extra steps so that the function is able to write to Kafka.

Because the function needs the Kafka SDK, we can install the kafka-python module with the Funcraft tool combined with the Python package manager pip:

fun install --runtime python3 --package-type pip kafka-python

After the command finishes, a .fun folder is generated in the working directory, and the installed dependency package is placed inside it.

Enable access to resources in the VPC

By default, Function Compute cannot access resources inside a VPC. Because the function needs to reach the Kafka service deployed in the VPC, we have to manually configure the VPC settings and related permissions for the service. We can follow the documentation on allowing functions to access resources in a VPC to connect the function with the Kafka service. The principle is to grant the elastic network interface (ENI) permission to access the VPC and to attach this ENI to the instance that executes the function, so that the function can access the resources in your VPC.

Code

The following FFmpeg command captures frames continuously at a specified frequency:

ffmpeg -i rtmp://xxx.xxx.xxx.xxx:1935/stream/test -r 1 -strftime 1 /tmp/snapshot/%Y%m%d%H%M%S.jpg
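
Here -r 1 outputs one frame per second, and -strftime 1 expands the date/time pattern in the output file name, so each picture is named after the moment it was captured.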

While this command runs, the Python process waits until the video stream stops pushing. Therefore we modify the function code to start an additional scanning process. The scanning process keeps checking the picture directory: whenever it finds a newly generated picture, it uploads the picture to OSS, sends the frame information to Kafka, and finally deletes the picture from the directory.

import json, logging, os, time, oss2, subprocess
from multiprocessing import Process
from kafka import KafkaProducer

HELLO_WORLD = b'Snapshot OK!\n'
OSS_BUCKET_NAME = 'snapshot'
logger = logging.getLogger()
output_dir = '/tmp/snapshot'

# Scan the picture directory, upload new pictures to OSS and notify Kafka
def scan(bucket, producer):
    flag = 1
    while flag:
        for filename in os.listdir(output_dir):
            if filename == 'over':
                # The FFmpeg command has finished, stop scanning after this round
                flag = 0
                continue
            logger.info("found image: %s", filename)
            try:
                full_path = os.path.join(output_dir, filename)
                # Upload to OSS
                bucket.put_object_from_file("snapshot/" + filename, full_path)
                # Send the frame information to Kafka
                producer.send('snapshot', filename.encode('utf-8'))
                # Delete the local picture
                os.remove(full_path)
            except Exception as e:
                logger.error("got exception: %s for %s", str(e), filename)
        time.sleep(1)

def handler(environ, start_response):
    logger = logging.getLogger()
    context = environ['fc.context']
    # Create the picture output directory
    if not os.path.exists(output_dir):
        os.mkdir(output_dir)
    # Parse the HTTP request to obtain the live stream address
    request_uri = environ['fc.request_uri']
    for k, v in environ.items():
        if k.startswith('HTTP_'):
            pass
    try:
        request_body_size = int(environ.get('CONTENT_LENGTH', 0))
    except ValueError:
        request_body_size = 0
    request_body = environ['wsgi.input'].read(request_body_size)
    rtmp_url = request_body.decode("UTF-8")
    # Create the Kafka producer
    producer = KafkaProducer(bootstrap_servers='XX.XX.XX.XX:9092,XX.XX.XX.XX:9092')
    # Create the OSS bucket client with the temporary credentials of the function
    creds = context.credentials
    auth = oss2.StsAuth(creds.access_key_id, creds.access_key_secret, creds.security_token)
    bucket = oss2.Bucket(auth, 'http://oss-{}-internal.aliyuncs.com'.format(context.region), OSS_BUCKET_NAME)
    # Start the scanning process
    scan_process = Process(target=scan, args=(bucket, producer))
    scan_process.start()
    # Capture frames continuously at one frame per second with FFmpeg
    cmd = ["/code/ffmpeg", "-y", "-i", rtmp_url, "-f", "image2", "-r", "1",
           "-strftime", "1", os.path.join(output_dir, "%Y%m%d%H%M%S.jpg")]
    try:
        subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
    except subprocess.CalledProcessError as exc:
        err_ret = {'returncode': exc.returncode, 'cmd': exc.cmd,
                   'output': exc.output.decode(), 'stderr': exc.stderr.decode()}
        logger.error(json.dumps(err_ret))
        raise Exception(context.request_id + ' transcode failure')
    # Write the flag file so that the scanning process finishes its work
    os.system("touch %s" % os.path.join(output_dir, 'over'))
    scan_process.join()
    producer.close()
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [HELLO_WORLD]

Further optimization

Frame capture for long videos

The default elastic instance of Function Compute FC has an upper limit of 600 seconds, that is 10 minutes, on function execution time. In other words, after a function is triggered, if the task has not finished within 10 minutes, the function exits automatically. This restriction affects frame capture for video streams that last longer than 10 minutes, and long videos are very common. How can we work around this limit and capture frames for long videos? There are three solutions:

  1. Each function captures only one frame: when the capture frequency is low, or the stream only needs to be captured at a few specific points in time, the life cycle of the function does not have to match the playback cycle of the video stream; each function can capture a single picture after it starts. A custom trigger program can start the function at the required points in time, or Serverless Workflow can be used to orchestrate more complex functions. For more information about Serverless Workflow, see https://www.aliyun.com/product/fnf
  2. Relay through multiple functions: Function Compute FC has a built-in fc2 module that allows functions to call each other. In this way, the running time of each frame capture function can be kept within 10 minutes, for example using 8 minutes as a fixed running period; before one function ends, it starts another function to continue the capture task until the video stream ends (see the sketch after this list). This solution suits scenarios where the precision of the capture frequency is not particularly critical, because the handover between two functions takes about one second, so strict capture timing cannot be guaranteed.
  3. Use performance instances: besides the default elastic instances, Function Compute FC also provides performance instances. Performance instances are large-specification instances with higher resource limits, more applicable scenarios, and the ability to exceed the 10-minute execution time limit. Performance instances scale up more slowly and their elasticity is not as good as that of elastic instances, but their flexibility can be improved by combining single-instance multi-concurrency with the reserved mode. For details, refer to the documentation on single-instance multi-concurrency and the reserved mode.
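
Referring to solution 2, the relay between functions could look roughly like the sketch below. It is only an illustration under several assumptions: it uses the built-in fc2 SDK with the temporary credentials from the function context, the relayed function snapshotRelay is a hypothetical event-triggered frame capture function (not the HTTP function defined earlier), and the endpoint format and asynchronous invocation header are written from memory rather than taken from this article.

import fc2

def relay_snapshot(context, rtmp_url):
    # Build an FC client from the temporary credentials of the current function
    creds = context.credentials
    client = fc2.Client(
        endpoint='http://{}.{}.fc.aliyuncs.com'.format(context.account_id, context.region),
        accessKeyID=creds.access_key_id,
        accessKeySecret=creds.access_key_secret,
        securityToken=creds.security_token)
    # Asynchronously invoke the next (hypothetical) frame capture function with
    # the stream address as payload, so the current function can exit before
    # hitting the 10-minute execution limit
    client.invoke_function(
        'snapshotService', 'snapshotRelay',
        payload=rtmp_url.encode('utf-8'),
        headers={'x-fc-invocation-type': 'Async'})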

Cost optimization

Function Compute provides a rich set of billing models, competitive pricing, and detailed resource usage metrics. Combined with the application-centric architecture of serverless, resource management becomes unprecedentedly convenient and the cost is highly competitive in different scenarios.

Depending on resource specifications and elasticity requirements, Function Compute provides two billing modes: subscription (prepaid, by year or month) and pay-as-you-go (postpaid). Under normal circumstances the pay-as-you-go mode is enough: you only pay for the Function Compute resources actually used, with no need to purchase resources in advance. However, according to actual daily resource usage, users can also choose the prepaid mode to reduce cost. In the prepaid mode, the user purchases a certain amount of computing power in advance, and during the life cycle of this pre-purchased computing power the resources consumed by running functions are deducted from it second by second. The unit price of the prepaid mode is lower than that of the pay-as-you-go mode.

On the Resource Center page of the Function Compute console, you can see the actual resource usage under the current account at a glance, including the stable part and the elastic part of the usage, and allocate prepaid and postpaid resources reasonably based on this information. In the resource usage chart, the green curve represents the actual daily resource usage, and the yellow line represents the usage that can be deducted by prepaid resources. We can appropriately increase the proportion of prepaid resources according to the actual situation, so that more of the usage is covered by prepaid resources, thereby reducing the overall resource cost.

Summary

In the video frame capture scenario, the value of serverless technology is very obvious. The innovative instance scheduling engine of Function Compute brings the advantages of cloud computing in efficiency, performance, cost, and openness into full play. As of February 2021, more than five large Internet companies had started to implement video frame capture based on Function Compute FC. Across different capture requirements, it saves at least 20% of the cost compared with traditional ECS-based deployments and greatly reduces the system maintenance workload. In terms of migration, these companies completed the whole process of pre-research, development, debugging, testing, and launch within a week, and began to enjoy the huge dividends that serverless technology brings in the cloud computing era.

This article is the original content of Alibaba Cloud and may not be reproduced without permission.
