Audio and video concepts

  • Audio

    • 1. Sampling rate

      • The sampling rate is measured per channel; it is not the combined rate of all channels
      • For example, 16000 Hz means the continuous signal is measured 16000 times per second; each measurement is called a sampling point
    • 2. Sampling bit width (bit depth)

      • For example, 16 bit means that each sampling point is stored as 16 bits, that is, 2 bytes of data
    • 3. Channel

      • Common channel layouts are mono and stereo
      • Stereo consists of a left (L) and a right (R) channel. The two channels can carry the same data or different data; with different data, the listener hears a different sound on each side, which gives a richer sound. Samples are interleaved in the order L,R,L,R,L,R...
      • Mono carries only one channel of data (only L or only R), so the samples are arranged as L,L,L... or R,R,R...
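
The interleaved L,R,L,R... arrangement can be sketched in Python. This is a minimal illustration, assuming signed 16-bit little-endian samples; `interleave` and `deinterleave` are hypothetical helper names, not functions from any library.

```python
import struct

def interleave(left, right):
    """Pack two equal-length lists of 16-bit samples as L,R,L,R,... bytes."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same number of samples")
    frames = []
    for l, r in zip(left, right):
        frames.extend((l, r))
    # "<" = little-endian, "h" = signed 16-bit integer
    return struct.pack("<%dh" % len(frames), *frames)

def deinterleave(data):
    """Split interleaved stereo bytes back into (left, right) sample lists."""
    samples = struct.unpack("<%dh" % (len(data) // 2), data)
    return list(samples[0::2]), list(samples[1::2])

stereo = interleave([1, 2, 3], [-1, -2, -3])
print(len(stereo))            # 3 frames * 2 channels * 2 bytes = 12
print(deinterleave(stereo))   # ([1, 2, 3], [-1, -2, -3])
```
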
    • 4. Audio data size calculation

      • For example: the sampling rate is 16 kHz, the bit width is 16 bit, and the audio is mono. What is the size of the data collected in 1 minute?
      • 16000*2*60/1024/1024 ≈ 1.83 MB (samples per second × bytes per sample × seconds, converted from bytes to MB)
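
The same calculation as a small Python sketch. The function name `pcm_size_bytes` is a hypothetical helper, and the result covers raw sample data only, without any file header.

```python
def pcm_size_bytes(sample_rate_hz, bit_width, channels, seconds):
    """Size of uncompressed PCM audio in bytes."""
    bytes_per_sample = bit_width // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds

size = pcm_size_bytes(16000, 16, 1, 60)
print(size)                            # 1920000 bytes
print(round(size / 1024 / 1024, 2))    # 1.83 (MB)
```

For stereo at the same settings, passing `channels=2` doubles the result.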
      
    • 5. PCM

      • Pulse-code modulation: raw, uncompressed audio data in sampled and quantized form
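
To make sampling, quantization, and PCM concrete, here is a minimal Python sketch that samples a sine wave at 16000 Hz, quantizes each point to 16 bits, and wraps the result in a WAV container using the standard `wave` module. The 440 Hz tone and the in-memory buffer are assumptions chosen for illustration.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16000   # sampling points per second, per channel
BIT_WIDTH = 16        # bits per sampling point
TONE_HZ = 440         # assumed test tone

def sine_pcm(duration_s):
    """Sample a sine wave and quantize each point to a signed 16-bit integer."""
    count = int(SAMPLE_RATE * duration_s)
    peak = 2 ** (BIT_WIDTH - 1) - 1   # 32767
    points = [
        int(peak * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE))
        for n in range(count)
    ]
    return struct.pack("<%dh" % count, *points)

pcm = sine_pcm(1.0)
print(len(pcm))   # 16000 points * 2 bytes = 32000

# Wrap the raw PCM in a WAV container so ordinary players can open it.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(BIT_WIDTH // 8)
    wav.setframerate(SAMPLE_RATE)
    wav.writeframes(pcm)
```
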
  • Video

    • 1. Frame rate (fps)

      • The number of image frames displayed per second, or the number of times per second the graphics processor updates the image
      • Movies use a base frame rate of at least 24 fps
    • 2. Bit rate

      • Also called data rate: the number of bits transmitted per second. Audio has a bit rate as well
      • Unit: bps (bits per second). Because the bit is such a small unit, kbps, Mbps, Gbps... are commonly used
      • Audio and video file size calculation
        • File size (MB) = bit rate (bps) × duration (s) / 8 / 1024 / 1024, where /8 converts bits to bytes and each /1024 converts to KB and then to MB
        • For example, for an MP3 with a duration of 4 minutes and a bit rate of 128 kbps (128000 bps): size = 128000*4*60/8/1024/1024 ≈ 3.66 MB. Video file size is calculated in the same way
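
The file size formula can be sketched in Python as follows. This assumes 1 kbps = 1000 bps and 1 MB = 1024 × 1024 bytes, and it ignores container overhead such as headers and tags; `file_size_mb` is a hypothetical helper name.

```python
def file_size_mb(bit_rate_bps, duration_s):
    """Approximate file size in MB (1 MB = 1024 * 1024 bytes)."""
    total_bits = bit_rate_bps * duration_s
    return total_bits / 8 / 1024 / 1024

print(round(file_size_mb(128_000, 4 * 60), 2))   # 3.66
```
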
    • 3. Resolution

      • 8K: 7680×4320
      • 4K: 4096×2160 (DCI 4K; consumer UHD 4K is 3840×2160)
      • 2K: 2048×1080
      • 1080p: 1920×1080
    • 4. Refresh rate (Hz)

      • The vertical refresh rate is the number of times per second the image on the screen is redrawn. A higher refresh rate gives a more stable image and causes less eye fatigue. Above about 75 Hz, flicker and jitter are hard to notice
    • 5. YUV color space

      • It is a color encoding method used to represent the raw data of video frames. It was designed to optimize the transmission of color video signals while staying backward compatible with older black-and-white TVs, which use only the Y component.
      • Y: luma (brightness); U and V: chroma (hue and saturation)
      • Human vision is relatively insensitive to chroma, so video coding usually reduces the chroma bandwidth (chroma subsampling)
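
A minimal Python sketch of both ideas: converting one RGB pixel to YUV, and counting how many bytes a frame needs when the chroma is subsampled. It assumes the BT.601 luma weights and 8 bits per component; `rgb_to_yuv` and `frame_bytes` are hypothetical helper names.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (0-255 each) to YUV using BT.601 luma weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

def frame_bytes(width, height, subsampling):
    """Bytes in one 8-bit frame for a given chroma subsampling scheme."""
    luma = width * height
    chroma_per_plane = {
        "444": luma,        # full-resolution U and V
        "422": luma // 2,   # chroma halved horizontally
        "420": luma // 4,   # chroma halved horizontally and vertically
    }[subsampling]
    return luma + 2 * chroma_per_plane

print(rgb_to_yuv(255, 255, 255))   # white: full luma, chroma close to zero
for scheme in ("444", "422", "420"):
    print(scheme, frame_bytes(1920, 1080, scheme))
```

Under these assumptions a YUV420 frame takes half the bytes of a YUV444 frame, which is the bandwidth saving that reduced chroma resolution buys.
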
  • Audio and video codecs

    • Encoding

      • Uses a specific compression technique to convert a video stream in one format into a stream in another format; in essence, an algorithm that reduces the number of bytes
      • 1. Video coding: YUV420/422 -> H.264, RGB888 -> H.264 (video coding works on a sequence of pictures; encoding a single picture this way is meaningless)
      • 2. Audio coding: PCM (raw) -> AAC, PCM (raw) -> G.726, PCM (raw) -> G.711
    • Decoding

      • Uses a specific decompression technique to convert a video stream in one format into a stream in another format
      • Hardware decoding: relies on dedicated hardware, decoding the video through the video acceleration function of the graphics card. In effect a dedicated circuit decodes the video; the work is offloaded to the GPU, which reduces CPU consumption.
      • Software decoding: does not rely on a dedicated hardware decoding module; the CPU does the decoding. Because the CPU is not an independent module and is shared by all programs, this increases the CPU load
    • Transcoding

      • Video transcoding converts a video signal from one format to another
      • Video transcoding
        • 1. Resolution switching
        • 2. Change the frame rate
        • 3. Change encoding parameters such as bit rate
      • Audio transcoding
        • 1. Change the sampling rate, when the input and output sampling rates differ
        • 2. Change the number of channels
        • 3. Change the bit width
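
The three audio transcoding operations can be sketched on plain Python lists. This is a deliberately naive illustration, assuming signed 16-bit input samples; all three function names are hypothetical, and a real resampler would low-pass filter before dropping samples.

```python
def stereo_to_mono(interleaved):
    """Channel change: average each L,R pair of samples into one mono sample."""
    left, right = interleaved[0::2], interleaved[1::2]
    return [(l + r) // 2 for l, r in zip(left, right)]

def to_8bit_unsigned(samples_16bit):
    """Bit width change: signed 16-bit to unsigned 8-bit by dropping low bits."""
    return [(s >> 8) + 128 for s in samples_16bit]

def decimate(samples, factor):
    """Naive sampling rate change: keep every `factor`-th sample.
    A real resampler filters first to avoid aliasing."""
    return samples[::factor]

stereo = [1000, 3000, -2000, -4000, 32767, 32767]   # L,R,L,R,L,R
mono = stereo_to_mono(stereo)
print(mono)                          # [2000, -3000, 32767]
print(to_8bit_unsigned(mono))        # [135, 116, 255]
print(decimate(list(range(8)), 2))   # [0, 2, 4, 6]
```
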
  • Timestamp

    • DTS (Decoding Time Stamp): the time at which a frame should be decoded
    • PTS (Presentation Time Stamp): the time at which a frame should be displayed
    • Because of B-frames in an I/P/B stream, a later frame that a B-frame references must be decoded before its own display time, so the DTS order and the PTS order of the frames differ
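
The reordering can be illustrated with a small Python sketch. The four-frame stream and its timestamp values are hypothetical, chosen only to show that decode order and display order diverge once B-frames are present.

```python
from collections import namedtuple

Frame = namedtuple("Frame", ["kind", "dts", "pts"])

# Hypothetical stream: display order is I B B P, but the P frame must be
# decoded before the two B frames that reference it.
decode_order = [
    Frame("I", dts=0, pts=1),
    Frame("P", dts=1, pts=4),
    Frame("B", dts=2, pts=2),
    Frame("B", dts=3, pts=3),
]

# A player decodes in DTS order, then presents in PTS order.
display_order = sorted(decode_order, key=lambda f: f.pts)

print([f.kind for f in decode_order])    # ['I', 'P', 'B', 'B']
print([f.kind for f in display_order])   # ['I', 'B', 'B', 'P']
```
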
  • Basic concepts

    • Real-time streaming:

      • Real-time stream: audio and video streams transmitted in real time
    • Video playback:

      • The counterpart of real-time streaming: the video is recorded first and played back later
    • Server:

      • Serves the client: it provides resources to the client and stores client data.
      • For example, the video recorded by a camera may be viewed from multiple apps, so the video must be saved in a shared place that all of them can access
    • Client:

      • Also called the client side. It is the counterpart of the server: a program that provides services to the user locally.
    • Streaming media:

      • The following are all called streaming media services
      • Forwarding:

        • Transmit the data stream to other networks
      • Storage:

        • Store the data locally
      • Transcoding:

        • Convert the data stream from one format to another
    • Push mode:

      • When a notification arrives, all relevant information is "pushed" to the observer as parameters. (In streaming, the client pushes the stream to the server, for example a video shot on a mobile phone and uploaded to the server)
    • Pull mode:

      • When a notification arrives, it carries no relevant information; the observer must actively "pull" the information. (In streaming, the client reads data directly from a link, for example viewing a camera's local video through a web page)
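
The difference between the two notification styles can be sketched in Python. The class names `PushSubject` and `PullSubject` and the `"frame-1"` payload are hypothetical; the sketch only contrasts where the data travels, not a real streaming stack.

```python
class PushSubject:
    """Push mode: the notification itself carries the data."""
    def __init__(self):
        self.observers = []

    def publish(self, data):
        for callback in self.observers:
            callback(data)

class PullSubject:
    """Pull mode: the notification is empty; observers fetch the data."""
    def __init__(self):
        self.observers = []
        self.latest = None

    def publish(self, data):
        self.latest = data
        for callback in self.observers:
            callback()

pushed, pulled = [], []

push_subject = PushSubject()
push_subject.observers.append(pushed.append)
push_subject.publish("frame-1")

pull_subject = PullSubject()
pull_subject.observers.append(lambda: pulled.append(pull_subject.latest))
pull_subject.publish("frame-1")

print(pushed, pulled)   # ['frame-1'] ['frame-1']
```
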
  • Audio and video streaming concepts

    • ES stream:

      • Elementary Stream: the raw data stream that comes directly from the encoder.
    • PES stream:

      • Packetized Elementary Stream: the ES divided into packets, called PES packets. It is a data layout used to carry the ES.
    • TS stream:

      • Transport Stream: the PES further divided into fixed-size packets, called TS packets. It is a data layout used to transmit the ES.
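
As an illustration of the TS data layout, here is a minimal Python sketch that decodes the 4-byte header of one TS packet. It assumes the standard MPEG-TS layout of fixed 188-byte packets starting with the sync byte 0x47; the sample packet is hypothetical and `parse_ts_header` is not a library function.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47

def parse_ts_header(packet):
    """Decode selected fields of the 4-byte header of one MPEG-TS packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid TS packet")
    return {
        # Set when a new PES packet begins in this TS packet's payload.
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit packet identifier: tells which stream the payload belongs to.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        # 4-bit counter used to detect lost packets.
        "continuity_counter": packet[3] & 0x0F,
    }

# Hypothetical packet: PID 0x100, payload start flag set, counter 5.
packet = bytes([0x47, 0x41, 0x00, 0x15]) + bytes(184)
print(parse_ts_header(packet))
```
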
    • RTSP stream:

      • RTSP (Real Time Streaming Protocol)
      • Defined in RFC 2326, it is an application-layer protocol in the TCP/IP protocol suite. (Typical use: displaying a camera feed on a web page)
    • RTMP stream:

      • Real Time Messaging Protocol, a protocol from Adobe (typical use: live webcasts)
    • HLS stream:

      • HLS (HTTP Live Streaming) is Apple's adaptive bit rate streaming technology, mainly used for audio and video services on PCs and Apple devices. A stream consists of an m3u8 index file and TS media segment files
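
To show how the m3u8 index relates to the TS segments, here is a minimal Python sketch that reads a media playlist and lists its segments. The playlist text and the segment file names are hypothetical, and the parser handles only the `#EXTINF` tag, not the full playlist syntax.

```python
SAMPLE_PLAYLIST = """\
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
segment0.ts
#EXTINF:8.5,
segment1.ts
#EXT-X-ENDLIST
"""

def parse_media_playlist(text):
    """Return (duration_seconds, uri) pairs from a simple media playlist."""
    segments = []
    duration = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,<optional title>"
            duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line and not line.startswith("#") and duration is not None:
            # A non-tag line after #EXTINF is the URI of that segment.
            segments.append((duration, line))
            duration = None
    return segments

segments = parse_media_playlist(SAMPLE_PLAYLIST)
print(segments)                      # [(10.0, 'segment0.ts'), (8.5, 'segment1.ts')]
print(sum(d for d, _ in segments))   # 18.5
```
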

Encapsulation (mux): multiplexing. The encoded video and audio streams are organized into one file in a certain format, with a file header and a file trailer added

Decapsulation (demux): demultiplexing. The file is parsed according to its format to recover the original audio and video streams
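
The idea can be sketched with a toy Python example. This is not a real container format: the packet tuples, timestamps, and payloads are hypothetical, and the header and trailer are omitted. It only shows interleaving by timestamp (mux) and splitting by stream (demux).

```python
import heapq

# Hypothetical packets: (timestamp_ms, stream_id, payload)
video = [(0, "video", b"V0"), (40, "video", b"V1"), (80, "video", b"V2")]
audio = [(0, "audio", b"A0"), (64, "audio", b"A1")]

def mux(*streams):
    """Interleave already-sorted streams into one sequence ordered by timestamp."""
    return list(heapq.merge(*streams))

def demux(packets):
    """Split a multiplexed sequence back into per-stream packet lists."""
    streams = {}
    for timestamp, stream_id, payload in packets:
        streams.setdefault(stream_id, []).append((timestamp, stream_id, payload))
    return streams

container = mux(video, audio)
print([p[2] for p in container])   # [b'A0', b'V0', b'V1', b'A1', b'V2']

recovered = demux(container)
print(recovered["video"] == video, recovered["audio"] == audio)   # True True
```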

Origin blog.csdn.net/weixin_37921201/article/details/114212806