Facial expression and action transfer learning based on the PaddleGAN project (5): video frame interpolation, colorization, and super-resolution repair

Learning objectives

Experience PaddleGAN-based image and video colorization, frame interpolation, super-resolution, and related restoration functions, using models including DAIN, DeOldify, RealSR, DeepRemaster, EDVR, and PP-MSVSR.

1. Algorithm principles

1. Frame interpolation model DAIN

The DAIN model explicitly detects occlusions by exploiting depth information, and develops a depth-aware flow projection layer to synthesize intermediate flows. This gives it strong results on video frame interpolation.
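For intuition, the simplest interpolation baseline just blends the two neighbouring frames. DAIN improves on this by warping along depth-aware intermediate flows before blending. A minimal NumPy sketch of the naive baseline for contrast (illustrative only, not PaddleGAN code):

```python
import numpy as np

def midpoint_frame(frame0, frame1):
    """Naive intermediate frame: a plain average of the two neighbours.

    This ghosts wherever objects move; DAIN instead synthesizes an
    intermediate optical flow weighted by depth (so nearer, occluding
    pixels dominate), warps both frames along it, and then blends.
    """
    return (frame0.astype(np.float64) + frame1.astype(np.float64)) / 2.0
```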

2. Colorization model DeOldify

The DeOldify model combines a self-attention GAN generator with the NoGAN training strategy to colorize black-and-white images and videos.

3. Colorization model DeepRemaster

The DeepRemaster model is based on spatio-temporal convolutional networks and a self-attention mechanism, and can colorize a video according to any number of input reference frames.

4. Super-resolution model RealSR

The RealSR model designs a novel realistic downsampling framework for real-world images by estimating various blur kernels and the actual noise distribution. With this framework, low-resolution images that share the same domain as real-world photographs can be generated, and on top of it a real-world super-resolution model aimed at better perceptual quality is proposed. Extensive experiments on synthetic noisy data and real-world images show that the model effectively reduces noise and improves visual quality.
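The downsampling idea can be sketched in a few lines. The real RealSR pipeline estimates blur kernels and noise patches from actual photographs; in this toy version (illustrative only, not the PaddleGAN implementation) a fixed 3x3 average blur and Gaussian noise stand in:

```python
import numpy as np

def degrade(hr, scale=2, noise_std=2.0, seed=0):
    """Toy RealSR-style degradation: blur -> downsample -> inject noise.

    `hr` is a 2-D grayscale image; the output is a noisy low-resolution
    image meant to live in (roughly) the same domain as a real photo.
    """
    rng = np.random.default_rng(seed)
    h, w = hr.shape
    padded = np.pad(hr.astype(np.float64), 1, mode='edge')
    # 3x3 average blur as a stand-in for an estimated blur kernel.
    blurred = sum(padded[i:i + h, j:j + w]
                  for i in range(3) for j in range(3)) / 9.0
    lr = blurred[::scale, ::scale]
    # Gaussian noise as a stand-in for the estimated noise distribution.
    lr = lr + rng.normal(0.0, noise_std, lr.shape)
    return np.clip(lr, 0.0, 255.0)
```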

5. Super-resolution model EDVR

The EDVR model proposes a novel video restoration framework with enhanced deformable convolutions. First, a Pyramid, Cascading and Deformable (PCD) alignment module handles large motions, using deformable convolutions to complete alignment at the feature level in a coarse-to-fine manner. Second, a Temporal and Spatial Attention (TSA) fusion module integrates attention across both time and space to strengthen restoration.

6. Video super-resolution model PP-MSVSR

Baidu's self-developed PP-MSVSR is a multi-stage video super-resolution architecture with a local fusion module, an auxiliary loss, and a refined alignment module that gradually refine the enhancement results. Specifically, the first stage uses a local fusion module that performs local feature fusion before propagation, strengthening the fusion of cross-frame features. The second stage introduces an auxiliary loss so that the features produced by the propagation module retain more HR-relevant spatial information. The third stage introduces a refined alignment module to fully exploit the feature information from the preceding propagation module. Extensive experiments confirm that PP-MSVSR performs excellently on the Vid4 dataset, reaching a PSNR of 28.13 dB with only 1.45M parameters.
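PSNR, the metric quoted above, measures reconstruction fidelity in decibels against a reference frame. A minimal NumPy implementation (for illustration; PaddleGAN ships its own metric code):

```python
import numpy as np

def psnr(reference, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    diff = reference.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For 8-bit frames, an RMS error of about 10 gray levels works out to roughly 28 dB, which gives a feel for the scale of the Vid4 number.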

PP-MSVSR is provided in two model sizes, so developers can choose flexibly for their scenario: PP-MSVSR (1.45M parameters) and PP-MSVSR-L (7.42M parameters).

ppgan.apps.PPMSVSRPredictor(output='output', weight_path=None, num_frames=10)
ppgan.apps.PPMSVSRLargePredictor(output='output', weight_path=None, num_frames=10)

Parameters:
output (str, optional): output folder path. Default: output.
weight_path (str, optional): path of the weights to load; if not set, the default weights are downloaded from the cloud. Default: None.
num_frames (int, optional): number of model input frames. Default: 10. The more input frames, the better the super-resolution effect tends to be.

7. BasicVSR series of video super-resolution models

Guided by the four basic modules of VSR (propagation, alignment, aggregation, and upsampling), BasicVSR reconsiders some of the most essential components. By adding a few minor designs and reusing existing components, a clean BasicVSR is obtained that achieves attractive improvements in both speed and restoration quality over many state-of-the-art algorithms.

Meanwhile, by adding an information refill mechanism and a coupled propagation scheme to facilitate information aggregation, BasicVSR can be extended to IconVSR, which can serve as a strong baseline for future VSR methods.

BasicVSR++ redesigns BasicVSR by proposing second-order grid propagation and flow-guided deformable alignment. By strengthening the recurrent framework with enhanced propagation and alignment, BasicVSR++ can more effectively exploit the spatio-temporal information of misaligned video frames. Under similar computational constraints, the new components improve performance; in particular, BasicVSR++ outperforms BasicVSR by 0.82 dB PSNR with a similar number of parameters. BasicVSR++ took three first places and one runner-up in the NTIRE 2021 Video Super-Resolution and Compressed Video Enhancement Challenge.
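The propagation idea behind the BasicVSR family can be caricatured without any learning: each output frame aggregates information from both temporal directions. A toy sketch, with simple exponential blending standing in for the learned recurrent networks and flow-based alignment (illustrative only):

```python
import numpy as np

def bidirectional_propagate(frames, keep=0.5):
    """Blend each frame with exponentially decayed past and future frames.

    BasicVSR replaces this fixed blending with learned recurrent features
    propagated forward and backward, aligned via optical flow.
    """
    def sweep(seq):
        acc = np.zeros_like(seq[0], dtype=np.float64)
        out = []
        for f in seq:
            acc = keep * acc + (1.0 - keep) * f.astype(np.float64)
            out.append(acc.copy())
        return out

    forward = sweep(frames)                  # past -> future
    backward = sweep(frames[::-1])[::-1]     # future -> past, realigned
    return [(a + b) / 2.0 for a, b in zip(forward, backward)]
```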

ppgan.apps.BasicVSRPredictor(output='output', weight_path=None, num_frames=10)
ppgan.apps.IconVSRPredictor(output='output', weight_path=None, num_frames=10)
ppgan.apps.BasiVSRPlusPlusPredictor(output='output', weight_path=None, num_frames=10)

Parameters:
output (str, optional): output folder path. Default: output.
weight_path (str, optional): path of the weights to load; if not set, the default weights are downloaded from the cloud. Default: None.
num_frames (int, optional): number of model input frames. Default: 10. The more input frames, the better the super-resolution effect tends to be.


2. Experience the effects

1. Environment preparation:

Please refer to the earlier article in this series, Facial expression and action transfer learning based on the PaddleGAN project (1): environment configuration, to download PaddleGAN and set up the environment. In addition, you need to install related modules. Enter the PaddleGAN directory:

cd PaddleGAN
pip install -r requirements.txt
pip install -v -e .
pip install dlib
pip install ppgan

If the installation of dlib fails, please refer to my other article documenting how the dlib installation failure was resolved via conda-forge; hopefully it helps.
Installation may take a few minutes depending on network speed.

2. Prepare old photos:

Download the video data of Xiaobing Zhang Ga (Zhang Ga the Soldier Boy); a screenshot is shown below.

3. Experience effect:

3.1 DeOldify colorization

python tools/video-enhance.py --input /home/work/z2.png    --process_order  DeOldify --output /home/work/output

The effect: (output image not reproduced here)

3.2 RealSR super-resolution

python tools/video-enhance.py --input /home/work/z2.png    --process_order RealSR  --output /home/work/output

The effect (output image not reproduced here) does not seem visible to the naked eye.

3.3 Other models

The command-line usage of the other models is similar, and several models can be chained in a single run. For example, use the three models DAIN (frame interpolation), DeOldify (colorization), and PP-MSVSR (super-resolution) together to restore a video:

--input indicates the input video path;
--output indicates the folder where the processed video is stored;
--process_order indicates the models to use and their order (from those currently supported).

python tools/video-enhance.py --input /home/work/xiaobing.mp4    --process_order  DAIN DeOldify PPMSVSR  --output /home/work/output

It is inconvenient to upload the video here, so the result is not shown.

Summary

I have experienced using PaddleGAN to colorize, super-resolve, and interpolate frames in video. The application of PaddleGAN does not stop here: it is not enough just to call the package and be done with one line of code. To go further, you should carefully read the papers behind the models you use and try to reproduce them yourself, or closely study the source code provided in PaddleGAN.

Reference: https://aistudio.baidu.com/aistudio/projectdetail/1161285

Origin blog.csdn.net/h363924219/article/details/122411713