Alibaba Cloud Tianchi Competition Problem (Deep Learning): Video Enhancement (Full Code)

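Background

Video enhancement and super-resolution are among the core algorithms of computer vision, and they matter a great deal for improving the quality and sharpness of early film footage. The task provides a set of paired videos (low resolution and high resolution); a model trained on these pairs must predict the high-resolution video from the low-resolution one.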

Full code

Import packages

import cv2
import numpy as np
import tensorflow as tf
from tensorflow import keras

from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Conv2DTranspose
from tensorflow.keras.layers import InputLayer
from tensorflow.keras.models import Sequential

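Read the images

# Load one high-resolution (ground-truth) frame and scale pixel values to [0, 1].
# Note: cv2.imread returns None if the file cannot be read.
path = "./h_GT/Youku_00000_h_GT/001.bmp"
img_GT = cv2.imread(path) / 255.0
img_GT.shape
# Load the matching low-resolution frame the same way.
path = "./l/Youku_00000_l/001.bmp"
img_l = cv2.imread(path) / 255.0
img_l.shape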
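
Both networks in this post hard-code a 270 x 480 input and a 4x upscaling factor, so it helps to confirm that a frame pair actually has that relationship before training. The following is a minimal sketch; the helper name `upscale_factor` is illustrative, and the shapes in the usage lines are assumed from the sizes hard-coded in the networks rather than read from the dataset.

```python
def upscale_factor(lr_shape, hr_shape):
    """Return the integer upscaling factor between two (H, W, C) image shapes.

    Raises ValueError if the high-resolution size is not the same integer
    multiple of the low-resolution size along both height and width.
    """
    lr_h, lr_w = lr_shape[:2]
    hr_h, hr_w = hr_shape[:2]
    if hr_h % lr_h or hr_w % lr_w or hr_h // lr_h != hr_w // lr_w:
        raise ValueError(f"shapes {lr_shape} and {hr_shape} are not an integer-scale pair")
    return hr_h // lr_h


# Assumed shapes: 270 x 480 low resolution, 1080 x 1920 ground truth
factor = upscale_factor((270, 480, 3), (1080, 1920, 3))
print(factor)
```

With real data, the same check would be called as `upscale_factor(img_l.shape, img_GT.shape)`.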

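Implementing the FSRCNN network

def fsrcnn():
    model = Sequential()
    # Input: one low-resolution frame, 270 x 480 pixels, 3 channels
    model.add(InputLayer(input_shape=(270, 480, 3)))

    # first_part: feature extraction
    model.add(Conv2D(56, 5, padding='same', activation='relu'))

    # mid_part: a 1x1 shrinking layer followed by four 3x3 mapping layers
    model.add(Conv2D(12, 1, padding='same', activation='relu'))
    for _ in range(4):
        model.add(Conv2D(12, 3, padding='same', activation='relu'))

    # last_part: transposed convolution that upsamples by 4x (270 x 480 -> 1080 x 1920)
    model.add(Conv2DTranspose(3, 9, strides=4, padding='same'))

    # Note: 1e-1 is a high learning rate for Adam; the ReduceLROnPlateau callback
    # used during training lowers it when val_loss stops improving.
    model.compile(optimizer=tf.optimizers.Adam(1e-1), loss=tf.losses.mse, metrics=['mse'])
    return model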

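FSRCNN

FSRCNN (Fast Super-Resolution Convolutional Neural Network) addresses the slow running time of SRCNN. Instead of upsampling the input by interpolation as a preprocessing step, FSRCNN upsamples with a deconvolution (transposed convolution) layer at the end of the network, so the whole mapping can be learned end to end. It uses a bottleneck design similar to the one in ResNet, where 1x1 convolutions reduce the channel count before the mapping layers, and it replaces large kernels with smaller kernels and more layers. As a result, to generate images at a different target resolution only the weights of the upsampling deconvolution layer need to be adjusted while the other convolution layers stay unchanged, which speeds up training considerably and can even make inference real-time.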
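
To make the "small kernels, few channels" point concrete, the size of the network can be worked out by hand. This is a minimal sketch that assumes the layer configuration of the `fsrcnn()` function in this post (not necessarily the configuration of the original FSRCNN paper); the helper names `conv_params` and `same_transpose_out` are illustrative.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of one 2-D convolution or transposed convolution."""
    return kernel * kernel * in_ch * out_ch + out_ch


def same_transpose_out(size, stride):
    """Output length of a Keras Conv2DTranspose with padding='same'."""
    return size * stride


# Layer configuration taken from fsrcnn() in this post
layers = [
    ("feature extraction, 5x5, 3 -> 56", conv_params(5, 3, 56)),
    ("shrinking, 1x1, 56 -> 12", conv_params(1, 56, 12)),
    ("mapping, 4 layers of 3x3, 12 -> 12", 4 * conv_params(3, 12, 12)),
    ("deconvolution, 9x9, 12 -> 3", conv_params(9, 12, 3)),
]

for name, count in layers:
    print(f"{name}: {count} parameters")

total_params = sum(count for _, count in layers)
out_h = same_transpose_out(270, 4)
out_w = same_transpose_out(480, 4)
print("total:", total_params)
print("output size:", out_h, "x", out_w)
```

Only the last entry depends on the upscaling factor, which is why switching to another factor leaves the earlier layers reusable.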

Training the FSRCNN model

# Build the model
model = fsrcnn()

# Monitoring: reduce the learning rate automatically when val_loss stops improving
plateau = keras.callbacks.ReduceLROnPlateau(monitor='val_loss', verbose=0, mode='min', factor=0.10, patience=6)
# Stop training once val_loss has not improved for 25 epochs
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', verbose=0, mode='min', patience=25)
# Save only the best model (lowest val_loss)
checkpoint = keras.callbacks.ModelCheckpoint('fsrcnn.h5', monitor='val_loss', verbose=0, mode='min', save_best_only=True)

# Training data: a toy batch containing the same frame twice, for demonstration only
x = np.array([img_l, img_l])
y = np.array([img_GT, img_GT])

# Train the model (validation reuses the training data here)
model.fit(x, y, epochs=10, batch_size=2, verbose=1, shuffle=True, validation_data=(x, y), callbacks=[plateau, early_stopping, checkpoint])

Evaluating the FSRCNN model

model.evaluate(x, y, verbose=0)
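
Because the model is compiled with an MSE loss and an MSE metric, `model.evaluate` returns a list of the form `[loss, mse]`. Super-resolution results are often reported as PSNR instead, which is a simple transform of the MSE. PSNR is not used in the original post; the sketch below is an optional addition, and the function names are illustrative. It assumes pixel values scaled to [0, 1], as in the loading code above.

```python
import numpy as np


def psnr_from_mse(mse, max_value=1.0):
    """Convert a mean squared error to PSNR in dB for pixels in [0, max_value]."""
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


def psnr(pred, target, max_value=1.0):
    """PSNR in dB between two arrays of the same shape."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return psnr_from_mse(np.mean((pred - target) ** 2), max_value)


# An MSE of 0.01 on [0, 1] images corresponds to 20 dB
example_psnr = psnr_from_mse(0.01)
print(example_psnr)
```

With the trained model, `psnr_from_mse(model.evaluate(x, y, verbose=0)[1])` would give the PSNR on the evaluation batch.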

Predicting with the FSRCNN model

pic_super = model.predict(x, verbose=0, batch_size=1)

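Save the image for inspection

# Predictions are floats in roughly [0, 1]; rescale to 8-bit before writing,
# otherwise the saved image is almost entirely black.
cv2.imwrite("./fsrcnn_00.bmp", np.clip(pic_super[0] * 255.0, 0, 255).astype(np.uint8))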

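ESPCN

ESPCN (Efficient Sub-Pixel Convolutional Neural Network) shares the key idea of FSRCNN: all feature extraction runs in low-resolution space, and upsampling happens only at the very end of the model, here through sub-pixel convolution. This preserves more texture detail in low-resolution space and also makes real-time video super-resolution possible.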
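
Sub-pixel convolution amounts to an ordinary convolution that outputs r * r * C channels, followed by a rearrangement (often called pixel shuffle or depth-to-space) that turns each group of r * r channels into an r x r block of output pixels. The rearrangement can be sketched in NumPy as follows; the function name `pixel_shuffle` is illustrative, and the channel ordering assumed here is the NHWC reshape/transpose pattern that the ESPCN code in this post also uses.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange (N, H, W, r*r*C) features into an (N, H*r, W*r, C) image."""
    n, h, w, c = x.shape
    out_c = c // (r * r)
    x = x.reshape(n, h, w, r, r, out_c)   # split channels into (row offset, col offset, C)
    x = x.transpose(0, 1, 3, 2, 4, 5)     # interleave: (N, H, r, W, r, C)
    return x.reshape(n, h * r, w * r, out_c)


# One low-resolution pixel with 4 channels becomes a 2 x 2 block of 1-channel pixels
lr_features = np.arange(1, 5, dtype=np.float32).reshape(1, 1, 1, 4)
hr = pixel_shuffle(lr_features, 2)
print(hr[0, :, :, 0])  # the 2 x 2 block [[1, 2], [3, 4]]
```

Because the rearrangement has no learnable weights, all of the learning stays in the low-resolution convolutions.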

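Implementing the ESPCN network

def espcn():
    # Input: one low-resolution frame, 270 x 480 pixels, 3 channels
    inputs = keras.layers.Input(shape=(270, 480, 3))
    cnn = keras.layers.Conv2D(64, 5, padding='same', activation='relu')(inputs)
    cnn = keras.layers.Conv2D(32, 3, padding='same', activation='relu')(cnn)
    # 3 output channels times 4 * 4 sub-pixel positions = 48 channels
    cnn = keras.layers.Conv2D(3 * 4 ** 2, 3, padding='same')(cnn)
    # Sub-pixel rearrangement (4x): split the channels into a 4 x 4 block per pixel,
    # interleave the block with the spatial axes, then merge into a 1080 x 1920 image
    cnn = tf.reshape(cnn, [-1, 270, 480, 4, 4, 3])
    cnn = tf.transpose(cnn, perm=[0, 1, 3, 2, 4, 5])
    outputs = tf.reshape(cnn, [-1, 270 * 4, 480 * 4, 3])

    model = keras.models.Model(inputs=[inputs], outputs=[outputs])
    # Note: 1e-1 is a high learning rate for Adam; ReduceLROnPlateau lowers it during training
    model.compile(optimizer=tf.optimizers.Adam(1e-1), loss=tf.losses.mse, metrics=['mse'])
    return model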

Training the ESPCN model

# Build the model
model = espcn()

# Monitoring: reduce the learning rate automatically when val_loss stops improving
plateau = keras.callbacks.ReduceLROnPlateau(monitor='val_loss', verbose=0, mode='min', factor=0.10, patience=6)
# Stop training once val_loss has not improved for 25 epochs
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', verbose=0, mode='min', patience=25)
# Save only the best model (lowest val_loss)
checkpoint = keras.callbacks.ModelCheckpoint('espcn.h5', monitor='val_loss', verbose=0, mode='min', save_best_only=True)

# Training data: a toy batch containing the same frame twice, for demonstration only
x = np.array([img_l, img_l])
y = np.array([img_GT, img_GT])

# Train the model (validation reuses the training data here)
model.fit(x, y, epochs=10, batch_size=2, verbose=1, shuffle=True, validation_data=(x, y), callbacks=[plateau, early_stopping, checkpoint])

Evaluating the ESPCN model

model.evaluate(x, y, verbose=0)

Predicting with the ESPCN model

pic_super = model.predict(x, verbose=0, batch_size=1)

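Save the image for inspection

# Predictions are floats in roughly [0, 1]; rescale to 8-bit before writing,
# otherwise the saved image is almost entirely black.
cv2.imwrite("./espcn_00.bmp", np.clip(pic_super[0] * 255.0, 0, 255).astype(np.uint8))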

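All of the content and code above comes from the book 《阿里云天池大赛赛题解析(深度学习篇)》 (Alibaba Cloud Tianchi Competition Problem Analysis: Deep Learning), which I highly recommend reading in the original.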

Reposted from blog.csdn.net/weixin_45116099/article/details/126204328