Feeding OpenCV-decoded video frames into TensorFlow through a queue

The following code, taken from Stack Overflow, reads video data through a TensorFlow queue. The video is decoded with Python's cv2 (OpenCV) module, one frame at a time.

import threading

import cv2
import numpy as np
import tensorflow as tf

def deform_images(images):
    with tf.name_scope('current_image'):
        frames_resized = tf.image.resize_images(images, [90, 160])
        frame_gray = tf.image.rgb_to_grayscale(frames_resized, name='rgb_to_gray')
        frame_normalized = tf.divide(frame_gray, tf.constant(255.0), name='image_normalization')

        tf.summary.image('image_summary', frame_gray, 1)
        return frame_normalized

def queue_input(video_path, coord):
    global frame_index
    with tf.device("/cpu:0"):
        # keep looping infinitely

        # source: http://stackoverflow.com/questions/33650974/opencv-python-read-specific-frame-using-videocapture
        cap = cv2.VideoCapture(video_path)
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_index)  # seek to the current frame

        # read the next frame from the file, Note that frame is returned as a Mat.
        # So we need to convert that into a tensor.
        (grabbed, frame) = cap.read()

        # if the `grabbed` boolean is `False`, then we have
        # reached the end of the video file
        if not grabbed:
            coord.request_stop()
            return

        img = np.asarray(frame)
        frame_index += 1
        to_return = deform_images(img)
        print(to_return.get_shape())
        return to_return

frame_index = 0  # global frame counter used by queue_input

with tf.Session() as sess:
    merged = tf.summary.merge_all()
    train_writer = tf.summary.FileWriter('C:\\Users\\temp_user\\Documents\\tensorboard_logs', sess.graph)
    sess.run(tf.global_variables_initializer())

    coord = tf.train.Coordinator()
    queue = tf.FIFOQueue(capacity=128, dtypes=tf.float32, shapes=[90, 160, 1])
    enqueue_op = queue.enqueue(queue_input("RECOLA-Video-recordings\\P16.mp4", coord))

    # Create a queue runner that will run 1 threads in parallel to enqueue
    # examples. In general, the queue runner class is used to create a number of threads cooperating to enqueue
    # tensors in the same queue.
    qr = tf.train.QueueRunner(queue, [enqueue_op] * 1)

    # Create a coordinator, launch the queue runner threads.
    # Note that the coordinator class helps multiple threads stop together and report exceptions to programs that wait
    # for them to stop.
    enqueue_threads = qr.create_threads(sess, coord=coord, start=True)

    # Run the training loop, controlling termination with the coordinator.
    frames_tensor = queue.dequeue(name='dequeue')  # build the dequeue op once, outside the loop
    for step in range(8000):
        print(step)
        if coord.should_stop():
            break

        frame_value = sess.run(frames_tensor)  # actually pull a frame off the queue

    coord.join(enqueue_threads)

train_writer.close()
cv2.destroyAllWindows()
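As a side note, the per-frame preprocessing that deform_images performs can be sketched without TensorFlow at all. The sketch below is a simplified, plain-NumPy version: the resize step is omitted, the frame is assumed to already be a 90x160 RGB array, and the grayscale conversion uses the BT.601 luma weights, the same convention tf.image.rgb_to_grayscale follows.

```python
import numpy as np

# Plain-NumPy sketch of the preprocessing in deform_images (resize omitted).
def preprocess(frame_rgb):
    weights = np.array([0.299, 0.587, 0.114])   # BT.601 luma weights
    gray = frame_rgb @ weights                  # (90, 160): weighted channel sum
    normalized = gray / 255.0                   # scale pixel values into [0, 1]
    return normalized[..., np.newaxis]          # (90, 160, 1), the queue's shape

white = np.full((90, 160, 3), 255.0)
print(preprocess(white).shape)                  # (90, 160, 1)
```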
  • One answer points out that this code keeps enqueueing the first frame over and over; the fix below performs the enqueue through a tf.placeholder instead.
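The root cause has a plain-Python analogue, with no TensorFlow needed (next_frame is a hypothetical stand-in for grabbing a frame): passing the *result* of a call captures one fixed value, which the loop then enqueues forever, while calling the producer inside the loop yields a fresh value each time.

```python
import queue

def next_frame(_counter=[0]):
    # Stand-in for grabbing a frame; returns 1, 2, 3, ... on successive calls.
    _counter[0] += 1
    return _counter[0]

captured = next_frame()      # evaluated once, like building enqueue_op from queue_input(...)
q = queue.Queue()
for _ in range(3):
    q.put(captured)          # enqueues the same value three times: 1, 1, 1

q2 = queue.Queue()
for _ in range(3):
    q2.put(next_frame())     # a fresh value each iteration, like feeding a placeholder
```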

Original answer:
tf.train.QueueRunner is not the most suitable mechanism for your purposes. In the code you have, the following line

enqueue_op = queue.enqueue(queue_input("RECOLA-Video-recordings\\P16.mp4", coord))

creates the enqueue_op that will enqueue a constant tensor, namely the first frame returned from the queue_input function. Even though QueueRunner is calling it repeatedly, it is always enqueueing the same tensor, namely the one that was provided to it during operation creation. Instead, you can simply make the enqueue operation take a tf.placeholder as its argument, and run it repeatedly in a loop, feeding it the frame you grabbed via OpenCV. Here is some code to guide you.


frame_ph = tf.placeholder(tf.float32)
enqueue_op = queue.enqueue(frame_ph)

def enqueue():
  while not coord.should_stop():
    frame = queue_input(video_path, coord)
    sess.run(enqueue_op, feed_dict={frame_ph: frame})

threads = [threading.Thread(target=enqueue)]

for t in threads:
  t.start()

# Your dequeue and training code goes here
coord.join(threads)
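The producer/consumer shape of this fix can be sketched with only the Python standard library; fake_capture and the threading.Event below are hypothetical stand-ins for cv2.VideoCapture and tf.train.Coordinator.

```python
import queue
import threading

stop = threading.Event()           # plays the role of tf.train.Coordinator
frames = queue.Queue(maxsize=128)  # plays the role of tf.FIFOQueue

def fake_capture(n=5):
    # Stand-in for cv2.VideoCapture: yields n fake "frames", then ends.
    yield from range(n)

def enqueue_worker():
    for frame in fake_capture():
        if stop.is_set():
            break
        frames.put(frame)
    stop.set()                     # end of video: like coord.request_stop()

t = threading.Thread(target=enqueue_worker)
t.start()

consumed = []
while not (stop.is_set() and frames.empty()):
    try:
        consumed.append(frames.get(timeout=0.1))  # like running the dequeue op
    except queue.Empty:
        pass

t.join()                           # like coord.join(threads)
print(consumed)                    # [0, 1, 2, 3, 4]
```

Because the worker only sets the stop event after its final put, the consumer's exit condition (stopped and empty) guarantees every frame is drained before the loop ends.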

Reposted from blog.csdn.net/ilikede/article/details/80665967