Research Report on Optical Image Stabilization

University of Electronic Science and Technology of China, Glasgow College, Class of 2017/2018, Xi Wenzheng

Introduction

Image stabilization is the technique of improving image quality by actively removing the apparent motion of an object induced by vibration, tracking errors, and differential refraction in the atmosphere. The motion is called apparent because the object itself is usually quite stable, yet in the imaging system it appears to move. The result of image stabilization is an image that is sharper and has higher contrast and resolution.
Nowadays, it is common for smartphones to be equipped with image stabilization. I have found that electronic anti-shake is actually hard to achieve on a smartphone, where components are limited in size. As a result, optical image stabilization is more frequently used on smartphones. However, its most obvious shortcoming is that picture quality is reduced. In this research, I try to find a way to solve that problem.

Research

The core of this technology is the use of pixels outside the border of the visible frame as a buffer for the motion. This technique reduces distracting vibrations in videos by smoothing the transition from one frame to another. It does not affect the noise level of the image, except at the extreme borders where the image is extrapolated. It also cannot remove existing motion blur, which may make an image appear to lose focus as the motion is compensated.
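As a rough sketch of this buffering idea, the snippet below estimates the global shift between consecutive frames with OpenCV's phase correlation and crops a steady window from inside a reserved margin. The margin size and the smoothing factor are illustrative assumptions, not values from any real product:

```python
import cv2
import numpy as np

MARGIN = 40    # border pixels reserved as the stabilization buffer (assumed)
SMOOTH = 0.9   # damping factor so intentional panning is preserved (assumed)

def stabilize(frames):
    """Crop a steady window out of each frame, shifting it inside the margin."""
    prev_gray = None
    offset = np.zeros(2)  # accumulated correction (dx, dy)
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
        if prev_gray is not None:
            # Estimate the global translation between consecutive frames.
            (dx, dy), _ = cv2.phaseCorrelate(prev_gray, gray)
            # Counteract the measured shake, decaying old corrections.
            offset = SMOOTH * (offset - np.array([dx, dy]))
        prev_gray = gray
        h, w = gray.shape
        # Clamp the correction to the available buffer of border pixels.
        x = int(np.clip(MARGIN + offset[0], 0, 2 * MARGIN))
        y = int(np.clip(MARGIN + offset[1], 0, 2 * MARGIN))
        yield frame[y:h - 2 * MARGIN + y, x:w - 2 * MARGIN + x]
```

Clamping the correction inside the reserved margin is what makes the output steady, but it also narrows the field of view, which is exactly the quality trade-off discussed here.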

Some manufacturers now use digital signal processing (DSP) to reduce blur in stills, for example by sub-dividing the exposure into several shorter exposures in rapid succession, discarding blurred ones, re-aligning the sharpest sub-exposures and adding them together, and using the gyroscope to detect the best time to take each frame.
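A minimal sketch of the multi-frame part of this approach might score sharpness with the variance of the Laplacian, discard the blurriest sub-exposures, align the rest by phase correlation, and average them. The 50% keep ratio and the purely translational alignment are simplifying assumptions:

```python
import cv2
import numpy as np

def stack_sharpest(frames, keep_ratio=0.5):
    """Discard the blurriest sub-exposures, align the rest, and average them."""
    grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    # Variance of the Laplacian is a common single-number sharpness score.
    scores = [cv2.Laplacian(g, cv2.CV_64F).var() for g in grays]
    order = np.argsort(scores)[::-1]
    keep = order[: max(1, int(len(frames) * keep_ratio))]

    ref = grays[keep[0]].astype(np.float32)   # sharpest frame as reference
    acc = frames[keep[0]].astype(np.float32)
    for i in keep[1:]:
        # Align each kept frame to the reference with a global translation.
        (dx, dy), _ = cv2.phaseCorrelate(ref, grays[i].astype(np.float32))
        M = np.float32([[1, 0, -dx], [0, 1, -dy]])
        acc += cv2.warpAffine(frames[i], M,
                              (ref.shape[1], ref.shape[0])).astype(np.float32)
    # Averaging the aligned frames suppresses noise from the short exposures.
    return (acc / len(keep)).astype(np.uint8)
```

Averaging several aligned short exposures trades a set of noisy frames for one cleaner image, which is the motivation behind this DSP approach.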

Whichever technique is used, some damage to photo quality is unavoidable.

Solution and Suggestion

I believe the key to addressing the loss of photo quality is background subtraction. Background subtraction is a widely used technique for detecting moving objects in video taken from a static camera. Over the last two decades, many algorithms have been developed for background subtraction and applied in important areas such as visual surveillance, analysis of sports video, and marker-less motion capture. Various statistical approaches have been proposed to model the scene background.
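OpenCV ships one such statistical model, the MOG2 Gaussian-mixture background subtractor. A minimal usage sketch for the static-camera case could look like the following (the input file name is hypothetical):

```python
import cv2

# Gaussian-mixture background model; history and threshold are OpenCV defaults.
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

cap = cv2.VideoCapture("input.mp4")   # hypothetical input file
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Pixels that do not fit the learned background model become foreground.
    mask = subtractor.apply(frame)
    # MOG2 marks shadows as 127; keep only confident foreground (255).
    _, fg = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    cv2.imshow("foreground", fg)
    if cv2.waitKey(30) == 27:          # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```

This is the static-camera baseline; the moving-camera case discussed next is what makes the problem hard.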

The definition of what is foreground and what is background is ambiguous in the case of a video captured from a moving camera. A successful moving-camera background subtraction should utilize and integrate various perceptual cues to segment the foreground. The cues should include the following:

Motion discontinuity: Objects and image regions that are moving independently from the rest of the scene.

Depth discontinuity: Objects that are closer to the camera than the rest of the scene.

Appearance discontinuity: Salient objects or image regions that look different from their surroundings.

Familiar shape: Image regions that look like familiar shapes, i.e., objects of interest such as people, vehicles, and animals.

None of these cues by itself is enough to solve the problem; therefore, integrating all four cues, along with high-level reasoning about the scene semantics, is necessary.
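As an illustration of such integration, the sketch below fuses just two of the cues: motion, via dense optical flow, and appearance saliency, via the spectral-residual detector from opencv-contrib. Both thresholds and the simple mask intersection are assumptions for illustration:

```python
import cv2
import numpy as np

# Requires opencv-contrib-python for the saliency module.
saliency = cv2.saliency.StaticSaliencySpectralResidual_create()

def foreground_mask(prev_gray, gray, frame):
    """Fuse a motion-discontinuity cue with an appearance-saliency cue.

    prev_gray, gray: consecutive grayscale frames (uint8);
    frame: the current BGR frame.
    """
    # Motion cue: dense optical flow; large magnitude = independent motion.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)
    motion = (mag > 2.0).astype(np.uint8)    # threshold is an assumption

    # Appearance cue: spectral-residual saliency of the current frame.
    ok, sal = saliency.computeSaliency(frame)
    salient = (sal > 0.5).astype(np.uint8)   # threshold is an assumption

    # Neither cue alone is reliable; intersect them as the simplest fusion.
    return cv2.bitwise_and(motion, salient) * 255
```

A full system would layer the depth and familiar-shape cues on top of this, as the discussion below argues.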

The foreground/background separation decision is not only a matter of low-level discontinuity detection, but also a high-level semantic decision. The following discussion justifies this argument.

There is a large literature on motion segmentation, which exploits motion discontinuity; however, these approaches do not necessarily aim at modeling the scene background and segmenting the foreground layers. Fundamentally, 3D motion segmentation by itself is not enough to separate the foreground from the background when both undergo rigid or near-rigid motion; e.g., a car parked in the street has the same 3D motion as the rest of the scene.

Apparent 2D motion (optical flow) is also not enough to properly segment objects; for example, an articulated object will exhibit different apparent motion at different parts. Similarly, depth discontinuity by itself is not enough, since objects of interest can be at a distance from the camera without much depth difference from the background.

There has recently been a lot of interest in saliency detection from images as a pre-step for object categorization. However, most of these works use static images and do not exploit the rich information in video, and the performance of such low-level saliency detection is far from reliable in detecting foreground objects accurately. The above discussion highlights the need for integrated approaches to the problem that combine the low-level vision cues, namely motion discontinuity, depth discontinuity, and appearance saliency, with high-level, accumulated and learned object and scene models.

Reposted from blog.csdn.net/qq_40633177/article/details/84641638