Event Cameras: An Overview and Some Thoughts

 
  


This article is reproduced from New Machine Vision and is shared for academic purposes only.

A piece on event cameras by Mr. Zhou Yanwu of Zuosi prompted some thinking about events and frames. Building on that, this article organizes the basics of event cameras; specific algorithms are not explained. The main reference is "Event-based Vision: A Survey".

The event camera originated in neuromorphic engineering, which uses VLSI (Very Large Scale Integration) circuits to emulate certain neural structures in the brains of humans and other animals. For example, the human eye mainly attends to objects in relative motion and largely ignores relatively static ones; in an event camera, this sensitivity to relative motion is realized through pulse (spike) encoding.

Event cameras differ fundamentally from traditional cameras. Whereas a traditional camera captures full images at a fixed rate, an event camera measures the brightness change at each pixel independently and outputs asynchronous signals (each carrying a timestamp, a pixel location, and the sign of the brightness change). Event cameras have several advantages over traditional cameras:

  1. There is no concept of a frame rate. Instead of sampling at fixed time intervals, the camera samples at fixed increments of brightness change, so it can capture much faster motion. Because it responds directly to rapid brightness changes, it is largely immune to motion blur;

  2. Although the per-pixel circuit is relatively complex and the absolute brightness of a pixel cannot be read out, each pixel responds logarithmically to light, which yields a high dynamic range and high temporal resolution;

  3. Data redundancy is reduced, which greatly lowers the transmission bandwidth required.

Thanks to these advantages, event cameras are used in object tracking, recognition, gesture control, 3D reconstruction, optical flow estimation, SLAM, and other fields.

1) The principle of the event camera

[Figure: the per-pixel circuit of an event camera — photoelectric conversion, differencing, and threshold triggering]

As shown in the figure above, each pixel has its own processing circuit. The photoelectric conversion stage on the left turns light intensity into a voltage, and a differencing circuit computes the change in pixel brightness (this is why an event camera responds to brightness changes and is indifferent to absolute brightness). When the accumulated brightness change reaches a threshold (generally a fixed parameter of the camera), an event is triggered. The camera compresses its binary output with a dedicated encoding, Address Event Representation (AER); encoding and decoding are usually integrated into the hardware SDK by the vendor.
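To make the triggering rule concrete, here is a minimal sketch of the log-intensity threshold model described above, driven by ordinary frames as a stand-in for the continuous photocurrent; the contrast threshold C and all numbers are illustrative assumptions, not parameters of any particular camera.

```python
import numpy as np

def generate_events(frames, timestamps, C=0.3, eps=1e-6):
    """Minimal simulation of the per-pixel differencing circuit.

    frames:     (T, H, W) array of linear intensity images (a stand-in
                for the continuous photocurrent each pixel sees)
    timestamps: (T,) array of frame times in seconds
    C:          assumed contrast threshold (a camera-specific constant)
    Returns a list of events (t, x, y, polarity).
    """
    log_ref = np.log(frames[0] + eps)        # per-pixel reference level
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_now = np.log(frame + eps)        # logarithmic photoreceptor
        diff = log_now - log_ref             # differencing circuit
        ys, xs = np.nonzero(np.abs(diff) >= C)
        for y, x in zip(ys, xs):
            polarity = 1 if diff[y, x] > 0 else -1   # ON / OFF
            events.append((t, x, y, polarity))
            log_ref[y, x] = log_now[y, x]    # reset after firing
    return events
```

A real sensor does this continuously and asynchronously in analog hardware; the frame loop here only approximates that behavior.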

2) Understanding events

An event camera, as the name suggests, is all about events. An event has three elements: 1) a timestamp, 2) pixel coordinates, and 3) polarity. In other words, an event says "at this time, at this pixel, the brightness increased (or decreased)".
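In code, those three elements map directly onto a small record; the field names below are just one plausible convention, not a standard.

```python
from typing import NamedTuple

class Event(NamedTuple):
    t: float  # timestamp in seconds ("at what time")
    x: int    # pixel column         ("at which pixel")
    y: int    # pixel row
    p: int    # polarity: +1 brightness increased, -1 decreased

# An event stream is simply a time-ordered sequence of such records.
stream = [Event(t=10e-6, x=12, y=34, p=+1), Event(t=12e-6, x=13, y=34, p=-1)]
```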

When object motion or illumination changes cause many pixels to change, a series of events is generated and output as an event stream. The data volume of an event stream is far smaller than that of a traditional camera's frames, and there is no minimum time unit, so latency is very low. The following setup illustrates how an event camera behaves:

[Figure: experimental setup — a camera viewing a rotating disc with a dark spot]

The setup above can be abstracted as shown below. The polarity is 1 or -1, indicating whether a pixel brightened or dimmed; events are accordingly classified as ON and OFF. In the figure, the upper right is the output of a traditional camera and the lower part is the output of the event camera. As the disc rotates, the traditional camera captures a full image at fixed intervals, while the event camera outputs only what changes, namely the motion of the dark spot. When the disc stops, the traditional camera keeps dutifully taking identical images, while the event camera produces no output at all (the traditional camera is the devoted type; the event camera is the fickle one).

[Figure: traditional camera output (top right) vs. event camera output (bottom) as the disc rotates]

When the disc spins fast:

[Figure: the same comparison at high rotation speed — a much denser event stream]

One defining characteristic of the event camera is speed: the average time between two events is on the order of 0.1 µs. Within the camera, an output event is only a high or low level and cannot express intensity directly, but different magnitudes of brightness change trigger different numbers of events. If you can run the experiment yourself (the CeleX5 is recommended; not an advertisement, it is just cheap and convenient), you will see that when the camera moves, each boundary between bright and dark in the field of view triggers a group of event points with the same shape as the boundary, and for the same shape, a boundary with higher contrast triggers more events. In this way, rough intensity information can be recovered. As the figures above show, low-speed and high-speed motion trigger clearly different numbers of events.
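A back-of-the-envelope calculation shows why higher contrast means more events: with a fixed contrast threshold C, a pixel sweeping from intensity L1 to L2 fires roughly |log L2 - log L1| / C events. The threshold and intensities below are made-up numbers for illustration.

```python
import math

C = 0.3  # assumed contrast threshold, in log-intensity units

def approx_event_count(l_before, l_after, C=C):
    """Rough number of events fired as a pixel's intensity sweeps
    from l_before to l_after, given contrast threshold C."""
    return int(abs(math.log(l_after) - math.log(l_before)) / C)

print(approx_event_count(100, 120))   # low-contrast edge  -> 0
print(approx_event_count(100, 1000))  # high-contrast edge -> 7
```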

3) Processing paradigms for event cameras

Broadly, there are two approaches to processing event-camera output. One is model-driven: a model is obtained from other sensors (for example a semi-dense image), and the system state is then updated by processing events one by one. The other is data-driven: machine learning methods are used to capture the correlations between events.

Since the output of an event camera is an event stream, there are three common processing paradigms, namely:

a) Event-by-Event: the event-by-event method is used in many tasks, such as feature tracking, pose tracking, and image reconstruction in SLAM. As mentioned earlier, events cluster where the boundary between bright and dark is sharp. In the visual odometry of vSLAM (which computes the camera's pose change from the differences between adjacent images), feature-based approaches generally extract their features at exactly such bright-dark boundaries. In vSLAM, therefore, event cameras can help filter out useless information and reduce the amount of computation.
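As a sketch of the event-by-event style, the handler below updates a per-pixel activity map asynchronously, once per incoming event, with an exponential decay so that stale activity fades; the resolution, decay constant, and downstream consumer are all assumptions for illustration.

```python
import numpy as np

H, W = 180, 240           # assumed sensor resolution
state = np.zeros((H, W))  # per-pixel activity for a downstream tracker
last_t = 0.0
TAU = 0.01                # assumed decay time constant, seconds

def on_event(t, x, y, p):
    """Called once per incoming event -- no frames, no fixed rate."""
    global last_t
    state[:] *= np.exp(-(t - last_t) / TAU)  # let old activity fade
    state[y, x] += p                          # fold in the new event
    last_t = t
    # A tracker or pose filter would read `state` here and update
    # immediately, at per-event (microsecond-scale) latency.
    # (A real implementation would decay lazily, per pixel, rather
    # than touching the whole map on every event.)
```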

b) Groups of Events: since each event carries little information and is easily corrupted by noise, events within a time window are usually collected and processed together, which provides a sufficient signal-to-noise ratio for the problem at hand. One option is to convert the events into frames like those of a traditional camera, which can then be handled with classical vision methods. Alternatively, the correlations between events can be used directly for object detection, optical flow estimation, depth estimation, and so on. For example, per-pixel event histograms, the surface of active events (SAE, which records the latest timestamp at each pixel), or interpolated voxel grids can convert events into the tensors consumed by a CNN, which is then used in an encoder-decoder style architecture (see Event-based Vision: A Survey).
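As one concrete version of the group-of-events idea, the sketch below converts a window of events into two frame-like tensors, a two-channel polarity histogram and a latest-timestamp surface (SAE), which could then be stacked as CNN input channels; the shapes and the windowing strategy are illustrative assumptions.

```python
import numpy as np

def events_to_tensors(events, H, W):
    """Convert a window of (t, x, y, p) events into frame-like tensors.

    Returns:
      hist: (2, H, W) per-pixel counts of ON and OFF events
      sae:  (H, W)    timestamp of the most recent event at each pixel
    """
    hist = np.zeros((2, H, W), dtype=np.float32)
    sae = np.zeros((H, W), dtype=np.float32)
    for t, x, y, p in events:
        hist[0 if p > 0 else 1, y, x] += 1.0  # event-count channels
        sae[y, x] = max(sae[y, x], t)         # surface of active events
    return hist, sae

# Typical use: slice the stream into fixed windows, build the tensors,
# and feed np.concatenate([hist, sae[None]], axis=0) to the network.
```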

[Figure: converting groups of events into frame-like representations for learning-based processing]

c) SNN: short for Spiking Neural Networks, neural networks driven by pulse (spike) events. An SNN takes a small region of visual space as an input event and, unlike a CNN, whose units are activated on every forward pass, a spiking neuron activates and produces output only when its state exceeds a threshold. SNNs are a whole field of their own and are not introduced further here.

4) Thoughts on event cameras

Although event cameras have many distinctive strengths, most applications are still at the laboratory stage, better suited to staking out early research territory than to deployment:

  1. Event cameras offer very high rates (above 1000 Hz), but neither drones nor autonomous driving actually needs such high rates (200 Hz is enough). Traditional cameras already exceed 200 Hz as well; the iPhone X camera, for example, can record at 240 fps.

  2. Event cameras can contribute a great deal to detection and object reconstruction, but they still fall well short in video understanding and semantic segmentation. Moreover, as deep learning advances, exploiting the relationship between consecutive video frames can achieve event-like capabilities such as depth estimation or optical flow estimation.

  3. Event cameras adapt well and give stable signals in low-light or high-dynamic-range environments. They can be combined with traditional cameras, i.e., an event sensor integrated into an ordinary camera (for example the Dynamic and Active-pixel Vision Sensor, which integrates an event sensor, an ordinary camera, an IMU, and so on). With these complementary strengths, such a device can obtain stable signals under low light or high dynamics while still supporting video understanding and semantic segmentation, and it could be used on drones or in autonomous driving.

Source: Wei Xindiao AI


Origin: blog.csdn.net/qq_42722197/article/details/131278591