Autonomous driving simulation explained, part one: scenario sources, scene generalization, and scene extraction


Next, we will publish 5 articles on the subject of simulation one after another. Strictly speaking, this is a series of study notes of the author.

The author has been curious about many topics in autonomous driving simulation (chief among them "simulation with real road data") for more than two years but never had the opportunity to study them. During the epidemic in April this year, the author happened to get a chance to chat with the founder of a simulation company and took the opportunity to ask him a lot of questions.

Since then, for cross-validation, the author has consulted nearly 20 front-line experts in autonomous driving simulation.

Experts who supported this series of study notes include, but are not limited to, An Hongwei, CEO of Zhixing Zhongwei; Yang Zijiang, founder of Shenxin Kechuang; Li Yue, CTO of Zhixing Zhongwei; Bao Shiqiang, CTO of 51 World; and simulation experts from Momo Zhixing, Qingzhou Zhihang, and Cheyou Intelligent. Many thanks to all of them.

one. Scene sources - from synthetic data to real road data

According to Li Manman, the author of the public account "Che Lu Slowly", and Li Yue, CTO of Zhixing Zhongwei, there are generally two ways of thinking about the source of the simulation test scene:

The first idea is the three-layer system of functional scenarios, logical scenarios, and concrete scenarios proposed by the German PEGASUS project: 1) obtain different types of scenarios (functional scenarios) through real road data collection and theoretical analysis; 2) analyze the key parameters of these scenario types and obtain the distribution ranges of those parameters (logical scenarios) through real-data statistics, theoretical analysis, and similar methods; 3) finally, select one set of parameter values as a test scenario (a concrete scenario).

As shown below:

84433eb1f69512b105d3db81a737c079.png

For example, a functional scenario can be described as: "the ego vehicle (the vehicle under test) is driving in its current lane, a vehicle ahead of it is accelerating, and the ego vehicle follows that lead vehicle." The logical scenario extracts the key scenario parameters and gives each one a value range. In the scenario just described, the extractable parameters include the speed of the ego vehicle, the speed and acceleration of the lead vehicle, and the distance between the ego vehicle and the lead vehicle. Each parameter has a value range and distribution characteristics, and there may also be dependencies between parameters. For a concrete scenario, specific parameter values are selected to form a scenario parameter vector, which is then expressed in a concrete scenario language.
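The step from a logical scenario to a concrete scenario can be sketched in a few lines of Python. This is only an illustration: the parameter names, the value ranges, and the uniform sampling are assumptions made for the example, not figures from PEGASUS or from the experts quoted here, and real parameters would follow measured distributions and may depend on one another.

```python
import random
from dataclasses import dataclass

# Hypothetical logical scenario for "ego follows an accelerating lead vehicle".
# The ranges are illustrative placeholders, not measured distributions.
LOGICAL_SCENARIO = {
    "ego_speed_mps": (10.0, 30.0),
    "lead_speed_mps": (10.0, 30.0),
    "lead_accel_mps2": (0.5, 2.0),
    "gap_m": (10.0, 80.0),
}


@dataclass(frozen=True)
class ConcreteScenario:
    ego_speed_mps: float
    lead_speed_mps: float
    lead_accel_mps2: float
    gap_m: float


def sample_concrete(logical, rng):
    """Draw one value per parameter (uniformly) to form a concrete scenario."""
    values = {name: rng.uniform(lo, hi) for name, (lo, hi) in logical.items()}
    return ConcreteScenario(**values)


rng = random.Random(42)  # fixed seed so the same scenarios can be regenerated
scenarios = [sample_concrete(LOGICAL_SCENARIO, rng) for _ in range(3)]
for scenario in scenarios:
    print(scenario)
```

Each printed object corresponds to one scenario parameter vector; writing it out in a scenario language is a separate step that is not shown here.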

This is the so-called "virtually constructed / algorithm-generated scenario". Although the understanding of the scenario still comes from real road scenes, in practice engineers mostly rely on that understanding to "artificially draw up" a driving trajectory and a set of scenes in software. The data behind such scenarios is therefore also called "synthetic data".

In practice, the main challenge with this approach is whether the simulation engineer understands the vehicle's normal driving scenarios deeply enough. If the engineer does not understand the scenario and "draws one up" arbitrarily, it naturally cannot be used.

The second idea is to collect traffic flow data in the intended operating area of the autonomous vehicle, feed this data into a traffic simulation tool to generate traffic flow, and use that traffic flow as the "surrounding traffic vehicles" of the autonomous vehicle, so that test scenarios are generated automatically.

According to Yang Zijiang, the founder of Shenxin Kechuang, to ensure a more accurate "ground truth", the sensor configuration on a data collection vehicle is usually much higher than that of an ordinary self-driving car. For example, the vehicle will use positioning equipment costing more than 200,000 yuan and high-line-count lidar, which produce more accurate data.

Waabi, a simulation company founded by Raquel Urtasun, the former chief scientist of Uber ATG, is said to use camera data for simulation without needing high-precision sensors such as lidar.

The biggest advantage of using real road data for simulation is that the diversity of scenarios will not be limited by the engineers' lack of understanding of the scenarios. Therefore, it is easier to "salvage" those unknown scenarios that "no one can think of".

In addition, the head of simulation at an autonomous driving company said: to improve the realism of simulation, we use as little synthetic data as possible and as much real road data as possible. In fact, simulation is already developing in this direction: more and more of the data and modules are real.

However, engineers with front-line simulation experience generally report that this idea is too idealistic. Specifically, using real road data for simulation has the following limitations:

1. The data needs to be checked manually

In fact, the data collected by the sensors cannot be used for simulation directly: the data types and formats need to be converted, a large amount of invalid data needs to be cleaned, valid scenarios must be identified in it, certain elements need to be labeled, the data from different sensors needs to be synchronized and fused in real time, and so on.

Under normal circumstances, the perception data of an autonomous vehicle is not checked manually but passed directly to the decision-making algorithm. For simulation, however, manually checking the perception data is an essential step.
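Of the preprocessing steps listed above, sensor synchronization is the easiest to illustrate. The following is a minimal offline sketch, assuming two sorted lists of timestamps in seconds and a hypothetical 50 ms tolerance; it does not represent any particular company's pipeline, and real-time fusion involves considerably more than timestamp matching.

```python
from bisect import bisect_left


def nearest_match(camera_ts, lidar_ts, max_offset_s=0.05):
    """Pair each camera timestamp with the closest lidar timestamp.

    Both lists must be sorted in ascending order. Pairs further apart than
    max_offset_s (an assumed tolerance) are dropped as unusable.
    """
    pairs = []
    for t in camera_ts:
        i = bisect_left(lidar_ts, t)
        # The closest lidar sample is either just before or just after t.
        candidates = lidar_ts[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda c: abs(c - t))
        if abs(best - t) <= max_offset_s:
            pairs.append((t, best))
    return pairs


camera = [0.00, 0.10, 0.20, 0.30]
lidar = [0.01, 0.12, 0.31, 0.60]
matched = nearest_match(camera, lidar)
print(matched)  # the camera frame at 0.20 s has no lidar frame within 50 ms
```

Frames that cannot be paired are exactly the kind of invalid data that has to be cleaned out before a recording can be used as a scenario.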

2. The reverse process is more difficult to realize than the forward process

A simulation engineer at an unmanned truck company said: simulation with synthetic data is a forward process, that is, you first know which tests you need to run and then actively design such a scenario; simulation with real data is a reverse process, that is, you first encounter a problem and then solve it. Of the two, the latter is much more difficult.

3. Unable to solve the interaction problem

James Zhang, head of simulation at Furui Microelectronics, mentioned in a public talk that WorldSim (simulation with virtual data) is like playing a game, while LogSim (simulation with real road data) is more like a movie: you can only watch, not participate. LogSim therefore naturally cannot solve the interaction problem.

4. Unable to do closed loop

James Zhang, head of simulation at Furui Microelectronics, also mentioned another difference between the two simulation methods: when real road data is played back, the clips that can be collected are always limited. Often the danger has already been developing for a while by the time recording starts, and the earlier data is hard to obtain. With virtual (synthetic) data, this problem does not arise.

The head of simulation at an OEM said: "The experts above described the acquisition process. Indeed, given the capacity of the acquisition equipment and the definition of a valid scenario, collected and managed scenarios have a limited length, generally a window of time before and after a function is triggered, and the buffer before the trigger in particular is not very long. On the other hand, when the collected data is used for replay, only the part of the scenario before the function trigger is valid; the real-world part after the trigger is invalid."
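The trigger-centred recording window described here can be sketched as follows. The window lengths (5 s before and 3 s after the trigger), the frame format, and the function name `slice_log` are assumptions made for illustration; actual buffer lengths depend on the acquisition equipment.

```python
def slice_log(frames, trigger_times, before_s=5.0, after_s=3.0):
    """Cut windows around trigger events out of a long recording.

    frames: list of (timestamp_s, payload) tuples sorted by time.
    Overlapping windows are merged so that no frame is duplicated.
    """
    windows = []
    for t in sorted(trigger_times):
        start, end = t - before_s, t + after_s
        if windows and start <= windows[-1][1]:
            # This window overlaps the previous one: extend it instead.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return [
        [frame for frame in frames if start <= frame[0] <= end]
        for start, end in windows
    ]


# Hypothetical 60-second recording at 1 frame per second with three triggers.
frames = [(float(t), f"frame{t}") for t in range(60)]
slices = slice_log(frames, trigger_times=[10.0, 12.0, 40.0])
print([len(s) for s in slices])
```

With these inputs the two nearby triggers collapse into one clip, so 60 seconds of recording yield two short clips; everything outside the windows is discarded, which is why data from well before a trigger cannot be recovered later.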

The OEM expert said: real road data can be used to train the perception algorithm, but testing the entire algorithm chain still has to rely on synthetic scenario data.

However, the OEM's head of simulation also emphasized at the end: "The so-called 'closed loop cannot be achieved' is also relative. There are already suppliers who can parameterize the elements in collected scenarios, so that a closed loop can be achieved. But such equipment is very expensive."

5. The authenticity of data is still difficult to guarantee

Simulation with real traffic flow data is also known as data "replay" or "re-injection" (the terms "recharging" and "refilling" refer to the same thing).

According to Yang Zijiang, the founder of Shenxin Kechuang, replay requires two core technologies: first, reconstructing the road network structure of the road-collected data in the simulation environment; second, mapping the pose information of the dynamic traffic participants in the road-collected data (pedestrians, vehicles, etc.), which is recorded in different coordinate systems, into the global coordinate system of the simulated road network.

The tools used in this process are SUMO or OpenSCENARIO, which read in the position information of the traffic participants.
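The second of these two technologies, mapping a pose into the global coordinate system, comes down to a rigid-body transform. Below is a minimal 2D sketch, assuming the observed pose is given in the ego vehicle's frame (x forward, y left) and the ego pose is known in the map frame; the function name and the example numbers are hypothetical, and a production pipeline would work in 3D and chain several sensor frames.

```python
import math


def ego_to_global(ego_x, ego_y, ego_yaw, rel_x, rel_y, rel_yaw):
    """Map a pose observed in the ego frame into the global map frame.

    Angles are in radians; the returned yaw is wrapped to [-pi, pi).
    """
    cos_y, sin_y = math.cos(ego_yaw), math.sin(ego_yaw)
    # Rotate the relative offset by the ego heading, then translate.
    global_x = ego_x + cos_y * rel_x - sin_y * rel_y
    global_y = ego_y + sin_y * rel_x + cos_y * rel_y
    global_yaw = (ego_yaw + rel_yaw + math.pi) % (2 * math.pi) - math.pi
    return global_x, global_y, global_yaw


# Ego at (100, 50) heading +90 degrees; a vehicle is seen 20 m straight ahead.
gx, gy, gyaw = ego_to_global(100.0, 50.0, math.pi / 2, 20.0, 0.0, 0.0)
print(round(gx, 3), round(gy, 3), round(math.degrees(gyaw), 1))
```

Because the ego vehicle faces along the global y axis, the observed vehicle ends up 20 m further along y with the same heading as the ego vehicle.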

A simulation expert at an OEM said: "Replaying raw data cannot guarantee 100% authenticity, because once the raw data is injected into the simulation platform, vehicle dynamics simulation has to be added. Whether the scenario is then still the same as the one on the real road is hard to say."

The reason is that existing traffic flow simulation software often still has the following major defects:

The generated traffic flow is not realistic enough: often only the import of vehicle trajectories is supported, and the two-way interaction between vehicles is not lifelike;

The data interface between the simulation module (the ego vehicle) and the traffic flow module (other road users) is limited (for example, the road network formats differ and road network matching is required), and third-party operability is limited;

Rule-based traffic flow models are designed for evaluating traffic efficiency and may be oversimplified (one-dimensional models are common, assuming that vehicles drive along the lane center line, with little consideration of lateral effects), which makes it difficult to meet the needs of interactive safety evaluation.

A Tier 1 simulation engineer said that using real traffic flow data to generate simulation scenarios is quite difficult: one has to decide which traffic flow model to use (for example, how to define the car-following model and the lane-changing model) and how to define the interface of the traffic flow simulation module. How to synchronize the data from the ego vehicle with the data of other road users is also a big problem.
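To make "one-dimensional car-following model" concrete, here is a sketch of the Intelligent Driver Model (IDM), a widely used model of this kind. It is chosen here purely as an illustration and is not named by the experts quoted in this article; the default parameter values are typical textbook-style choices, not calibrated ones.

```python
import math


def idm_acceleration(v, v_lead, gap, v0=30.0, T=1.5, a_max=1.0, b=2.0,
                     s0=2.0, delta=4):
    """Longitudinal acceleration of a follower under the IDM.

    v, v_lead: follower and leader speeds (m/s); gap: bumper-to-bumper
    distance (m); v0: desired speed; T: desired time headway; a_max: maximum
    acceleration; b: comfortable deceleration; s0: minimum standstill gap.
    """
    closing_speed = v - v_lead
    desired_gap = s0 + max(
        0.0, v * T + v * closing_speed / (2 * math.sqrt(a_max * b))
    )
    return a_max * (1 - (v / v0) ** delta - (desired_gap / gap) ** 2)


free_road = idm_acceleration(v=0.0, v_lead=0.0, gap=1e9)
closing_in = idm_acceleration(v=20.0, v_lead=10.0, gap=10.0)
print(free_road, closing_in)
```

The model only looks at the vehicle directly ahead in the same lane, which illustrates the limitation described above: nothing in it represents lateral motion, so cut-ins and other lateral interactions need a separate lane-changing model.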

6. The universality of the data is low and the generalization is difficult

Both An Hongwei, CEO of Zhixing Zhongwei, and Li Yue, its CTO, specifically mentioned the "universality" of simulation data. Universality means that the parameters of the vehicle and the scenario can be adjusted. For example, if the data was collected by a passenger car, the camera's viewpoint is very low; once the data has become a simulation scenario, the viewpoint can be raised, and the same data set can then be used to test a truck model.

If the scenario is virtually constructed or algorithm-generated, every parameter can be adjusted at will; but what if the scenario is based on real road data?

The head of simulation at a toolchain company said that when real road data is used for simulation, once the position or model of a sensor changes, the value of the data set drops, and it may even become "obsolete".

The simulation experts of Qingzhou Zhihang said that neural networks can also be used to adjust the parameters of real road data. This kind of parameter adjustment is more intelligent, but less controllable.

Simulation with real traffic flow data, also known as "replay" (or "recharging"), can be divided into two types: direct replay and model replay.

"Direct replay" means feeding the sensor data to the algorithm without processing. In this mode the parameters of the vehicle and the scenario cannot be adjusted, and data collected with a given vehicle model can only be used for simulation tests of that same model.

"Model replay" means first abstracting and modeling the scenario data and expressing it as a set of mathematical formulas, in which the parameters of the vehicle and the scenario are adjustable.

According to Li Yue, direct replay does not require mathematical models: "It is relatively simple. Basically, as long as you have big data capability, it can be done." In model replay, by contrast, the trajectory and speed of the vehicle are all expressed through mathematical formulas.
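The contrast between the two can be sketched with a toy example. The logged speed samples below are made up, and the straight-line fit v(t) = v0 + a * t is an assumed stand-in for whatever formulas a real model-replay tool would derive; the point is only that once a log has been turned into a formula, its parameters can be changed.

```python
def fit_line(ts, vs):
    """Least-squares fit of v(t) = v0 + a * t to logged speed samples."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs))
    var = sum((t - mean_t) ** 2 for t in ts)
    a = cov / var
    return mean_v - a * mean_t, a


def regenerate(v0, a, ts):
    """Evaluate the fitted model, optionally with modified parameters."""
    return [v0 + a * t for t in ts]


# Hypothetical log: lead-vehicle speed (m/s) sampled once per second.
log_t = [0.0, 1.0, 2.0, 3.0, 4.0]
log_v = [10.0, 11.1, 11.9, 13.0, 14.0]

direct_replay = log_v                    # fed back unchanged, nothing tunable
v0, a = fit_line(log_t, log_v)           # "model replay": log -> formula
harder_variant = regenerate(v0, a * 1.5, log_t)  # same scenario, stronger acceleration
print(round(v0, 2), round(a, 2), [round(v, 2) for v in harder_variant])
```

A single straight line obviously cannot describe a complex multi-vehicle scenario, which is where the difficulty and cost of real model replay come from.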

The technical threshold of model replay is very high, and the cost is not low. A simulation engineer said: "It is very difficult to convert the data recorded by sensors into simulation data, so at present this technology mostly remains at the PR level. In practice, each company's simulation tests are based mainly on algorithm-generated scenarios, supplemented by scenarios collected on real roads."

The head of simulation at an autonomous driving company said that using real traffic flow data for simulation is still very cutting-edge technology, and that adjusting the parameters of such data is very difficult (they can only be adjusted within a small range). Road-collected data is a pile of logs that record, entry by entry, what the vehicle did in the first second, the second second, and so on, unlike human-edited scenarios, which are composed of a series of formulas.

The simulation expert said that the biggest challenge of model replay is that, for complex scenarios, expressing the scenario as formulas is extremely difficult. The process can be automated, but whether the resulting scenario is usable is another question.

In 2020 Waymo announced simulation "by directly generating realistic image information from the data collected by the sensor"; ChauffeurNet in effect uses neural networks in the cloud to convert raw road data into a mathematical model and then performs model replay. But a simulation expert who has worked in Silicon Valley for many years said that this is still experimental and that it will be some time before it becomes a real product.

More meaningful than replay, the simulation expert said, is the introduction of machine learning or reinforcement learning. Specifically, after thoroughly learning the behavior habits of various traffic participants, the simulation system trains some logic of its own, expresses that logic as formulas, and then adjusts the parameters in those formulas.

However, according to Li Yue, CTO of Zhixing Zhongwei, and Feng Zonglei, its deputy general manager, their company is already able to perform model replay.

Feng Zonglei believes that whether a simulation company is capable of model replay depends mainly on the tools it uses and on its scenario management capabilities.

"In scenario management, slicing is a very important step, because not all data is valid. For example, in 1 hour of data, the truly valid data may amount to less than 5 minutes. When doing scenario management, the simulation company needs to cut out the valid part, and this process is called 'slicing'.

"After slicing is complete, the simulation company needs to build a management environment with semantic information (for example, which object is a pedestrian and which is an intersection) to make subsequent screening easier. Specifically, the data slices are first classified, then the dynamic target list is refined, and then the result is imported into the model of the simulation environment. The model then carries the corresponding semantic information. With semantic information you can adjust parameters, and the data can then be reused.

"The reason why most companies cannot adjust parameters based on real traffic flow data is that they have not done scenario management well."

Yang Zijiang, the founder of Shenxin Kechuang, said: "If you want to generalize road-collected data while preserving its authenticity, you can play back the road-collected data during scenario initialization and the opening stage, and at a certain point let a smart-npc model take over the background vehicles in the scenario, so that they no longer follow the road-collected data. After the smart-npc takes over, the generalized scenarios are recorded so that the key generalized scenarios can be played back."
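The play-back-then-take-over idea can be sketched in one dimension. Everything here is a simplifying assumption: the "smart NPC" is replaced by a trivial stand-in that keeps the last replayed speed and perturbs it with small random accelerations, and the logged positions are invented. A real smart-npc model would react to the surrounding traffic.

```python
import random


def run_scenario(logged_positions, handover_step, total_steps, dt=0.1, seed=0):
    """Replay logged positions, then let a simple stochastic model take over.

    handover_step must be at least 2 so that a speed can be estimated from
    the last two replayed positions.
    """
    rng = random.Random(seed)  # the seed makes every variant reproducible
    positions = list(logged_positions[:handover_step])
    speed = (positions[-1] - positions[-2]) / dt
    for _ in range(total_steps - handover_step):
        # Stand-in NPC behavior: random acceleration within +/- 1 m/s^2.
        speed = max(0.0, speed + rng.uniform(-1.0, 1.0) * dt)
        positions.append(positions[-1] + speed * dt)
    return positions


# Hypothetical recording: a background vehicle driving at a steady 15 m/s.
logged = [i * 1.5 for i in range(50)]
run_a = run_scenario(logged, handover_step=20, total_steps=50, seed=1)
run_b = run_scenario(logged, handover_step=20, total_steps=50, seed=2)
print(run_a[:20] == run_b[:20], run_a[20:] == run_b[20:])
```

Both runs share the recorded opening and then diverge, and because each run is tied to a seed, any variant that turns out to be interesting can be regenerated and played back.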

A simulation engineer at an OEM believes that although model replay sounds impressive, it is actually "not necessary". His reasons: modeling the data contradicts the original purpose of replay, which is to have real data; once the data has been modeled and its parameters made adjustable, it is no longer the most real. It is also time-consuming and laborious, and the data format conversion is troublesome and thankless.

The engineer said: "Since you want more scenarios, you can directly use the simulator to generate generalized scenarios on a large scale. You don't have to take the path of modeling real data."

Feng Zonglei responded:

"Using algorithms to generate scenarios directly is of course fine in the early stages of development, but the limitations are also obvious: what about the scenarios that engineers 'did not think of'? Real traffic conditions are ever-changing, and imagination cannot exhaust them all.

"More importantly, in scenarios imagined by engineers, the interactions between objects are often unnatural. For example, if a vehicle cuts in ahead of you, at what angle does it cut in? Does it cut in when it is 10 meters away from you, or 5 meters? When algorithms are used to generate scenarios, the choice of scenario parameters is often very subjective and arbitrary. Engineers come up with a set of parameters off the top of their heads and inject them into the model, but is this set of parameters representative?"

Feng Zonglei believes that while unmanned driving is still at the demo stage, algorithm-generated virtual scenarios can meet the need, but in the era of factory-installed mass production, scene generalization based on large-scale naturalistic driving data (real traffic flow data) is still very necessary.

According to a person who has had contact with Momenta: "Momenta already has the ability to use real road data for scene generalization (parameter adjustment), but their technology is only for their own use and not for the outside world."

Bao Shiqiang, head of vehicle simulation at 51 World, believes that generalizing naturalistic driving data is still relatively forward-looking but will definitely become a very important direction in the future, so his team is exploring it as well.

Summary: The two routes penetrate each other, and the boundaries become increasingly blurred

James Zhang, head of simulation at Furui Microelectronics, mentioned in a talk some time ago that Tesla's simulation uses two methods: fully virtual (algorithm-generated) scenarios are called WorldSim, and replaying real data for the algorithm to see is called LogSim. "However, the road network in WorldSim is also generated on the basis of automatic standardization of data from real roads. Therefore, the boundaries between WorldSim and LogSim are becoming increasingly blurred."

The simulation expert of Qingzhou Zhihang said: "After real scenario data is converted into a standard format, it can be generalized through rules, thereby generating more valuable simulation scenarios."

Bao Shiqiang, head of 51 World's in-vehicle simulation business, also believes that the future trend is for the two routes, simulation with real road data and simulation with algorithm-generated data, to interpenetrate.

Bao Shiqiang said: "On the one hand, generating scenarios with algorithms also depends on the engineer's understanding of real road scenes. The more thorough that understanding, the closer the modeling can be to reality. On the other hand, when real road data is used as scenarios, it is also necessary to slice and extract the data (screen out the valid part), then set parameters and trigger rules, and then perform fine-grained classification, after which the scenarios can be logicalized and expressed as formulas."

two. Scene Generalization and Scene Extraction

The "parameter adjustment" of scenario data mentioned repeatedly above is also called "scene generalization", which usually refers mainly to the generalization of virtual scenarios. In the words of a system engineer at an OEM, the advantage of scene generalization is that we can "create" scenarios that have never been seen in the real world.

The stronger the scene generalization ability of a simulation company, the more available scenes can be obtained after adjusting the parameters of a certain scene. Therefore, the scene generalization ability is also a key competitiveness of a simulation company.

However, Qingzhou Zhihang's algorithm experts said that scene generalization can be achieved through mathematical models, machine learning and other methods, but the key issue is how to ensure that the generalized scene is real and more valuable.

What are the key factors that determine whether a company's scene generalization ability is strong or weak?

Yang Zijiang, the founder of Shenxin Kechuang, believes that a big difficulty in scene generalization is how to abstract the trajectory into higher-level semantics and express it in a formal description language.

A Tier 1 simulation engineer said: it mainly depends on what language (such as OpenSCENARIO) the company's simulation tool uses to describe different traffic scenarios (the language must capture details while remaining scalable).

There are corresponding scenario languages for functional, logical, and concrete scenarios: for the first two, there are high-level scenario languages such as M-SDL; for the last, there are OpenSCENARIO, GeoScenario, and others.

Another level may be the simulation of interference behaviors, the degree of generalization of various driving behaviors and driving "personality".

443cda5c29f024e80f14042a2566fecb.png

△The chart is taken from the book "Autonomous Driving Virtual Simulation Test Evaluation Theory and Method" by Sun Jian, Tian Ye and Yu Rongjie

Yang Zijiang, the founder of Shenxin Kechuang, said: "With generalization based on traffic flow and intelligent driver models, if the model is good enough, then because of the random factors involved, running a scenario 10 times is equivalent to generalizing it 10 times."

However, Li Yue, CTO of Zhixing Zhongwei, believes that generalization must not be done for its own sake. "We must have a deep understanding of the function under test and then design a generalization plan, not generalize for the sake of generalizing, let alone generalize without bounds. Although scene generalization is virtual, it must still respect reality."

Another simulation expert also said: "At the end of the day, simulation should serve testing. We have already encountered a problem on the road, and then we look at how to solve it through simulation, rather than saying that we have a simulation technology first and then looking for what it can be used for."

A simulation expert mentioned above said that as far as he knows, there are not many companies that can truly achieve the generalization of scenarios. In most cases, parameter adjustment is done manually. "Scene generalization ability is very important, but at this stage, no company can really do it well."

Bao Shiqiang, head of the vehicle simulation business at 51 World, believes that the most important thing for scene generalization is to have a deep understanding of what kind of scenarios are needed for autonomous driving simulation tests. In fact, the problem now is not that there are too few generated scenarios, but too many, and many of them will not actually happen, so they are not considered effective test scenarios. This is caused by a lack of understanding of requirements.

According to some experts, the biggest challenge faced by third-party simulation companies is that they have insufficient understanding of what kind of simulation is required for autonomous driving because they have not personally participated in autonomous driving.

Meanwhile, the L4 autonomous driving companies that are capable and do have a deep understanding of simulation requirements lack the motivation to generalize scenarios very deeply. Because a Robotaxi usually runs only in a small area of a given city, these companies only need to collect scenario data from that area for training and testing; there is little need to generalize large numbers of scenarios that they will not encounter for a long time.

Bao Shiqiang believes that OEMs such as NIO, Xpeng, and Li Auto ("Wei Xiaoli") have a lot of real road data and no strong demand for scene generalization. For these companies, what is more urgent than scene generalization is to refine the classification and management of scenarios and to screen out the truly valid ones.

The simulation experts of Qingzhou Zhihang also believe that, as fleets grow and data from real roads expands rapidly, fully mining the valid scenarios in that data is indeed much more important for simulation companies than scene generalization. "We may explore a more intelligent generalization method, which can perform large-scale verification of the algorithm faster."

Yang Zijiang said: "For generalization at the parameter level, such as the number of lanes, the number and types of traffic participants, the weather, and key parameters such as speed and TTC (time to collision), every company's ability to generate generalized scenarios is similar. The core of scene generalization ability lies in how to identify valid scenarios and filter out invalid ones (including repeated and unreasonable ones), and the difficulty of scenario identification is that complex scenarios require identifying the relationships among multiple objects."

The above-mentioned "identifying valid scenes and filtering invalid scenes" is also called "scene extraction".
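A toy version of this filtering step is sketched below. The rule used, keeping only scenarios whose initial TTC falls inside a band, is an assumption chosen for illustration, as are the threshold values and the three-parameter scenario format; real extraction has to reason about relationships among many objects, as Yang Zijiang points out.

```python
from itertools import product


def time_to_collision(gap_m, ego_speed, lead_speed):
    """TTC in seconds; infinite when the ego vehicle is not closing in."""
    closing_speed = ego_speed - lead_speed
    return gap_m / closing_speed if closing_speed > 0 else float("inf")


def extract_valid(candidates, min_ttc_s=1.0, max_ttc_s=8.0):
    """Keep unique scenarios whose initial TTC lies within an assumed band.

    Below min_ttc_s the situation is treated as unreasonable (no system could
    react); above max_ttc_s it is treated as uneventful.
    """
    seen, valid = set(), []
    for gap, ego_v, lead_v in candidates:
        key = (round(gap, 1), round(ego_v, 1), round(lead_v, 1))
        if key in seen:
            continue  # repeated scenario
        seen.add(key)
        if min_ttc_s <= time_to_collision(gap, ego_v, lead_v) <= max_ttc_s:
            valid.append(key)
    return valid


# Parameter-level generalization: a small grid of (gap, ego speed, lead speed).
grid = list(product([10.0, 40.0], [20.0, 30.0], [10.0, 30.0]))
grid.append((10.0, 20.0, 10.0))  # deliberate duplicate
valid = extract_valid(grid)
print(len(grid), "generated,", len(valid), "kept:", valid)
```

Of the nine generated candidates only three survive, which mirrors the observation above that the problem is usually too many generated scenarios rather than too few.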

The premise of scene extraction is to first figure out what a "valid scenario" is. According to several simulation experts, in addition to the scenarios that must be tested by law, valid scenarios include two further types: scenarios that engineers define according to the algorithm's development requirements when doing forward design of the system, and scenarios that the algorithm "can't get right".

Of course, validity and invalidity are relative and depend on the company's development stage and the maturity of the algorithm: in principle, as the algorithm matures and problems are solved, many originally valid scenarios become invalid.

So how do you efficiently screen out effective scenarios?

There is an idea in academia: set certain entropy thresholds in the perception algorithm, and when the complexity of a scenario exceeds these thresholds, the perception algorithm marks that scenario as valid. But how to set the entropy threshold is a big challenge.
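One possible reading of this idea is sketched below, under the assumption that "entropy" means the Shannon entropy of the perception model's per-frame class probabilities, used as a proxy for how uncertain (and therefore how complex) the frame is. The probabilities and the 1-bit threshold are invented for the example, and the academic proposals referred to above may define the measure differently.

```python
import math


def shannon_entropy(probs):
    """Entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def flag_frames(frame_probs, threshold_bits=1.0):
    """Return indices of frames whose entropy exceeds the threshold."""
    return [
        index
        for index, probs in enumerate(frame_probs)
        if shannon_entropy(probs) > threshold_bits
    ]


# Hypothetical class probabilities (car, pedestrian, cyclist) for three frames.
frame_probs = [
    [0.97, 0.02, 0.01],  # confident detection, low entropy
    [0.40, 0.35, 0.25],  # ambiguous detection, high entropy
    [0.50, 0.50, 0.00],  # exactly 1 bit, not above the threshold
]
flagged = flag_frames(frame_probs)
print(flagged)
```

The sketch also shows why the threshold is hard to choose: moving it slightly changes which borderline frames, such as the third one, count as valid.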

A simulation company adopts the "elimination method", that is, if an algorithm that originally performed very well has "problems frequently" in some generalization scenarios, then this scenario has a high probability of being an "invalid scenario" and can be ruled out.

A system engineer from an OEM said: "At present there is no good method for scenario screening. If you are not sure, put the scenario into cloud simulation and run it. You can always compute these extreme scenarios there and then verify those extreme cases on your own HIL or VIL bench, which is much more efficient."

This article draws on a great deal of practical knowledge from the WeChat public account "Che Lu Slowly". Its author, Li Manman, is a simulation engineer, and the account focuses on organizing simulation expertise; readers interested in this field are encouraged to follow it.


References:

A Super-Comprehensive Review of Autonomous Driving Simulation: From Simulation Scenarios, Systems to Evaluation

https://zhuanlan.zhihu.com/p/321771761


Origin blog.csdn.net/jiuzhang_0402/article/details/128310400