Big data + VR panorama technology reshape the "used car buying scene"

Background

1 Introduction

The core problem of second-hand car trading is the opacity of vehicle condition information. China's second-hand car trading market is still immature: for a long time it has lacked industry-recognized standards for vehicle valuation and vehicle condition inspection, and the valuation and condition information provided by second-hand car dealers is not transparent enough. This traps users and dealers in a vicious circle: users have little trust in dealers and low willingness to buy, while dealers, short of sales leads, resort to false information to attract customers, which further degrades the market environment. To make vehicle condition information more transparent, Autohome Used Car has continuously improved the "vehicle history file", so that the inspection rate of used car accident records has reached 98% and that of maintenance records has reached 85%. The platform also runs an offline inspection business to obtain real vehicle condition data and enrich the archive.

At this stage, the various kinds of vehicle information have been integrated at the physical level, but the analysis of their semantic content and the visual presentation of the information still need further work. Users have to read the collision, maintenance, and battery reports themselves to understand them, so the richness, professionalism, and readability of the reports strongly influence the transaction decision. For example, users browsing the app may be attracted by photos of a car's exterior and interior, yet be unable to understand the corresponding collision, maintenance, and battery reports because they are unfamiliar with vehicle body structure and vehicle condition inspection standards, and the transaction fails to convert.

2 Goals

Autohome Used Car uses its digital capabilities and data resources to keep promoting the transparency and standardization of vehicle condition information, making that information easier for users to understand and improving both user decision-making efficiency and lead conversion efficiency. Specifically, Autohome Used Car combines machine learning, natural language processing, and VR panorama technology to reshape the used car buying scene, presenting each used car listing along three dimensions: valuation, vehicle history, and VR panorama. The information is integrated and presented to the user as an interactive visualization, so that the user can understand the condition and valuation of a listing more quickly, intuitively, and in detail. This reduces the user's cost of searching for and understanding information and helps the user reach a transaction decision.

Figure 1. Comparison of traditional used car buying scenarios and digital used car buying scenarios

As shown in the figure, a traditional second-hand car transaction requires the user to make an appointment with the dealer and view the car offline without fully understanding the vehicle information, and then to make a subjective judgment based on the viewer's own experience and knowledge. In the digital buying scenario, users obtain standardized vehicle information directly from the cloud through the PC site and the app, form a preliminary evaluation, and only then decide whether to view the car offline, which makes offline viewing more efficient. By creating this digital experience for users, Autohome Used Car both facilitates car purchase transactions and drives business growth under the new car buying model.

New model of car buying

1 Business scenario

Figure 2. Used car buying business structure 

The business process architecture for buying a used car is shown in the figure. The structured data comes from the vehicle data, transaction records, and other data of used cars on Autohome's used car trading platform. The vehicle data includes province, city, model, license plate, mileage, release time, and transfer times; the transaction records include transaction price, transaction type, and vehicle condition inspections. These structured data are used to train valuation models that predict the current price and future price trend of a vehicle.

Semi-structured data refers to vehicle accident records obtained from third parties, 4S shop maintenance records, Tiantianpai offline inspection records, and battery data records. These records come in various data types and need to be converted into a unified data format, after which their semantic content is analyzed to extract structured information. The battery data of new energy vehicles is processed and analyzed to generate an online battery inspection report. Together these yield a multi-dimensional vehicle history report covering maintenance, collision, and battery.

Panoramic data refers to the original image data captured by the VR exterior camera and the VR interior camera. The VR shooting component turns the original images into a VR picture, which is then displayed through the VR playback components in the app and on the H5 side. The structured information extracted from these records not only forms the vehicle history report but can also be semantically aligned, across modalities, with the images in VR. For example, if a vehicle history report mentions "left front door collision", the VR display can mark the left front door as abnormal. Valuation, vehicle history, and the VR display are presented together in the user interface.

When users browse the details of a used car listing on the PC site or in the app, they can check the vehicle valuation, query the vehicle history report, and view the car in VR panorama in the same user interface. They can then judge from three perspectives (value, vehicle condition, and exterior and interior appearance) whether the vehicle meets their needs, and decide whether to buy or to leave a purchase lead.

2 Technical difficulties

(1) Valuation: Vehicle data is very complex. It usually includes up to hundreds of feature dimensions, such as region, vehicle age, mileage, model, car series, exterior, interior, and vehicle condition. Some of these features have missing values, and there are complex multicollinear relationships between features. This poses three major challenges for a used car price prediction model: prediction accuracy, the computational efficiency of model inference, and model interpretability. Although existing machine learning techniques such as neural networks or gradient boosting tree models can process complex features end to end, the complexity of vehicle feature data makes such methods a poor fit for used car price prediction, and the resulting valuation models are less accurate. To address these three challenges, our valuation model adopts a divide-and-conquer approach: it groups the vehicle listings by province, city, and vehicle model, quantifies the time-related data within each group, selects features according to their correlation, and trains a multiple linear regression model.

(2) VR panorama: Existing VR exterior solutions shoot with an SLR camera and a telephoto lens, capturing the vehicle exterior through 360° in a studio with a turntable, or shoot the four sides with an SLR and then generate the 360° panoramic image through manual post-processing. The disadvantages are that the SLR, studio, and turntable are expensive and the shooting conditions are demanding; the vehicle has to be transported to the studio by dedicated staff, which is inefficient; and the image post-processing is cumbersome. Images obtained by app-guided shooting on a mobile phone plus manual post-processing, on the other hand, are not accurate enough, and the manual post-processing takes a long time. The newly designed and developed used car VR viewing solution combines vehicle model diagrams, vehicle contour recognition, the gyroscope, and the magnetic field sensor to compute the position of the vehicle being shot within the venue, and gives photographers a convenient positioning and shooting solution.

(3) Vehicle history archives: Maintenance records, collision records, and battery charge and discharge records also suffer from very high data dimensionality, inconsistent data quality, and a lack of standardization. For example, maintenance and collision records come from various sources, including semi-structured record forms, record documents, and even photographed or scanned document images. These sources need to be processed and normalized into a unified data format. To extract vehicle condition information, it is necessary to define the types of information to extract based on domain expert knowledge, to establish knowledge models for vehicle condition assessment and battery condition assessment together with the corresponding standardized terminology vocabulary, and to establish scoring and rating models for vehicle condition and battery condition.

3 Implementation method

3.1 Valuation

Figure 3. Valuation Model 

Vehicle valuation is an important part of a used car transaction. During the transaction, the used car needs to be evaluated and priced from its vehicle information to obtain a reasonably accurate valuation range. We have developed a vehicle valuation model based on Autohome's rich used car listing data to meet the needs of merchants and users in evaluating listing prices.

The vehicle source data used by our valuation model mainly includes geographic area, model, mileage, registration time, and vehicle release time. First, we extract the geographic area and model from the vehicle source data and group the remaining dimensions of the data by geographic area and model. We then quantify the time-related data in each group and use the processed data as training data for a multiple linear regression model. The model is defined as follows:

Y = θ0 + θ1·t1 + θ2·t2 + θ3·t3

where Y is the estimated price, θ0 is the intercept, t1 is the registration time, t2 is the mileage, t3 is the time when the user released the vehicle information, and θ1, θ2, θ3 are the corresponding regression coefficients.
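As a rough illustration of the regression described above, the following Python sketch fits Y = θ0 + θ1·t1 + θ2·t2 + θ3·t3 for a single group by ordinary least squares, using only the standard library. The function names, the units chosen for t1, t2, and t3, and the sample numbers are assumptions made for this example; they are not taken from the production system.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_valuation_model(rows: Sequence[Tuple[float, float, float]],
                        prices: Sequence[float]) -> List[float]:
    """Fit [theta0, theta1, theta2, theta3] via the normal equations."""
    design = [[1.0, *row] for row in rows]  # leading 1.0 carries the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, prices)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_price(theta: Sequence[float], t1: float, t2: float, t3: float) -> float:
    """Evaluate Y = theta0 + theta1*t1 + theta2*t2 + theta3*t3."""
    return theta[0] + theta[1] * t1 + theta[2] * t2 + theta[3] * t3


# Synthetic data for ONE (province, city, model) group. Assumed units:
# t1 = months since registration, t2 = mileage in 10,000 km,
# t3 = months since the listing was released, price in 10,000 yuan.
SAMPLE_ROWS = [(12, 1.0, 0), (24, 3.0, 1), (36, 4.0, 0),
               (48, 8.0, 2), (60, 7.0, 1), (30, 2.5, 3)]
SAMPLE_PRICES = [18.3, 16.05, 14.4, 11.1, 10.45, 15.6]

theta = fit_valuation_model(SAMPLE_ROWS, SAMPLE_PRICES)
print([round(value, 4) for value in theta])
```

A linear model of this size keeps both training and inference cheap, and each coefficient can be read directly as the price change per unit of age, mileage, or listing time, which matches the interpretability goal stated earlier.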

 Table 1. Intercepts and regression coefficients of estimation models corresponding to different geographical regions and different vehicle types

We construct multiple vehicle valuation models for the different models in each geographic area: each province corresponds to multiple vehicle valuation models, and each combination of province, city, and model corresponds to one vehicle valuation model. Since vehicle prices differ across provinces and models, training separate valuation models for different geographic regions and models effectively reduces prediction error and makes the estimates more accurate. Training yields the intercept and regression coefficients for each vehicle model in each geographic region.

Figure 4. Predicting valuations based on information

Figure 5. Historical deals and recommendations

This valuation model is therefore essentially an integrated model: the top layer classifies by province, city, and vehicle model, and the bottom layer consists of the prediction models for the corresponding categories. To produce a valuation with the trained models, we first select the vehicle valuation model matching the geographic region and vehicle model received from the client, then feed the registration time, the vehicle information release time, and the mileage received from the client into the selected model, which outputs the corresponding vehicle valuation.
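The two-layer lookup can be sketched in a few lines of Python. This is only an illustration of the idea: the table contents, the key layout, and the name `estimate_price` are hypothetical, and the coefficients are made-up numbers rather than values from Table 1.

```python
from typing import Dict, Tuple

GroupKey = Tuple[str, str, str]                   # (province, city, vehicle model)
Coefficients = Tuple[float, float, float, float]  # (theta0, theta1, theta2, theta3)

# Hypothetical coefficient table: one trained regression per group.
MODELS: Dict[GroupKey, Coefficients] = {
    ("Beijing", "Beijing", "model-A"): (15.0, -0.08, -0.40, -0.02),
    ("Guangdong", "Shenzhen", "model-A"): (15.5, -0.07, -0.45, -0.02),
}


def estimate_price(models: Dict[GroupKey, Coefficients],
                   province: str, city: str, vehicle_model: str,
                   t1: float, t2: float, t3: float) -> float:
    """Top layer: pick the group's model. Bottom layer: evaluate the regression."""
    key = (province, city, vehicle_model)
    if key not in models:
        raise KeyError(f"no valuation model trained for {key}")
    theta0, theta1, theta2, theta3 = models[key]
    return theta0 + theta1 * t1 + theta2 * t2 + theta3 * t3


print(estimate_price(MODELS, "Beijing", "Beijing", "model-A", t1=36, t2=5.0, t3=1))
```

Because the top layer is a plain dictionary lookup, serving a valuation costs one hash lookup plus a handful of multiplications. How the real system handles a group with no trained model is not described in the article; raising an error here is simply the most conservative choice for a sketch.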

3.2 VR panorama

As VR technology becomes more widespread, it offers users a novel form of content presentation. Because every used car is in a different condition, VR technology is used to capture interior and exterior image data for each car in a merchant's inventory. Once the vehicle information is released, users get a more intuitive and realistic view of the vehicle condition: online listings are displayed through 360°, and exterior and interior details can be browsed with no blind spots, which improves the browsing experience. This improves user decision-making and lead conversion and raises the store visit conversion rate, while also giving merchants higher-quality leads and a higher user arrival rate.

Figure 6. VR panoramic shooting technical process

Shooting plan: The app loads 30 model diagrams of the vehicle model and model year selected by the user. A set of 360° exterior pictures requires 30 pictures taken from different angles. With the vehicle as the center of a circle, station points are laid out every 12°. The station points are tightly bound to the angles of the model diagrams: each diagram corresponds to one station point. Using the phone's built-in gyroscope and electronic compass, the app computes accurate angle and position information for the photographer, who can check whether their position matches the model diagram. Real-time image contour recognition provides accurate distance guidance, which removes the cumbersome steps of measuring and marking shooting points by hand. When the photographer presses the shutter button, the program analyzes the captured picture, keeps the sharp image of the vehicle inside the vehicle contour, generates a 20% Gaussian blur layer for the background area outside the contour, feathers the edges, and flattens all layers to obtain the final exterior picture for that angle. This exterior shooting scheme simplifies manual image processing. The recognition algorithm automatically produces the desired exterior picture with a sharp vehicle and a blurred background, which greatly simplifies 360° exterior shooting: exterior and interior shooting can be completed within 10 minutes and uploaded directly to the platform for display.
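The station-point geometry is simple enough to sketch. The Python below assumes the app has already reduced the gyroscope and compass readings to a single bearing of the photographer around the vehicle, measured in degrees from the front of the car; that reduction, the 2° tolerance, and the function names are assumptions for illustration and are not described in the article.

```python
from typing import List, Tuple

STATION_COUNT = 30                 # one station per model diagram
STEP_DEG = 360.0 / STATION_COUNT   # 12 degrees between neighbouring stations


def station_angles() -> List[float]:
    """Angles of all station points around the vehicle, in degrees."""
    return [i * STEP_DEG for i in range(STATION_COUNT)]


def nearest_station(bearing_deg: float) -> Tuple[int, float]:
    """Return (station index, signed offset in degrees) for a bearing.

    A positive offset means the photographer is past the station in the
    direction of increasing angle; a negative offset means short of it.
    """
    bearing = bearing_deg % 360.0
    index = round(bearing / STEP_DEG) % STATION_COUNT
    # Wrap the difference into [-180, 180) so 355 degrees is 5 short of station 0.
    offset = (bearing - index * STEP_DEG + 180.0) % 360.0 - 180.0
    return index, offset


def is_on_station(bearing_deg: float, tolerance_deg: float = 2.0) -> bool:
    """True when the photographer is close enough to a station to shoot."""
    _, offset = nearest_station(bearing_deg)
    return abs(offset) <= tolerance_deg


print(nearest_station(25.0), is_on_station(25.0))
```

The signed offset is what a guidance overlay would need to tell the photographer which way to step. Distance guidance from contour recognition and the blur compositing step are outside the scope of this sketch.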

Figure 7. VR panorama multi-platform integration solution

Integrated multi-terminal shooting and viewing solution (mobile phone App shooting + App dual-terminal VR playback component + H5 VR playback component):
1. Self-developed 360° VR exterior shooting component for the mobile phone App.
2. Self-developed integrated interior VR shooting component, which supports connecting to and shooting with VR cameras of multiple brands.
3. Self-developed native exterior player control for the App.
4. Exterior H5 player built through secondary development on ThreeSixty.
5. Interior 360° H5 player based on Kpano.

3.3 Vehicle History Files

Figure 8. Vehicle history report generation

 

Figure 9. Example of a partial vehicle history report

Figure 10. Example of a partial battery report

Vehicle accident records, 4S shop repair and maintenance records, and Tiantianpai offline inspection records come in various forms. Image data first has to be converted into a unified document format through OCR, and structured information is then extracted from the documents. First, the knowledge models for vehicle condition assessment and battery condition assessment, together with the corresponding standardized terminology vocabulary, are established to settle what information needs to be extracted, how the pieces of information relate to each other, and how the information is used. Specifically, the NLP model extracts quantitative information (time, mileage, maintenance or claim amount, and so on), entity information (key parts of the car, such as the A-pillar and B-pillar), the corresponding orientation words (such as front and front left), and verbs (such as cutting, sheet metal, and welding). It then establishes the relationships between entities, orientation words, and verbs according to syntactic annotations, forming semantic phrases such as "left-A-pillar-welding". Such a semantic phrase is the smallest semantic unit for describing a vehicle's collision and maintenance history. Because of irregularities in the original records or errors in OCR recognition, the description of key parts in a record may be inaccurate or incomplete, so the extracted terms also have to be normalized against the pre-established standard noun, verb, and orientation word vocabularies to obtain standardized key part nouns, verbs, and the corresponding semantic phrases.
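To make the normalization step concrete, here is a minimal Python sketch that maps noisy record text onto standardized terms and assembles a semantic phrase. It deliberately replaces the NLP model and syntactic annotation with plain dictionary lookup, and the three small vocabularies are invented for the example; the real vocabularies and extraction model are not given in the article.

```python
from typing import Dict, Optional

# Hypothetical vocabularies: each variant spelling maps to a standard term.
ORIENTATIONS: Dict[str, str] = {"left": "left", "right": "right"}
PARTS: Dict[str, str] = {
    "a-pillar": "A-pillar", "a pillar": "A-pillar", "apillar": "A-pillar",
    "b-pillar": "B-pillar", "b pillar": "B-pillar",
}
ACTIONS: Dict[str, str] = {
    "welding": "welding", "welded": "welding", "weld": "welding",
    "cutting": "cutting", "cut": "cutting",
    "sheet metal": "sheet metal",
}


def find_term(text: str, vocab: Dict[str, str]) -> Optional[str]:
    """Return the standard term for the longest variant found in text."""
    lowered = text.lower()
    for variant in sorted(vocab, key=len, reverse=True):
        if variant in lowered:
            return vocab[variant]
    return None


def to_semantic_phrase(record: str) -> Optional[str]:
    """Build an 'orientation-part-action' phrase, or None if incomplete."""
    orientation = find_term(record, ORIENTATIONS)
    part = find_term(record, PARTS)
    action = find_term(record, ACTIONS)
    if part is None or action is None:
        return None  # not a collision or repair statement we can describe
    return "-".join(term for term in (orientation, part, action) if term)


print(to_semantic_phrase("Left A pillar welded"))
```

Substring matching like this breaks down quickly on real records (negation, several parts in one sentence, OCR character errors), which is presumably why the pipeline relies on an NLP model with syntactic annotations before normalization.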

Figure 11. Knowledge model for vehicle condition inspection and classification

Figure 12. Semantic alignment of car history report and VR image

According to the inspected parts and event types, the vehicle condition inspection is divided into 8 dimensions: skeleton inspection, reinforcement inspection, flood damage inspection, fire inspection, mileage inspection, exterior parts, gearbox/engine inspection, and airbag inspection. Among these, the inspection information for exterior parts can be semantically aligned with the VR images and presented visually in VR. Based on the standardized relationships between key part nouns and verbs, rating rules are formulated for each dimension, and the extracted standardized semantic phrases are mapped to one of four rating levels, A, B, C, or D. Finally, the ratings of the 8 dimensions are combined with information such as the vehicle's accident records, claim amounts, and the new car guide price to produce a comprehensive evaluation of the vehicle condition, graded as excellent, good, medium, or poor. The extracted semantic phrases, events, and quantitative information are used to generate the vehicle's collision history details, maintenance history details, and historical mileage details.
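The article does not publish the actual rating rules, so the following Python sketch only shows the shape such a rule could take. Both rules in it are assumptions invented for illustration: the overall grade starts from the worst of the 8 dimension ratings, and it drops one further level when total claims exceed a hypothetical 30% of the new car guide price.

```python
from typing import Iterable

GRADES = ("excellent", "good", "medium", "poor")
VALID_RATINGS = ("A", "B", "C", "D")


def overall_grade(dimension_ratings: Iterable[str],
                  claim_total: float,
                  guide_price: float,
                  claim_ratio_limit: float = 0.3) -> str:
    """Combine 8 per-dimension ratings into one overall grade.

    Assumed rule 1: the worst dimension sets the starting level.
    Assumed rule 2: a high claims-to-guide-price ratio costs one more level.
    """
    ratings = list(dimension_ratings)
    if len(ratings) != 8:
        raise ValueError("expected one rating for each of the 8 dimensions")
    if any(rating not in VALID_RATINGS for rating in ratings):
        raise ValueError("ratings must be one of A, B, C, D")
    level = max(VALID_RATINGS.index(rating) for rating in ratings)
    if guide_price > 0 and claim_total / guide_price > claim_ratio_limit:
        level = min(level + 1, len(GRADES) - 1)
    return GRADES[level]


print(overall_grade(["A"] * 7 + ["B"], claim_total=60000, guide_price=150000))
```

Expressing the rules as small pure functions keeps them auditable by the domain experts who define them, whatever the real thresholds turn out to be.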

With the rapid development of the new energy vehicle market, Autohome Used Car has also accumulated tens of thousands of new energy vehicle owners and users who want to buy new energy vehicles. Besides vehicle maintenance, collision, and mileage history, new energy vehicle users have a strong demand for evaluations of battery performance and battery life. To this end, Autohome Used Car and BIT Xinyuan have used new energy vehicle battery big data to build an intelligent vehicle condition cloud platform for new energy used cars. The platform processes and rates battery data and is used in related products such as Autohome and Used Carhome. With one click, users can generate a one-stop online inspection report for a new energy vehicle battery, with real-time evaluation of battery performance and online testing of cruising range.

The battery inspection report records the battery's factory data, comprehensively checks and evaluates battery performance across battery evaluation data, charge and discharge data, driving data, and abnormal situation data, and calculates a reference cruising range. By analyzing these dimensions together, a battery condition scoring and rating model is built that predicts a battery performance score and assigns one of four grades (excellent, good, medium, or poor) according to the score.

Summary

Autohome Used Car has conducted in-depth research on used car vehicle data and its visual display, and has established standardized data processing pipelines, method models, and visual display forms. Faced with massive and complex vehicle data, it built an integrated valuation model using a divide-and-conquer approach, which greatly improves valuation accuracy and lets users accurately understand a vehicle's current value. It established a standardized vehicle history knowledge model and uses algorithmic models and rule-based methods to structure collision, maintenance, and battery information; the online inspection report for new energy vehicle batteries in particular is a leading innovation in the industry. In visual display, the innovative use of software technology solves the high cost and long turnaround caused by traditional VR technology's heavy reliance on hardware and manpower, enabling merchants to shoot 360° panoramic images easily and improving the browsing experience for car buyers. Information along these three dimensions is analyzed and integrated through digital technology, reshaping the digital business scene of buying a used car.

Used car buying is a critical business line for Autohome Used Car. When users make transaction decisions, credible and complete vehicle information, and the way users interact with that information, play a vital role. The vision of Autohome Used Car is to keep advancing the digital transformation of the business, build a fully digital system for used car circulation, standardize non-standard products, make the process transparent, and establish a new model that empowers the digital transformation of the used car industry.

Origin blog.csdn.net/autohometech/article/details/128016967