Big data and artificial intelligence for building smart cities

PaddleSpatial is a spatiotemporal big data computing tool and platform built on Baidu's PaddlePaddle deep learning framework. It integrates Baidu's leading spatiotemporal data processing capabilities, such as regional segmentation, time-series prediction, and urban transfer learning. In this talk, Zhou Jingbo, senior researcher of Baidu Research Institute and technical director of PaddleSpatial, explains how PaddleSpatial combines deep learning with spatiotemporal big data to support the development of smart cities.

This article mainly covers four parts:

  • Baidu's big data and artificial intelligence technologies help build smart cities
  • Introduction to PaddleSpatial's open source algorithms
  • Application of Urban Cognitive Computing in Smart Cities
  • Exploration in intelligent transportation

1. Baidu big data and artificial intelligence technologies help build smart cities

First, let's talk about the methodology behind our project. We can use a triangle to represent the relationship between big data, artificial intelligence and smart cities. Baidu's big data advantage shows up in several ways. Baidu is the only artificial intelligence company in China that has both massive search data and massive map data: it is the largest search engine company in China, and Baidu Maps is the most widely used mobile map app in China. This massive, dynamic spatiotemporal data and Internet data is what digitizes the city, allowing us to perceive how the city changes more quickly and directly, and it provides the data foundation needed to build smart cities. At the same time, big data fuels artificial intelligence and is essential for training machine learning models. In this way, big data and artificial intelligence together provide the data and technical support needed to build smart cities.

2. Introduction to PaddleSpatial's open source algorithms

The purpose of PaddleSpatial is to make full use of Baidu's spatiotemporal big data capabilities and artificial intelligence technology to build open source tools for basic algorithms that support upper-level urban applications.

At present, the main open source algorithm modules of PaddleSpatial include regional segmentation, community attribute prediction, spatial graph neural networks, urban transfer learning, and urban time-series prediction. On top of these, we have developed a number of urban computing applications, including regional portraits, travel portraits, urban quantitative analysis, and intelligent transportation. Below I will briefly go through them one by one.

First of all, let's introduce several of the standout algorithm modules in PaddleSpatial.

The first is a spatial graph neural network model that supports relationship prediction between spatial points. Graph neural networks have made great progress in the past two years, but they usually treat the graph as a pure topology and model only the connections between nodes. In geographic data, however, the relative positions and angles between points are very important, and modeling the distance and angle between two points is something current graph neural networks do not handle well. We therefore propose a spatially adaptive graph neural network algorithm, which integrates the relative distance and angle between two points in geographic space into the graph neural network framework.
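
To make the idea of distance and angle information concrete, here is a minimal sketch of how such values could be computed as edge features for a pair of geographic points. It is not PaddleSpatial's implementation: the function names, the choice of great-circle distance, and the sine/cosine encoding of the bearing are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb))
    return math.degrees(math.atan2(y, x)) % 360.0


def edge_features(src, dst):
    """Hypothetical edge feature vector: (distance_km, sin(bearing), cos(bearing)).

    Encoding the angle as sin/cos avoids the jump between 359 and 0 degrees.
    """
    distance = haversine_km(src[0], src[1], dst[0], dst[1])
    theta = math.radians(bearing_deg(src[0], src[1], dst[0], dst[1]))
    return distance, math.sin(theta), math.cos(theta)
```

A graph neural network could then attach `edge_features(src, dst)` to each edge and use it when aggregating messages from neighbours, so that two neighbours at the same hop distance but different directions are no longer treated identically.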

The second function is urban area segmentation. It divides the city into fine-grained units based on road network data and then takes these basic units as the object of study. By analogy with natural language processing, urban area segmentation provides the equivalent of "word segmentation" for urban space. In our comparisons, the accuracy of PaddleSpatial's open source urban area segmentation method is much higher than that of existing methods. PaddleSpatial can divide the country into 1.5 million blocks, providing a solid foundation for further urban research.
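
As a toy illustration of splitting space along roads, the sketch below labels connected non-road cells on a small rasterized grid. This is only a stand-in to show the idea; it is not the PaddleSpatial segmentation algorithm, and the grid encoding (`#` for road, `.` for land) and the function name are assumptions.

```python
from collections import deque


def segment_blocks(grid):
    """Label each connected group of non-road cells with a block id.

    grid: list of equal-length strings, '#' marks a road cell, '.' marks land.
    Returns (labels, block_count), where labels[r][c] is a block id >= 0,
    or -1 for road cells.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "#" or labels[r][c] != -1:
                continue
            # Breadth-first flood fill: roads act as block boundaries.
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] != "#" and labels[nr][nc] == -1):
                        labels[nr][nc] = next_id
                        queue.append((nr, nc))
            next_id += 1
    return labels, next_id


# Two crossing roads split this toy city into four blocks.
city = [
    "..#..",
    "..#..",
    "#####",
    "..#..",
    "..#..",
]
labels, block_count = segment_blocks(city)
```

A production system would work on vector road geometries rather than a raster grid, but the principle is the same: roads are the boundaries, and each enclosed area becomes one unit of analysis.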

Once the urban areas have been segmented, the next step is to study their attributes, such as population, housing prices, and population distribution. For this we have developed a sparse label prediction algorithm for urban areas. A typical application is urban village detection: urban villages exist in many cities, but they are few in number, so labeled examples are very sparse. With our algorithm, urban villages can be detected across an entire city. Preliminary experimental results also show that our method significantly outperforms state-of-the-art methods.

After modeling the blocks themselves, we also study the flows of people between them, such as how much traffic moves between different blocks. For this we propose a dedicated model for flow prediction between urban areas. Its innovations include a variational directed graph autoencoder, prior distribution alignment for fusing multimodal urban information, and a Poisson-based decoder. The model achieves very good results on existing inter-regional flow data.
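
The talk only names a Poisson-based decoder without giving details. One common reading, assumed here, is that the decoder outputs a log-rate for each pair of regions and is trained with the Poisson negative log-likelihood, which suits non-negative integer flow counts. The sketch below shows that loss in plain Python; the function names are hypothetical.

```python
import math


def poisson_nll(observed_count, log_rate):
    """Negative log-likelihood of an integer count under Poisson(exp(log_rate)).

    -log P(k | rate) = rate - k * log(rate) + log(k!)
    """
    rate = math.exp(log_rate)
    return rate - observed_count * log_rate + math.lgamma(observed_count + 1)


def mean_poisson_nll(counts, log_rates):
    """Average Poisson loss over a batch of (observed count, predicted log-rate) pairs."""
    if len(counts) != len(log_rates):
        raise ValueError("counts and log_rates must have the same length")
    return sum(poisson_nll(k, z) for k, z in zip(counts, log_rates)) / len(counts)
```

Predicting the log of the rate keeps the rate itself positive without any clipping, and the loss is smallest when the predicted rate equals the observed count.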

3. The application of urban cognitive computing in smart cities

So far I have focused on PaddleSpatial's distinctive algorithm tools. Next, I will turn to the urban cognitive computing projects under way in our laboratory. Our goal is to build cognitive computing capabilities for urban space, so that we can better understand cities and benefit their residents.

We first introduce regional portraits. A regional portrait includes crowd portraits, life maps, frequently visited areas, regional indexes, land use distribution, and functional distribution. Regional portraits offer fine-grained analysis and identification: they can dynamically perceive the distribution and characteristics of the flow of people between regions and analyze urban functional areas in real time. Their coverage is also fine-grained, supporting five spatial levels from the province down to the street. The underlying data supports analysis of user characteristics and functional facilities across the whole country.

Alongside the regional portrait, we have also built a travel portrait of the city. Unlike regional portraits, travel portraits take the OD pair, formed by an origin and a destination, as the basic unit of study, and describe the movement attributes of the crowd at a fine-grained level. Using massive historical population data, we carry out quantitative index analysis and predictive modeling to improve the city's ability to perceive what is happening.
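
A minimal sketch of what treating the OD pair as the basic unit can look like in code: individual trips are grouped by (origin, destination), then summarized per pair. The trip record fields (`origin`, `destination`, `mode`) and the function names are assumptions for illustration and do not reflect the actual data schema.

```python
from collections import Counter, defaultdict


def od_flows(trips):
    """Count trips per (origin, destination) pair."""
    return Counter((t["origin"], t["destination"]) for t in trips)


def mode_share(trips):
    """For each OD pair, the fraction of trips made with each transport mode."""
    by_pair = defaultdict(Counter)
    for t in trips:
        by_pair[(t["origin"], t["destination"])][t["mode"]] += 1
    shares = {}
    for pair, modes in by_pair.items():
        total = sum(modes.values())
        shares[pair] = {mode: n / total for mode, n in modes.items()}
    return shares


# Made-up example trips between three hypothetical blocks.
trips = [
    {"origin": "A", "destination": "B", "mode": "subway"},
    {"origin": "A", "destination": "B", "mode": "car"},
    {"origin": "B", "destination": "C", "mode": "car"},
]
flows = od_flows(trips)
shares = mode_share(trips)
```

Note that the pair is directed: trips from A to B and trips from B to A are counted separately, which matters for commuting patterns.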

Here is the system we built for regional portraits and travel portraits. Let's look at the travel portrait first. For each OD pair formed by an origin and a destination, we can carry out fine-grained crowd analysis and show, for different groups of people on the same route, dimensions such as flow volume and mode of transportation. The regional portrait supports analysis of the population of an area and its attributes. We can build a life map for each area and analyze the demand for and supply of food, clothing and travel there. We can also analyze the places that people in the area visit frequently and observe origins and destinations at different levels. In addition, we have constructed fine-grained regional indexes, including a convenience index and a quality of life index. Finally, we can analyze the land use and functional distribution of the area. The land use distribution reflects how the area has been planned and positioned, such as residential land and educational land, while the functional distribution reflects the service functions that have formed spontaneously in the area, such as leisure and entertainment or catering.

We also study the relationships between cities, so that patterns learned in one city can be applied to others. During the 2020 epidemic, our laboratory worked with the Baidu Maps team on a project to identify communities at high risk of COVID-19 infection. The project combines multimodal learning with urban area transfer learning to design an identification algorithm for urban areas with a high risk of COVID-19 infection. When COVID-19 first broke out, only Wuhan experienced a large-scale outbreak. By examining the characteristics of the Wuhan communities where large outbreaks occurred, we can extract patterns and use them to guide other cities. The model we developed can identify the factors that make a community high-risk, so that the government can take targeted measures for different areas and improve its epidemic control capabilities.

The platform capabilities built on PaddleSpatial have been applied in exemplary projects with the Xiong'an New Area Management Committee, the United Nations Development Programme, and the Beijing Jiaotong Research Institute. Among them, our Xiong'an big data report was covered by more than 50 authoritative media outlets, including People's Daily Online, The Paper, China News Network, and Hebei Satellite TV. We also jointly built the "China Happy City Laboratory" with Xinhua News Agency's Lookout, which has been running since 2018. It provides technical support for the ranking of happy cities in China led by Xinhua News Agency and has had broad influence. Recently, we have also extended the capabilities of CityCube to support intelligent transportation. We are committed to helping the urban transportation operator model grow and develop, and we have achieved good results.

We designed the country's first urban happiness index framework based on big data and artificial intelligence. The framework has a "9+X" structure. The "9" refers to nine first-level indexes, which together include more than 100 sub-indicators and aim to cover every aspect of residents' clothing, food, housing and transportation. The "quality of life" index, for example, covers 8 secondary indicators such as "cultural, sports and leisure level" and "green space per capita", and each secondary indicator is further divided into tertiary indicators.
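
The talk does not say how the indicators at each level are combined. As a sketch only, the code below assumes that a parent index is the weighted mean of its children, with equal weights by default, and rolls scores up through a small tree. The tree structure, the weights, the tertiary indicator names and all scores are made up for illustration.

```python
def aggregate(node):
    """Score of an index node.

    A leaf carries its own 'score'. A parent's score is the weighted mean of
    its children's scores; a child's 'weight' defaults to 1.0 (an assumption).
    """
    children = node.get("children")
    if not children:
        return node["score"]
    total_weight = sum(child.get("weight", 1.0) for child in children)
    weighted_sum = sum(child.get("weight", 1.0) * aggregate(child) for child in children)
    return weighted_sum / total_weight


# Illustrative fragment of one first-level index; all numbers are invented.
quality_of_life = {
    "name": "quality of life",
    "children": [
        {
            "name": "cultural, sports and leisure level",
            "children": [
                {"name": "hypothetical tertiary indicator 1", "score": 80.0},
                {"name": "hypothetical tertiary indicator 2", "score": 60.0},
            ],
        },
        {"name": "green space per capita", "score": 90.0},
    ],
}
score = aggregate(quality_of_life)
```

With equal weights, the first secondary indicator scores 70.0 and the first-level index scores 80.0. An extra "X" index would simply be one more child of the overall root under this assumed scheme.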

Last year, the framework also added an X index to cover each year's current-affairs hotspot. The X index for 2020 is a "big data anti-epidemic index" developed in response to the COVID-19 epidemic. Compared with traditional questionnaire surveys and statistical methods, the Baidu City Happiness Index is more scientific, innovative and comprehensive.

PaddleSpatial's capabilities also help us assess and analyze urban development trends, and provide data and technical support for city managers' decisions. Recently, we have supported urban big data reports for cities such as Xiong'an and Wenzhou. The Xiong'an big data report in particular has been produced since 2018 and is now in its fourth consecutive edition, and it has attracted extensive media coverage.

4. Exploration in intelligent transportation

Finally, I will briefly summarize some of our explorations in intelligent transportation. In the first half of this year we developed an industry-leading trajectory restoration module. From the data captured at different intersections, it reconstructs a vehicle's complete trajectory through the city. As a highlight feature of Baidu's traffic brain, it has been deployed on the traffic brains of Baoding and Yizhuang. In addition, building on the existing urban road network and traffic forecasts, we are working with traffic management departments to explore road planning and the optimization of transportation hub construction.
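
To illustrate what trajectory restoration involves, here is a deliberately simple sketch: captures of one vehicle are ordered by time, and the gap between consecutive captures is filled with the fewest-hops path on the road graph. The production module is certainly more sophisticated; the graph format, the function names and the shortest-path assumption are all hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Fewest-hops path between two intersections (breadth-first search), or None."""
    if start == goal:
        return [start]
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt in parent:
                continue
            parent[nxt] = node
            if nxt == goal:
                # Walk back through the parents to rebuild the path.
                path = [goal]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None


def restore_trajectory(graph, captures):
    """Rebuild one vehicle's route from sparse (timestamp, intersection_id) captures."""
    ordered = [node for _, node in sorted(captures, key=lambda c: c[0])]
    if not ordered:
        return []
    trajectory = [ordered[0]]
    for nxt in ordered[1:]:
        segment = shortest_path(graph, trajectory[-1], nxt)
        if segment is None:
            trajectory.append(nxt)  # no known route: keep the raw observation
        else:
            trajectory.extend(segment[1:])  # skip the node already in the trajectory
    return trajectory


# Toy road graph: A - B - C - D in a line, with a side street B - E.
roads = {"A": ["B"], "B": ["A", "C", "E"], "C": ["B", "D"], "D": ["C"], "E": ["B"]}
route = restore_trajectory(roads, [(30, "D"), (0, "A"), (20, "C")])
```

In this toy example the vehicle was never captured at B, but B is recovered because it lies on the only route from A to C. A realistic system would also weigh travel times and resolve ambiguous or missing captures.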

In short, we hope to continue to build and optimize PaddleSpatial to achieve the vision of using Baidu AI to make cities smarter. Thank you for your attention!

Origin blog.csdn.net/weixin_45449540/article/details/123437395