[2023 Graduate Electronics Design Competition] Third Prize, Anmou Technology Enterprise Proposal: Short-term weather forecast AI cloud analysis system

This article shares a third-prize entry for the Anmou Technology enterprise proposal at the 2023 18th China Graduate Electronic Design Competition. Join the Jishu Community's prize-winning event to share your 2023 Graduate Electronics Design Competition work, expand your influence, and receive electronic gifts!

Team introduction

Participating unit: Changsha University of Science and Technology
Team name: Star Dream Team
Instructor: Wen Yongjun
Team members: Mei Shuo, Wei Huimin, Wu Jiaxin
Awards won: Third Prize

1 System feasibility analysis

1.1 Research background and significance

Climate disasters occur frequently in China, and accurate forecasts with timely warnings are crucial to society. This research applies deep learning, in particular self-attention and Transformer models, to time-series prediction of radar echo images, aiming to improve the accuracy and timeliness of extreme weather prediction.
Against this background, this study attempts the application of self-attention and Transformer in two-dimensional image sequences. We combine the convolutional self-attention mechanism with the Transformer encoder to achieve feature aggregation and enhancement of radar echo image sequences. This method not only performs well in terms of temporal features, but also takes into account the extraction of spatial information, helping to better capture the changing features of the sequence.
To sum up, this research has made important progress in the field of extreme weather prediction. Through the innovative application of deep learning technology, it provides a powerful means to improve the accuracy and timeliness of extreme weather prediction.


1.2 Difficulties and innovations in the work

This work addresses the challenges of radar echo image extrapolation in short-term rainfall nowcasting. Traditional methods struggle to capture complex echo changes, so deep learning is used, with self-attention and the Transformer at its core, to aggregate features from echo image sequences and predict their evolution. The convolutional self-attention mechanism enables the model to better understand cloud changes. The method not only enhances temporal features but also incorporates spatial information, significantly improving extreme weather prediction. The full model applies the Marshall-Palmer formula to the predicted echoes to compute a rainfall forecast. In addition, a Django-based web visualization system was designed to provide real-time rainfall forecasts and support national disaster prevention and reduction work.
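The Marshall-Palmer formula mentioned above relates radar reflectivity Z to rain rate R via Z = a·R^b. A minimal sketch of this conversion follows; the coefficients a = 200 and b = 1.6 are the classic Marshall-Palmer values, since the article does not state which coefficients the team used:

```python
def dbz_to_z(dbz: float) -> float:
    """Convert reflectivity in dBZ to linear reflectivity Z (mm^6/m^3)."""
    return 10.0 ** (dbz / 10.0)

def z_to_rain_rate(z: float, a: float = 200.0, b: float = 1.6) -> float:
    """Invert the Marshall-Palmer relation Z = a * R^b to get R in mm/h."""
    return (z / a) ** (1.0 / b)

def forecast_rain_rate(dbz: float) -> float:
    """Rain rate (mm/h) derived from a single predicted radar pixel's dBZ value."""
    return z_to_rain_rate(dbz_to_z(dbz))

# Example: a 40 dBZ echo corresponds to roughly 11.5 mm/h of rain.
```

Applying this pixel-wise to the extrapolated echo images turns the predicted reflectivity field into a rainfall map.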

1.3 Plan demonstration and design

Deep learning is widely used in the field of image recognition, and researchers try to apply it to radar echo extrapolation tasks. According to the characteristics of radar echo images, some people use models such as convolutional neural network (CNN) and recurrent neural network (RNN) for processing. Through methods based on two-dimensional and three-dimensional convolutional neural networks, features are extracted from the time and space dimensions to achieve radar echo extrapolation. In addition, to solve the image blur problem, research introduces generative algorithms, such as generative adversarial networks (GAN) and variational autoencoders (VAE). These methods can better capture the spatiotemporal dependencies of sequence data and improve prediction accuracy.
In this context, the Google team proposed the Vision Transformer (ViT), which brings self-attention and Transformer models to computer vision. Inspired by it, this system applies a convolutional self-attention mechanism (CSA) within the Transformer encoding subnetwork to process radar echo sequences. This combines temporal and spatial information to aggregate and enhance sequence features. The processed features are then passed to a Seq2Seq structure based on ConvLSTM, which completes the extrapolation task and produces the radar echo prediction.
In summary, deep learning is widely used in radar echo extrapolation. Combining convolutional self-attention and Transformer methods to perform feature extraction and modeling in time and space is expected to improve weather prediction accuracy.

2 System detailed design

2.1 System algorithm model design

The system's algorithm model extracts features from radar images over a past period, uses neural networks to learn from those features how the radar images change over time, generates predicted radar images for future periods, and then derives a concrete weather forecast from the predicted images via the Z-R relationship. The algorithm model implements three parts:
(1) processing of radar echo image features;
(2) prediction and generation of future images based on the acquired feature information;
(3) optimization of the accuracy and quality of the generated images.
The design mainly constructs a convolutional self-attention echo extrapolation model composed of a feature enhancement network and an echo image extrapolation network. The overall structure of the model is shown in the figure:

[Figure: overall structure of the convolutional self-attention echo extrapolation model]

2.1.1 Extraction of radar image features

This system uses the convolutional self-attention mechanism (CSA) to build a Transformer encoding subnetwork that processes radar echo sequences and extracts features from radar echo images. The traditional self-attention mechanism was designed for text sequences, where each token is reduced to a single embedding vector. When the input is an image, the network must also preserve spatial information; a convolutional self-attention layer takes the whole image as its processing unit, avoiding the loss of spatial features caused by naive flattening.
Most current self-attention methods in computer vision follow the non-local module design: a two-dimensional image is flattened and fed to the attention layer, which models the attention distribution between pixels within a single image. In the sequence setting, however, self-attention must also model the relationships between time steps. To achieve this, CSA fixes the minimum unit of the attention computation to a 3D image tensor with channel, width, and height dimensions.
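The article does not give the exact CSA equations, so the following is a simplified NumPy sketch of the idea: Q, K, and V are produced by 1x1 convolutions (per-pixel linear maps over channels, represented here as weight matrices), and attention is computed across time steps while the (channel, height, width) structure of each frame is preserved rather than flattened away:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def conv_self_attention(x, wq, wk, wv):
    """Temporal self-attention over a radar echo feature sequence.

    x : (T, C, H, W) sequence of feature maps.
    wq, wk, wv : (C, C) weights of 1x1 convolutions (a 1x1 conv is a
    per-pixel linear map over the channel dimension).
    Each spatial location attends over all T time steps, so temporal
    dependencies are modelled without discarding spatial structure.
    """
    t, c, h, w = x.shape
    # (T, C, H, W) -> (H*W, T, C): one temporal attention problem per pixel
    seq = x.transpose(2, 3, 0, 1).reshape(h * w, t, c)
    q, k, v = seq @ wq, seq @ wk, seq @ wv
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(c)    # (H*W, T, T)
    attn = softmax(scores, axis=-1)                   # attention over time steps
    out = attn @ v                                    # (H*W, T, C)
    return out.reshape(h, w, t, c).transpose(2, 3, 0, 1)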

[Figure: convolutional self-attention (CSA) computation]

2.1.2 Processing of radar image features

Since each sample is an image sequence over consecutive time steps, this work adopts the Seq2Seq model commonly used for sequence problems in deep learning, with multi-layer LSTMs as the encoder and decoder networks. In the encoder, the hidden variable Z at each time step is fed as network input so that long-term dependencies and short-term input effects are both learned; the cell state C_t of the last LSTM cell in each layer initializes the decoder, which predicts the hidden variable Z for future time steps.
(1) LSTM cell: an LSTM cell consists of three gates: a forget gate, an input gate, and an output gate. In the figure, C denotes the cell state of each LSTM cell and h the hidden state it passes on. The forget gate receives the previous cell's state and hidden state and decides to what extent they should be retained; the input gate combines the input sequence with the forget-gate result to obtain a candidate vector that updates the cell's internal state; the result is finally emitted through the output gate. The LSTM cell structure is shown below:
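The gate computations described above can be sketched in NumPy as a single LSTM cell step (a plain, non-convolutional cell for clarity; the actual model uses ConvLSTM, where these matrix multiplications become convolutions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h_prev, c_prev, w, b):
    """One step of an LSTM cell with forget, input, and output gates.

    x : (n_in,) input; h_prev, c_prev : (n_hid,) previous hidden/cell state;
    w : (4*n_hid, n_in + n_hid) stacked gate weights; b : (4*n_hid,) biases.
    """
    n_hid = h_prev.size
    za = w @ np.concatenate([x, h_prev]) + b
    f = sigmoid(za[:n_hid])                # forget gate: how much old state to keep
    i = sigmoid(za[n_hid:2 * n_hid])       # input gate: how much new info to accept
    g = np.tanh(za[2 * n_hid:3 * n_hid])   # candidate cell update
    o = sigmoid(za[3 * n_hid:])            # output gate
    c = f * c_prev + i * g                 # new cell state C_t
    h = o * np.tanh(c)                     # new hidden state h_t
    return h, c
```

In the Seq2Seq encoder this step is applied across all input time steps, and the final cell state `c` seeds the decoder.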

[Figure: LSTM cell structure]

2.1.3 Predictive image generation and optimization

To address the blurriness of existing predicted images, a method for optimizing predicted image quality is studied. The main components are a soft attention mechanism and the discriminator of a generative adversarial network, combined into a scheme that yields output radar echo sequences with higher clarity and richer feature information.
Generative adversarial network: a GAN is an unsupervised learning method composed of two independent neural networks, a generator G and a discriminator D. The generator G produces fake samples, and the discriminator D judges whether a given sample is real data or data generated by G. The result of each judgment is back-propagated to G and D: if D judges correctly, G's parameters are adjusted to make the generated data more realistic; if D judges incorrectly, D's parameters are adjusted to avoid similar mistakes next time. Training continues until the two reach equilibrium. The ultimate goal of the GAN is a high-quality generator and a discriminator with strong judgment ability.
The core idea of prediction optimization in this project is to treat the earlier prediction step as the GAN's generator: the predicted image is treated as fake data and fed, together with the real data of the corresponding time step, into the discriminator D. Through back-propagation, this adversarial game continuously optimizes the intermediate parameters of the prediction process, making the resulting radar echo predictions more credible.
For predicted image generation, the widely used models all exhibit some degree of blur, which reduces the accuracy of subsequent rainfall predictions. Since image quality is one of the factors the discriminator weighs, image clarity is also optimized in this process. A schematic of the GAN structure is shown below:
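The adversarial game above can be made concrete with the standard GAN objectives, sketched here from the discriminator's raw scores (a simplified illustration; the article does not specify the exact loss variant the team used, and this uses the common non-saturating generator loss):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gan_losses(d_real_logits, d_fake_logits, eps=1e-8):
    """Standard GAN objectives from discriminator logits.

    d_real_logits : D's raw scores on real radar echo frames.
    d_fake_logits : D's raw scores on frames produced by the generator G.
    Returns (loss_d, loss_g): D is rewarded for telling real from fake,
    while G is rewarded for fakes that D scores as real.
    """
    p_real = sigmoid(np.asarray(d_real_logits, dtype=float))
    p_fake = sigmoid(np.asarray(d_fake_logits, dtype=float))
    loss_d = -np.mean(np.log(p_real + eps)) - np.mean(np.log(1.0 - p_fake + eps))
    loss_g = -np.mean(np.log(p_fake + eps))   # non-saturating generator loss
    return loss_d, loss_g
```

As the predicted frames become more realistic, D's scores on them rise, so the generator loss falls and the discriminator loss rises, which is exactly the equilibrium-seeking game described above.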

[Figure: GAN network structure]

2.2 System front-end framework design

To give users a better experience and a more intuitive view of the prediction results, the system includes a front-end display interface: the samples collected during experiments and the prediction results are shown directly to users and administrators.
The design uses Django, an open-source web application framework written in Python, as the system's web framework. Django adopts the MTV pattern to keep components loosely coupled:
M, the Model: implements the program's functionality and maps business objects to the database (ORM).
T, the Template: determines how the page (HTML) is displayed to the user.
V, the View: handles business logic, calling the Model and Template when appropriate.
Besides these three layers, a URL dispatcher is needed; it routes each URL's page request to the appropriate View, which then calls the corresponding Model and Template.
MTV's response pattern looks like this:

[Figure: MTV response pattern]

The overall operation of the system:
The user initiates a request to the server through the front-end interface, and the URL maps the request to the corresponding view function:
a. If no data access is involved, the view function directly returns a template, i.e. a web page, to the user.
b. If data access is involved, the view function calls the model, which retrieves the data from the database.
After the sample's parameters are fetched from the database, the trained model predicts future radar echo images, and the results are returned step by step: the view function fills the samples and prediction results into the template's placeholders and finally returns the web page to the user. The user operation flow is charted below:
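The request cycle above can be sketched without the framework as a minimal, self-contained Python illustration of the MTV roles (the names, toy "database", and template strings here are all hypothetical; a real deployment would use Django's URLconf, views, ORM models, and template engine):

```python
# Toy "database" standing in for the Model layer's data store.
MODEL_DB = {"sample_001": {"predicted_rain_mm_h": 11.5}}

def model_lookup(sample_id):
    """Model layer: fetch the sample and its prediction (stubbed here)."""
    return MODEL_DB.get(sample_id)

def render(template, context):
    """Template layer: fill the blanks in an HTML-like template."""
    return template.format(**context)

def forecast_view(sample_id):
    """View layer: business logic; calls Model and Template as needed."""
    data = model_lookup(sample_id)
    if data is None:
        return render("<p>No data for {sid}</p>", {"sid": sample_id})
    return render("<p>{sid}: {rain} mm/h predicted</p>",
                  {"sid": sample_id, "rain": data["predicted_rain_mm_h"]})

# URL dispatcher: maps each path to a view function.
URLS = {"/forecast": forecast_view}

def handle_request(path, sample_id):
    return URLS[path](sample_id)
```

The dispatcher routes the request to the view, the view consults the model only when data is needed, and the filled template is what the user finally receives, matching cases a and b above.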

[Figure: user operation flow chart]

Summary

In sequence modeling and feature extraction, self-attention mechanisms and Transformer models have attracted wide attention in recent years. Although self-attention has achieved remarkable results on text, its application to two-dimensional image sequences remains relatively limited. This model not only applies self-attention to two-dimensional image sequences but also combines it with a Transformer encoder, achieving effective feature extraction and modeling of radar echo sequences. It brings new ideas and methods to sequence prediction tasks, enriches the application domains of self-attention, and contributes useful exploration to the development of sequence modeling.

For more competition entries, see the IC technology competition work shares.


Origin blog.csdn.net/weixin_47569031/article/details/132600347