Wavelets: Seeing the Forest and the Trees

1. Introduction

On November 25, 1998, Walt Disney Pictures and Pixar Animation Studios released "A Bug's Life", a feature film made entirely with computer animation. It was the second collaboration between Disney and Pixar and, like its groundbreaking predecessor "Toy Story" three years earlier, it drew rave reviews. One critic praised its beautiful visual inventions and intricate detail, which hold the attention of adults and children alike from beginning to end, and its new, playful colors in soft palettes not seen before.

Only the most graphics-savvy moviegoers would have given any thought to the mathematical modeling that made it possible to develop all the characters in this animated ant story, not to mention their many textures, their countless expressions, and the way they jump, flit, and buzz around. As it happens, one particular modeling technique made its film debut in this movie: a method of computer animation that takes advantage of a collection of mathematical procedures called wavelets.

One way to think about wavelets is to consider how our eyes see the world. In the real world, you can observe a forest at many different scales of resolution. From the window of a jet flying over the countryside, the forest is a solid canopy of green. From the window of a car on the ground, the canopy resolves into individual trees. If you stop the car and move closer, you begin to see branches and leaves, and with a magnifying glass you might find a drop of dew hanging from the tip of a leaf. As you zoom in to smaller and smaller scales, you discover details you had not seen before. Try the same thing with a photograph, however, and you will be disappointed. Enlarge the photograph to get closer to a tree and all you get is a blurrier tree; the branches, the leaves, and the dewdrop are nowhere to be found. Although our eyes can see the forest at many scales of resolution, the camera can capture only one at a time.

Computers do no better than cameras; in fact, their level of resolution is lower. On a computer screen, the photograph becomes a collection of pixels that is much coarser than the original.

Soon, however, computers everywhere will be able to do something photographers have only dreamed of. They will be able to display an interactive image of a forest in which the viewer can zoom in to see more detail of the trees, the branches, and perhaps even the leaves. They will be able to do this because wavelets make it possible to compress the data used to store an image, so that a more detailed image fits in less space.

Although wavelets have been an organized research topic for less than two decades, they grew out of a constellation of related concepts developed over nearly two centuries and repeatedly rediscovered by scientists who wanted to solve technical problems in their own disciplines. Signal processors were looking for a way to transmit clear messages over telephone lines, and oil prospectors wanted a better way to interpret seismic traces. Yet "wavelets" did not become a household word among scientists until the theory was freed from the many applications in which it arose and synthesized into a purely mathematical theory. That synthesis, in turn, opened scientists' eyes to new applications. Today, for example, wavelets are not only a workhorse of computer imaging and animation; the FBI also uses them to encode its database of 30 million fingerprints. In the future, scientists may use wavelet analysis to diagnose breast cancer, look for heart abnormalities, or forecast the weather.

2. Transforming Reality

Wavelet analysis allows researchers to isolate and manipulate specific types of patterns hidden in masses of data, in much the same way that our eyes can pick out the trees in a forest or our ears can pick out the flute in a symphony. One way to understand how wavelets do this is to start with the difference between two kinds of sound: a tuning fork and the human voice. Strike a tuning fork and you hear a pure tone that lasts for a very long time. In mathematical terms, such a tone is said to be localized in frequency: it consists of a single note with no higher-frequency overtones. A spoken word, by contrast, lasts only about a second and is therefore localized in time. It is not localized in frequency, because the word is not a single tone but a combination of many different frequencies.

In the 19th century, mathematicians perfected what might be called the "tuning fork" version of reality, a theory known as Fourier analysis. Jean Baptiste Joseph Fourier, a French mathematician, claimed in 1807 that any repeating waveform (or periodic function), like the sound wave from a tuning fork, can be expressed as an infinite sum of sine waves and cosine waves of various frequencies.

A familiar demonstration of Fourier's theory occurs in music. When a musician plays a note, he or she creates an irregularly shaped sound wave, and the same shape repeats for as long as the note is held. According to Fourier, the note can therefore be decomposed into a sum of sine and cosine waves. The lowest-frequency wave is called the fundamental frequency of the note, and the higher-frequency waves are called overtones. For example, the note A played on a violin or a flute has a fundamental frequency of 440 Hz and overtones at 880 Hz, 1320 Hz, and so on. Even when a violin and a flute play the same note, they sound different because their overtones have different strengths, or amplitudes. As music synthesizers demonstrated in the 1960s, a convincing imitation of a violin or a flute can be obtained by recombining pure sine waves with the appropriate amplitudes. That, of course, is exactly what Fourier predicted in 1807.
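The recombination of sine waves described above can be sketched in a few lines of Python. This is only an illustration: the function name `note` and the two lists of overtone strengths are made-up assumptions, not measurements of real instruments.

```python
import math

def note(t, fundamental_hz, amplitudes):
    """Value at time t (seconds) of a sum of sine waves: fundamental plus overtones.

    amplitudes[0] weights the fundamental, amplitudes[1] the first overtone, etc.
    """
    return sum(
        amp * math.sin(2 * math.pi * fundamental_hz * (k + 1) * t)
        for k, amp in enumerate(amplitudes)
    )

# Hypothetical overtone strengths, chosen only to give two different wave shapes.
VIOLIN_LIKE = [1.0, 0.5, 0.3]    # weights for 440 Hz, 880 Hz, 1320 Hz
FLUTE_LIKE = [1.0, 0.1, 0.02]

A_HZ = 440.0
t = 0.0003
violin_sample = note(t, A_HZ, VIOLIN_LIKE)
flute_sample = note(t, A_HZ, FLUTE_LIKE)

# Both waves repeat every 1/440 s, yet they differ at the same instant.
repeats = abs(note(t + 1 / A_HZ, A_HZ, VIOLIN_LIKE) - violin_sample) < 1e-9
```

Both waveforms share the same period, so they are heard as the same note; only the mix of overtones, and hence the shape of the wave, differs.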

Later, mathematicians extended Fourier's idea to non-periodic functions (or waveforms), which change over time instead of repeating the same shape forever. Most real-world waves are of this type: think of a motor that speeds up, slows down, and hiccups now and then. The distinction between repeating and non-repeating patterns matters in images too. A repeating pattern may be seen as a texture or background, while a non-repeating one is picked out by the eye as an object. Periodic, or repeating, waves are made up of a discrete series of overtones and can be used to represent repeating patterns in an image. Non-periodic features resolve into a much more complex spectrum of frequencies, called the Fourier transform, just as sunlight can be separated into a spectrum of colors. The Fourier transform portrays the structure of a wave in a much more revealing and concentrated form than a traditional graph of the wave would. For example, a rattle in a motor shows up in the Fourier transform as a peak at an unusual frequency.

Fourier transforms were a great success. During the 19th century they solved many problems in physics and engineering, and this prominence led scientists and engineers to regard them as the preferred way to analyze phenomena of every kind. That ubiquity forced a close examination of the method. As a result, over the course of the 20th century, mathematicians, physicists, and engineers came to recognize a drawback of the Fourier transform: it has trouble reproducing transient signals and signals with abrupt changes, such as a spoken word or the rap of a drum. Music synthesizers, good as they are, still cannot match a violinist in the concert hall, because the violinist's playing contains transient features, such as the contact of the bow on the string, that are poorly captured by sine waves.

The principle underlying this problem can be illustrated by the famous Heisenberg uncertainty principle. In 1927, the physicist Werner Heisenberg stated that the position and the velocity of an object cannot both be measured exactly at the same time, even in theory. In signal-processing terms, this means it is impossible to know simultaneously the exact frequency of a signal and the exact moment at which that frequency occurs. For its frequency to be known, the signal must be spread out in time, and vice versa. In musical terms, any signal with a short duration must have a complicated frequency spectrum made up of a rich variety of sine waves, whereas any signal made from a simple combination of a few sine waves must have a complicated appearance in the time domain. We therefore cannot expect to reproduce the sound of a drum with an orchestra of tuning forks.
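This trade-off can be seen numerically. The sketch below is a minimal illustration using a hand-written discrete Fourier transform (slow, but free of dependencies). The signal length, the 1% threshold, and the helper names are assumptions made for the example. It compares a pure tone with a single-sample click.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Magnitude of each discrete Fourier coefficient (direct O(n^2) sum)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)))
        for k in range(n)
    ]

def count_significant(magnitudes, fraction=0.01):
    """Number of frequency bins holding more than `fraction` of the largest magnitude."""
    peak = max(magnitudes)
    return sum(1 for m in magnitudes if m > fraction * peak)

N = 64
# "Tuning fork": a sine wave with exactly 4 cycles, spread over the whole window.
tone = [math.sin(2 * math.pi * 4 * i / N) for i in range(N)]
# "Drum tap": a click confined to a single sample.
click = [1.0 if i == 10 else 0.0 for i in range(N)]

tone_bins = count_significant(dft_magnitudes(tone))
click_bins = count_significant(dft_magnitudes(click))
```

The tone, which fills the whole time window, occupies only 2 of the 64 frequency bins (a frequency and its mirror image), while the click, which occupies a single instant, spreads evenly over all 64.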

3. An Idea with No Name

Throughout the 20th century, scientists in different fields struggled to get around these limitations so that the representation of data could adapt to the nature of the information. In essence, they wanted to capture both the low-resolution forest (the repeating background signal) and the high-resolution trees (the individual, localized variations in the background). Although the scientists were each trying to solve problems particular to their own fields, they began to arrive at the same conclusion: that the Fourier transform itself was to blame. They also arrived at essentially the same solution: by splitting a signal into components that are not pure sine waves, it might be possible to condense the information in both the time domain and the frequency domain. This is the idea that would eventually be known as wavelets.

The first entrant in the wavelet race was a Hungarian mathematician named Alfred Haar, who in 1909 introduced the functions now called Haar wavelets. These functions consist simply of a short positive pulse followed by a short negative pulse. Although the short pulses of Haar wavelets are excellent for teaching wavelet theory, they are of limited use in most applications because they produce jagged lines rather than smooth curves. For example, an image reconstructed with Haar wavelets looks like a cheap calculator display, and a Haar wavelet reconstruction of the sound of a flute is too harsh.

Over the following decades, other precursors of wavelet theory appeared from time to time. In the 1930s, the English mathematicians John Littlewood and R.E.A.C. Paley developed a method of grouping frequencies by octaves, creating a signal that is well localized in frequency (its spectrum lies within one octave) and also relatively well localized in time. In 1946, Dennis Gabor, a British-Hungarian physicist, introduced the Gabor transform, analogous to the Fourier transform, which separates a wave into "time-frequency packets" or "coherent states" with the greatest possible simultaneous localization in both time and frequency. In the 1970s and 1980s, the signal-processing and image-processing communities introduced their own versions of wavelet analysis under names such as "subband coding", "quadrature mirror filters", and the "pyramid algorithm".

Although not exactly the same, all these techniques had similar features. They decomposed or transformed signals into pieces that could be localized to any time interval and that could also be stretched or contracted to analyze the signal at different scales of resolution. These precursors of wavelets had one more thing in common: hardly anyone outside their own specialized communities knew about them. In 1984, however, wavelet theory finally came into its own.

4. The Great Synthesis

Jean Morlet did not set out to start a scientific revolution. He was merely trying to give geologists a better way to search for oil.

Petroleum geologists usually locate underground oil deposits by making loud noises. Because sound waves travel through different materials at different speeds, geologists can infer what kind of material lies beneath the surface by sending seismic waves into the ground and measuring how quickly they bounce back. If the waves pass through a layer especially quickly, it may be a salt dome, which can trap a layer of oil underneath.

Working out how the geology translates into a sound wave (or vice versa) is a tricky mathematical problem, one that engineers traditionally solve with Fourier analysis. Unfortunately, seismic signals contain many transients: abrupt changes in the wave as it passes from one rock layer to another. These transients contain exactly the information the geologists are looking for, namely the location of the rock layers, but Fourier analysis spreads that spatial information out over all locations.

Morlet, an engineer at Elf-Aquitaine, developed his own way of analyzing seismic signals, which created components that were localized in space. He called them "wavelets of constant shape"; later they became known as Morlet wavelets. Whether these components are stretched, compressed, or shifted in time, they keep the same shape. Other families of wavelets can be built by starting from a different shape, called a mother wavelet, and stretching, compressing, and shifting it in time. Researchers would find that the exact shape of the mother wavelet strongly affects the accuracy and compression properties of the approximation. Many of the differences between earlier versions of wavelets simply came down to different choices of mother wavelet.
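The idea of one shape that is stretched, compressed, and shifted can be written down directly. The sketch below assumes a commonly used real-valued Morlet-style mother wavelet, a cosine under a Gaussian envelope; the constant 5.0, the 1/sqrt(scale) normalization, and the function names are illustrative choices and are not taken from Morlet's own work.

```python
import math

def mother(t):
    """Real Morlet-style mother wavelet: a cosine wave under a Gaussian envelope."""
    return math.cos(5.0 * t) * math.exp(-t * t / 2.0)

def wavelet(t, scale, shift):
    """Copy of the mother wavelet stretched by `scale` and moved to `shift`.

    Dividing by sqrt(scale) keeps the energy of every copy the same.
    """
    return mother((t - shift) / scale) / math.sqrt(scale)

# A wide, low-frequency copy centred at t = 3 and a narrow, high-frequency one.
broad_centre = wavelet(3.0, scale=2.0, shift=3.0)
narrow_centre = wavelet(3.0, scale=0.5, shift=3.0)
```

Every member of the family is the same curve viewed at a different zoom level and position, which is what "constant shape" means here.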

Morlet's method was not in the textbooks, but it seemed to work. On his personal computer, he could separate a wave into its wavelet components and then reassemble them into the original wave. But he was not satisfied with this empirical proof, and he began asking other scientists whether the method was mathematically sound.

Morlet found the answer he wanted from Alex Grossmann, a physicist at the Centre de Physique Théorique in Marseilles. Grossmann worked with Morlet for a year to confirm that waves can be reconstructed from their wavelet decompositions. In fact, wavelet transforms turned out to work better than Fourier transforms, because they are much less sensitive to small errors in the computation. An error in the Fourier coefficients, or an unwise truncation of them, can turn a smooth signal into a jumpy one or vice versa; wavelets avoid such disastrous consequences.

Morlet and Grossmann's paper, the first to use the word "wavelet", was published in 1984. Yves Meyer of the École Normale Supérieure de Cachan, widely regarded as one of the founders of wavelet theory, heard about their work in the fall of that year. He was the first to recognize the connection between Morlet's wavelets and earlier mathematical wavelets, such as those in the work of Littlewood and Paley. (Indeed, Meyer has counted 16 separate rediscoveries of the wavelet concept before Morlet and Grossmann's paper.)

Meyer went on to discover a new kind of wavelet with a mathematical property called orthogonality, which makes the wavelet transform as easy to work with as the Fourier transform. (Orthogonality means that the information captured by one wavelet is completely independent of the information captured by another.) Perhaps most importantly, he became the nexus of the emerging wavelet community.
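What orthogonality buys in practice can be shown with a toy example. The sketch below does not use Meyer's wavelets, which have no short formula; it substitutes the much simpler discrete Haar basis for four samples, and the sample values are made up. Because every pair of basis vectors has a dot product of zero, each coefficient can be computed on its own, and the signal is recovered by adding the pieces back together.

```python
def dot(u, v):
    """Ordinary dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

# Discrete Haar basis for 4 samples: one average plus three wavelets.
BASIS = [
    [1, 1, 1, 1],     # overall average
    [1, 1, -1, -1],   # coarse wavelet spanning all four samples
    [1, -1, 0, 0],    # fine wavelet on the left half
    [0, 0, 1, -1],    # fine wavelet on the right half
]

signal = [9, 7, 3, 5]  # hypothetical sample values

# Every pair of distinct basis vectors is orthogonal.
pairwise = [dot(BASIS[i], BASIS[j]) for i in range(4) for j in range(i + 1, 4)]

# So each coefficient depends only on its own basis vector.
coeffs = [dot(signal, b) / dot(b, b) for b in BASIS]

# Summing coefficient * basis vector rebuilds the signal exactly.
rebuilt = [sum(c * b[i] for c, b in zip(coeffs, BASIS)) for i in range(4)]
```

With a non-orthogonal family, finding the coefficients would instead require solving a system of coupled equations.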

In 1986, Stéphane Mallat, a former student of Meyer's who was then pursuing a PhD in computer science, linked the theory of wavelets to the existing literature on subband coding and quadrature mirror filters, which are the image-processing community's versions of wavelets. The idea of multiresolution analysis, that is, looking at signals at different scales of resolution, was already familiar to experts in image processing. Working with Meyer, Mallat showed that wavelets are implicit in the process of multiresolution analysis.

Thanks to Mallat's work, wavelets became much easier to use. One could now do a wavelet analysis without knowing the formula for a mother wavelet. The process was reduced to simple operations: averaging groups of pixels together and taking their differences, over and over. The language of wavelets also became more comfortable for electrical engineers, who were at home with terms such as "filter", "high-pass", and "low-pass".

The last great step in the wavelet revolution came in 1987, when Ingrid Daubechies, while visiting the Courant Institute at New York University and later during her appointment at AT&T Bell Laboratories, discovered a whole new class of wavelets. They were not only orthogonal (like Meyer's) but could also be implemented using simple digital filtering ideas, in fact using short digital filters. The new wavelets were almost as simple to program and use as Haar wavelets, but they were smooth, without the jumps of Haar wavelets. Signal processors now had a dream tool: a way to break digital data up into contributions at various scales. Combining the ideas of Daubechies and Mallat gave a simple orthogonal transform that could be computed rapidly on modern digital computers.

Daubechies wavelets have surprising properties, such as an intimate connection with the theory of fractals. If you look at their graph under magnification, you can see characteristic jagged wiggles no matter how strong the magnification. This exquisite complexity of detail means that there is no simple formula for these wavelets. They are ungainly and asymmetric, and 19th-century mathematicians would have recoiled from them in horror. But like the Model T Ford, they are beautiful because they work. Daubechies wavelets turned the theory into a practical tool that can be easily programmed and used by any scientist with a minimum of mathematical training.
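The "short digital filter" can be made concrete. The sketch below uses the well-known four-tap Daubechies filter (often called D4) and applies one level of the transform with periodic wrap-around at the ends. The wrap-around, the function name `d4_step`, and the test signal are assumptions made for illustration; practical implementations handle signal boundaries in various ways.

```python
import math

S3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)

# Four-tap Daubechies low-pass (smoothing) filter.
LOW = [(1 + S3) / NORM, (3 + S3) / NORM, (3 - S3) / NORM, (1 - S3) / NORM]
# Matching high-pass (detail) filter: reverse the taps and alternate the signs.
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]

def d4_step(signal):
    """One level of the D4 transform; len(signal) must be even.

    Slides a 4-sample window along the signal in steps of 2, wrapping
    around at the end, and returns (smooth part, detail part).
    """
    n = len(signal)
    smooth, detail = [], []
    for i in range(0, n, 2):
        window = [signal[(i + k) % n] for k in range(4)]
        smooth.append(sum(h * x for h, x in zip(LOW, window)))
        detail.append(sum(g * x for g, x in zip(HIGH, window)))
    return smooth, detail

flat = [5.0] * 8                     # a perfectly uniform stretch of signal
smooth, detail = d4_step(flat)       # every detail coefficient is (nearly) zero
energy_in = sum(x * x for x in flat)
energy_out = sum(x * x for x in smooth + detail)
```

The four numbers in `LOW` are the entire wavelet as far as a program is concerned. Because the transform is orthogonal, the total energy of the output equals that of the input.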

5. How Do Wavelets Work?

So far, the "killer app" for wavelets has been digital image compression. Wavelets are at the core of the new JPEG 2000 digital image standard and of the WSQ (wavelet scalar quantization) method that the FBI uses to compress its fingerprint database. In this context, wavelets can be thought of as the building blocks of images. An image of a forest can be built from the broadest wavelets: a big swath of green for the forest, a splash of blue for the sky. Sharper, more detailed wavelets help to distinguish one tree from another, and even finer wavelets add trunks and branches to the image. Like a single brushstroke in a painting, each wavelet is not itself an image, but many wavelets together can recreate anything. Unlike a brushstroke, a wavelet can be made arbitrarily small; it has no physical size limit because it is simply a series of 0s and 1s stored in a computer's memory.

Contrary to popular belief, wavelets do not compress an image by themselves. Their job is to make compression possible. To understand why, suppose an image is encoded as a series of spatially distributed numbers, such as 1, 3, 7, 9, 8, 8, 6, 2. If each number represents the darkness of a pixel, with 0 representing white and 15 representing black, then this string represents some kind of gray object (the 7, 8, and 9 values) against a light background (the 1, 2, and 3 values).

The simplest form of multiresolution analysis filters the image by averaging each pair of adjacent pixels. In the example above, this gives the string 2, 8, 8, 4: a lower-resolution image that still shows a gray object against a light background. To reconstruct a degraded version of the original image from it, we repeat each number: 2, 2, 8, 8, 8, 8, 4, 4.

Suppose, however, that we want to recover the original image perfectly. To do so, we must save some additional information in the first step, namely a set of numbers that can be added to or subtracted from the low-resolution signal to obtain the high-resolution signal. In this example, those numbers are -1, -1, 0, 2. (For example, adding -1 to the first pixel of the degraded image, 2, gives 1, the first pixel of the original image; subtracting -1 from the second pixel of the degraded image, also 2, gives 3, the second pixel of the original image.)

Thus the first level of multiresolution analysis splits the original signal into a low-resolution part (2, 8, 8, 4) and a high-frequency, or detail, part (-1, -1, 0, 2). The detail values are also called Haar wavelet coefficients. In fact, the whole procedure is the multiresolution version of the Haar wavelet transform introduced in 1909.
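The averaging and differencing just described, together with its exact reversal, can be sketched as follows. The function names are illustrative, and the code uses the same unnormalized averages as the text rather than the scaled version found in most software libraries.

```python
def haar_step(signal):
    """Split a signal of even length into pairwise averages and details.

    For each adjacent pair (a, b): average = (a + b) / 2, detail = (a - b) / 2.
    """
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details

def haar_inverse_step(averages, details):
    """Rebuild the original: first pixel = avg + detail, second = avg - detail."""
    signal = []
    for avg, det in zip(averages, details):
        signal.extend([avg + det, avg - det])
    return signal

pixels = [1, 3, 7, 9, 8, 8, 6, 2]
averages, details = haar_step(pixels)
# Degraded image: each average repeated twice, with the details thrown away.
degraded = [avg for avg in averages for _ in range(2)]
restored = haar_inverse_step(averages, details)
```

Here `averages` is 2, 8, 8, 4, `details` is -1, -1, 0, 2, and `restored` matches the original eight pixels exactly, so nothing is lost at this stage.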

At first sight, this first level of the wavelet transform seems to have gained nothing: there were 8 numbers in the original signal, and there are still 8 numbers in the transform. But in a typical digital image, most pixels are very much like their neighbors: sky pixels occur next to sky pixels, forest pixels next to forest pixels. This means that the average of two neighboring pixels will be almost the same as the original pixels, so most of the detail coefficients will be 0 or very close to 0. If we simply round those coefficients off to 0, the only information we need to keep is the low-resolution image plus the few detail coefficients that did not round off to 0. The amount of data needed to store the image has thus been reduced by almost half. The process of rounding high-precision numbers to lower-precision numbers with fewer digits is called quantization (the "Q" in WSQ).
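A minimal sketch of this rounding step is shown below. The averages, the detail coefficients, and the threshold of 0.5 are hypothetical values picked for illustration; real codecs such as WSQ use more elaborate quantization tables, and they must also record where the surviving coefficients sit.

```python
def quantize(details, threshold):
    """Round every detail coefficient smaller than `threshold` down to zero."""
    return [d if abs(d) > threshold else 0.0 for d in details]

# Hypothetical first-level transform of a smooth image row with a single edge.
averages = [10.0, 10.5, 30.0, 30.0]
details = [0.1, -0.2, 9.0, 0.0]

kept = quantize(details, threshold=0.5)
nonzero = [d for d in kept if d != 0.0]

# Numbers that still need storing: the averages plus the surviving details.
stored = len(averages) + len(nonzero)
original = len(averages) + len(details)
```

In this toy row, 5 numbers remain out of 8. On a real image, where almost all details are near zero, the count approaches the half predicted in the text.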

The process of transforming and quantizing can be repeated as many times as desired, each time reducing the number of bits of information by a factor of nearly 2 and slightly degrading the quality of the image. Depending on the user's needs, the process can be stopped before the low resolution becomes apparent, or it can be continued to obtain a very low-resolution thumbnail image together with layers of increasingly sharp detail. With the JPEG 2000 standard, compression ratios of 200:1 can be achieved with no perceptible loss of image quality. Such wavelet decompositions are obtained by averaging more than two neighboring pixels at a time. The simplest Daubechies wavelet transform, for example, combines groups of 4 pixels, and smoother ones combine 6, 8, or more.
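Repeating the transform is a short loop. The sketch below applies the simple Haar averaging step over and over to the eight-pixel example from earlier in this section, leaving out quantization so that the numbers stay exact. The function names are illustrative, and the signal length is assumed to be a power of two.

```python
def haar_step(signal):
    """Pairwise averages and details of an even-length signal."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    return [(a + b) / 2 for a, b in pairs], [(a - b) / 2 for a, b in pairs]

def haar_decompose(signal):
    """Apply haar_step repeatedly until a single average remains.

    Returns (overall average, list of detail layers from finest to coarsest).
    len(signal) must be a power of two.
    """
    layers = []
    current = list(signal)
    while len(current) > 1:
        current, details = haar_step(current)
        layers.append(details)
    return current[0], layers

overall, layers = haar_decompose([1, 3, 7, 9, 8, 8, 6, 2])
total_numbers = 1 + sum(len(layer) for layer in layers)
```

The result is one overall average (5.5) plus three layers of detail holding 4, 2, and 1 coefficients: still 8 numbers in total, but arranged from coarse to fine, so that the fine layers can be quantized or discarded first.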

One fascinating property of wavelets is that they automatically pick out the same features that our eyes do. The wavelet coefficients left after quantization correspond to pixels that are very different from their neighbors, that is, pixels at the edges of the objects in an image. Wavelets therefore recreate an image mostly by drawing its edges, which is exactly what people do when they sketch a picture. Indeed, some researchers have suggested that the similarity between wavelet transforms and human vision is no accident, and that our neurons filter visual signals in much the same way that wavelets do.

6. The Future of Wavelets

With the foundations of wavelet theory securely in place, the field has grown rapidly over the past decade. A mailing list on wavelets that began with 40 names in 1990 has grown into an online newsletter with more than 17,000 subscribers. Moreover, the field continues to evolve through a healthy mix of theory and practice. Engineers are constantly trying new applications, and for mathematicians there are still important theoretical questions to be answered.

Although wavelets are best known for image compression, many researchers are interested in using them for pattern recognition. In weather forecasting, for example, they could slim down the data-heavy computer models now in use. Traditionally, such models sample a quantity such as the barometric pressure at an enormous number of grid points and use this information to predict how the data will evolve. However, this approach uses a great deal of computer memory. A model of the atmosphere on a 1000×1000×1000 grid requires a billion data points, and it is still a fairly crude model.

However, most of the data in the grid is redundant. The barometric pressure in your town is probably about the same as the barometric pressure a mile away. If weather models used wavelets, they could look at the data the way weather forecasters do, concentrating on the places where abrupt changes occur. Other problems in fluid dynamics have been tackled in the same way. At Los Alamos National Laboratory, for example, wavelets are used to study the shock waves produced by a bomb explosion.

As the recent run of computer-animated films shows, wavelets also have a bright future in the movies. Because the wavelet transform is a reversible process, it is just as easy to synthesize an image (build it up from wavelets) as it is to analyze it (break it down into wavelets). This idea is related to a new computer animation method called subdivision surfaces, which is essentially multiresolution analysis run in reverse. To draw a cartoon character, the animator only has to specify where a few key points go, creating a low-resolution version of the character. The computer then runs the multiresolution analysis in reverse, making the character look like a real person rather than a stick figure.
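The flavor of subdivision can be conveyed with a much simpler relative for curves. The sketch below uses Chaikin's corner-cutting scheme, a classic subdivision rule for polygons; it is a stand-in chosen for brevity and is not the surface method used in the films. The square of control points is made up for illustration.

```python
def chaikin(points, rounds=1):
    """Refine a closed polygon of (x, y) points by Chaikin corner cutting.

    Each round replaces every edge with two new points, located 1/4 and 3/4
    of the way along it, which doubles the point count and rounds the corners.
    """
    for _ in range(rounds):
        refined = []
        n = len(points)
        for i in range(n):
            (x0, y0), (x1, y1) = points[i], points[(i + 1) % n]
            refined.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            refined.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        points = refined
    return points

# A coarse "character outline": just four key points placed by hand.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
once = chaikin(square, rounds=1)      # 8 points, corners clipped
smooth = chaikin(square, rounds=3)    # 32 points, visibly rounded
```

Each round goes from a coarse description to a finer one, the opposite direction from the averaging step of multiresolution analysis, and a handful of key points is enough to define a smooth shape.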

Subdivision surfaces made their debut in the 1998 movie "A Bug's Life", replacing a clumsier method called NURBs (non-uniform rational B-splines) that had been used in the first "Toy Story" movie in 1995. Interestingly, the two methods coexisted in "Toy Story 2" in 1999: the characters carried over from the first "Toy Story" remained NURBs, while the new characters were based on subdivision surfaces. The next frontier for subdivision surfaces may be video games, where they could eliminate the blocky look of today's graphics.

Meanwhile, on the theoretical side, mathematicians are still looking for better kinds of wavelets for two-dimensional and three-dimensional images. Although standard wavelet methods are good at picking out edges, they do so one pixel at a time, which is an inefficient way to represent something that may be a very simple curve or straight line. David Donoho and Emmanuel Candès of Stanford University have proposed a new class of wavelets called "ridgelets", designed specifically to detect discontinuities along a line. Other researchers are studying "multiwavelets", which can be used to encode several signals traveling over the same line, such as the three color values of a color image that have to be transmitted at once.

When asked to justify the value of mathematics, mathematicians often point out that ideas developed to solve purely mathematical problems can lead to unexpected applications years later. But the story of wavelets paints a more complicated and more interesting picture. In this case, specific applied research led to a new theoretical synthesis, which in turn opened scientists' eyes to new applications. Perhaps the broader lesson of wavelets is that we should not view basic science and applied science as separate endeavors. Good science requires us to see both the theoretical forest and the practical trees.

Origin blog.csdn.net/itnerd/article/details/109103165