By crawling more than 20,000 Flickr images, Monash University reconstructed the spatial and temporal patterns of Japan's cherry blossoms over the past 10 years


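This article is about 3,000 words long, with a suggested reading time of 6 minutes. The SNS analysis technique proposed in the paper can fill gaps in publicly available data.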

Content overview: Global climate change has become severe in recent years, and its knock-on effects are profoundly affecting both people and nature. Against this background, collecting flowering data across hundreds or even thousands of kilometers, in order to understand how climate change affects flowering plants, has become an important topic in ecological research. Traditional sampling surveys, however, are expensive, slow, and logistically difficult. The study, recently published in the journal Flora, not only overcomes these problems but also reveals details that earlier surveys did not capture.

Key words: spatio-temporal analysis, smart ecology, SNS data

As the national flower of Japan, the cherry blossom holds an important place in Yamato culture. Flower viewing (hanami) is a distinctive folk custom with a history of hundreds of years. However, Japan spans about 20 degrees of latitude and can be divided into 6 climate zones. Climates differ significantly between regions, so cherry blossoms bloom at different times. Every cherry blossom season, Japanese travel websites report flowering conditions across the country in detail so that tourists can plan their viewing. In recent years, under the influence of climate change, Japan's cherry blossoms have also been blooming earlier.

To explore the flowering pattern of Japanese cherry blossoms and understand the impact of climate change on phenology, a research team at Monash University in Australia used a Python API client and a computer vision API to monitor cherry blossom flowering through social networking site (SNS) data, and compared the experimental results with official records. The research has been published in the journal Flora under the title "The spatiotemporal signature of cherry blossom flowering across Japan revealed via analysis of social network site images".


The research results have been published in the journal "Flora"

Paper address:

https://www.sciencedirect.com/science/article/abs/pii/S0367253023001019

 Experimental process: crawling, filtering and analysis of data sets

Data set

The process of collecting cherry blossom opening data in this experiment can be divided into two steps:

1. Extract image data from social networking sites, including several different sequential stages

2. Use a computer vision API and manual verification to filter the data for relevance

Because the API needed to filter by time, location, and text at the same time, the researchers chose Flickr as the data source. First, they used a Python API client to collect geotagged Flickr images by searching for the keyword "cherry blossom".

Second, they set the bounding box to 31.186°N-46.178°N, 129.173°E-145.859°E to ensure that the pictures were taken in or near Japan. The time frame was set to 2008-2018 to exclude the effect of the COVID-19 decline in global tourism on the data.

The researchers then masked the data with Japan's geographic boundaries, obtained from gadm.org, leaving 80,915 images.
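The spatial and temporal constraints described above can be sketched in Python. This is only an illustration, not the authors' code: the parameter names follow Flickr's `flickr.photos.search` method as far as we know them (including the assumed `bbox` order of minimum longitude, minimum latitude, maximum longitude, maximum latitude), while `search_params`, `in_scope`, and the sample records are hypothetical.

```python
from datetime import date

# Bounding box and date range quoted in the article.
MIN_LAT, MAX_LAT = 31.186, 46.178
MIN_LON, MAX_LON = 129.173, 145.859
START, END = date(2008, 1, 1), date(2018, 12, 31)


def search_params(keyword="cherry blossom"):
    """Build query parameters in the shape of Flickr's flickr.photos.search.

    Assumption: bbox is ordered min_lon,min_lat,max_lon,max_lat.
    """
    return {
        "text": keyword,
        "bbox": f"{MIN_LON},{MIN_LAT},{MAX_LON},{MAX_LAT}",
        "min_taken_date": START.isoformat(),
        "max_taken_date": END.isoformat(),
        "has_geo": 1,  # only geotagged photos
    }


def in_scope(lat, lon, taken):
    """Return True if a photo lies inside the bounding box and date range."""
    return (MIN_LAT <= lat <= MAX_LAT
            and MIN_LON <= lon <= MAX_LON
            and START <= taken <= END)


# Hypothetical records: (latitude, longitude, date taken).
photos = [
    (35.68, 139.76, date(2015, 4, 1)),  # Tokyo, inside box and date range
    (37.57, 126.98, date(2015, 4, 1)),  # Seoul, west of the box
    (35.01, 135.77, date(2020, 4, 1)),  # Kyoto, after the date range
]
kept = [p for p in photos if in_scope(*p)]
```

A rectangular box of this size also covers sea and some territory outside Japan, which is why a boundary mask such as the gadm.org step is still needed afterwards.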

Flickr images matching the search term "cherry blossom" and located in Japan, January 1, 2008 to December 31, 2018

January and February (blue) represent the first cherry blossoms before spring;

March-May (green) marks the concentration of photos taken during the main spring blooming period;

October-December (pink) shows an interesting autumn peak, especially in November.

Although Flickr's images are restricted by the search keyword "cherry blossom", SNS content may still be incorrectly associated with the search term, so verification is required.

To do this, the researchers submitted all the images to Google Cloud Vision AI. The API generates descriptive text labels for each image based on its visual content, which allows the relevance of individual data points to be double-checked automatically.

Google Cloud Vision AI uses pre-trained machine learning models to assign labels to images in predefined categories. In addition, the researchers performed additional manual checks on the sample data, as shown in the table below:

Table 1: Tokyo-filtered dataset, image data at each stage

Column B: A Flickr search for "cherry blossom" returned 28,875 images located within the administrative area of Tokyo

Column C: Text labels returned by the computer vision API for this dataset and their relative frequencies. Of these images, 21,908 were labeled "cherry blossom" by the computer vision API; those that were also labeled "autumn" or "maple tree" were removed, leaving 21,633 images

Column D: A random sample of the resulting images for manual inspection

Column E: Number of sampled images confirmed to be cherry blossoms by manual inspection

Column F: Estimated monthly precision of the automated processing (computer vision and label analysis), calculated as E/D

Column G: Estimated total number of cherry blossom pictures taken in February, March, and April, obtained by applying this precision and calculated as C*F
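The label filter behind column C and the arithmetic behind columns F and G can be sketched as follows. This is a minimal illustration under stated assumptions: `is_cherry_blossom` and `corrected_count` are hypothetical helper names, and the label sets and counts are invented, not taken from Table 1.

```python
# Labels that disqualify an image even when "cherry blossom" is present.
EXCLUDED = {"autumn", "maple tree"}


def is_cherry_blossom(labels):
    """Keep an image labeled 'cherry blossom' unless it also carries an excluded label."""
    normalized = {label.lower() for label in labels}
    return "cherry blossom" in normalized and not (normalized & EXCLUDED)


def corrected_count(c, d, e):
    """Column G = C * F, where F = E / D is the precision measured on the sample."""
    if d <= 0:
        raise ValueError("sample size D must be positive")
    return c * (e / d)


# Hypothetical label sets, in the style of a computer vision API response.
images = {
    "a.jpg": ["Cherry blossom", "Spring", "Flower"],
    "b.jpg": ["Cherry blossom", "Autumn"],
    "c.jpg": ["Maple tree", "Leaf"],
}
kept = [name for name, labels in images.items() if is_cherry_blossom(labels)]

# Hypothetical month: 1,000 filtered images, 100 sampled, 95 confirmed by hand.
estimate = corrected_count(c=1000, d=100, e=95)
```

Scaling the filtered count by the sampled precision gives an estimate of how many images truly show cherry blossoms without inspecting every image by hand.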

Assessment method

To estimate the blooming date of the cherry blossoms, the researchers generated a daily time series for all the images in the data set and smoothed it with a 7-day triangular rolling average. The center point was assigned unit weight, the points adjacent to it on either side a weight of 0.75, and the next two points on each side weights of 0.5 and 0.25 respectively. This smooths out the fluctuations in photographic activity between weekends, when leisure time brings many more people out to view and photograph the flowers, and weekdays.

The peak of photographic activity in the resulting curve was identified as the peak of cherry blossom bloom (mankai).
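The smoothing and peak-picking step can be sketched with NumPy. The weights are the ones stated above; dividing by their sum is our assumption (it rescales the curve but does not move the peak), and `triangular_smooth`, `peak_day`, and the daily counts are hypothetical.

```python
import numpy as np

# Weights from the text: 1 at the center, then 0.75, 0.5, 0.25 on each side.
WEIGHTS = np.array([0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25])


def triangular_smooth(daily_counts):
    """Apply a 7-day triangular weighted rolling average.

    Assumption: weights are normalized to sum to 1. Days near either end
    are averaged against implicit zeros, so edge values are damped.
    """
    counts = np.asarray(daily_counts, dtype=float)
    return np.convolve(counts, WEIGHTS / WEIGHTS.sum(), mode="same")


def peak_day(daily_counts):
    """Index of the day with the highest smoothed photo count."""
    return int(np.argmax(triangular_smooth(daily_counts)))


# Hypothetical daily photo counts: an isolated spike on day 3 and a broader
# build-up that culminates around day 10.
counts = [2, 3, 4, 30, 5, 6, 12, 18, 22, 26, 28, 25, 20, 14, 8, 4, 2]
raw_peak = int(np.argmax(counts))
smoothed_peak = peak_day(counts)
```

In this made-up series the raw maximum falls on the isolated spike, while the smoothed maximum falls inside the sustained build-up, which is the behavior the weighting is meant to produce.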

 Comparison verification: the predicted results are consistent with the actual data

The earliest records of cherry blossom flowering in Japan date back to 812 AD, and official observation records have been available since 1953. To validate its method, the team selected data for two popular cherry blossom viewing cities, Tokyo and Kyoto, compared them with the full bloom dates announced each year by the Japan Meteorological Corporation (JMC) and the Japan National Tourism Organization (JNTO), and calculated the error between the peak dates obtained from the experiment and the officially announced dates.

From the experiments, the research team obtained visualized spatio-temporal data of cherry blossom flowering across Japan. From late January (wks 3-4) to late May (wks 3-4), flowering first advances gradually from the warm southern regions to the north, and then recedes gradually from south to north, as the figure shows:


Figure 2: Japanese cherry blossom shooting locations from 2008 to 2018

Each panel covers a two-week period

A-C: Cherry blossom images appear in the warmer regions of southern Japan, with a high concentration of images in the urban centers of Tokyo and Kyoto on the island of Honshu

D-F: Cherry blossom pictures increase and begin to extend into northern Honshu

G-I: Cherry blossom locations expand further north, appearing in Sapporo, Hokkaido. Photography in Tokyo and Kyoto remains active, while activity becomes more concentrated in Hokkaido and northern Honshu. Finally, the number of cherry blossom photos across the country gradually decreases, receding from south to north.

The team compared the peaks of the smoothed daily time series of cherry blossom photos in Tokyo and Kyoto with the dates published by JMC/JNTO. The results show a root mean square error of 3.21 days for the Tokyo area and 3.32 days for the Kyoto area. As shown below:


Figure 3: Date comparison of the two assessments in the Tokyo area

Left column: The full bloom dates of cherry blossoms in Tokyo over the years estimated by this experimental method

Middle column: Tokyo cherry blossom peak dates reported by JNTO over the years

Right column: error, that is, the difference in days between the two


Figure 4: Date comparison of the two assessments in the Kyoto area

Left column: The full bloom dates of cherry blossoms in Kyoto over the years estimated by this experimental method

Middle column: Kyoto cherry blossom peak dates reported by JNTO over the years

Right column: error, that is, the difference in days between the two
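The root mean square error reported for Tokyo and Kyoto summarizes the per-year differences in the right-hand columns. A minimal sketch of that calculation follows; `rmse_days` is a hypothetical helper and the dates are invented for illustration, not the values in Figures 3 and 4.

```python
from datetime import date
from math import sqrt


def rmse_days(estimated, official):
    """Root mean square error, in days, between two lists of peak dates."""
    if not estimated or len(estimated) != len(official):
        raise ValueError("both lists must be non-empty and of equal length")
    errors = [(est - off).days for est, off in zip(estimated, official)]
    return sqrt(sum(err ** 2 for err in errors) / len(errors))


# Hypothetical peak dates for three years.
estimated = [date(2015, 4, 1), date(2016, 4, 3), date(2017, 4, 6)]
official = [date(2015, 3, 29), date(2016, 4, 3), date(2017, 4, 2)]
error = rmse_days(estimated, official)
```

Because the errors are squared before averaging, a few years with large deviations raise the result more than many years with small ones.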

The team's data also revealed cherry blossoms flowering in autumn. This is not formally noted in the data released by JNTO, and it reflects the ability of SNS data to capture low-probability events and reveal abnormal phenological phenomena such as out-of-season flowering. This is extremely important for assessing the availability of floral resources such as pollen and nectar throughout the year, including at unexpected times.

 SNS Data: Providing New Insights for Ecological Research

An article released by the World Meteorological Organization in April this year showed that the global average temperature in 2022 was 1.15°C higher than the 1850-1900 average. Humans are slow to perceive climate change, but plants are extremely sensitive to it. Under global warming, not only Japanese cherry blossoms but also flowering plants in many parts of China have been affected.

According to cherry blossom observation data from Wuhan University, the flowering period of the university's cherry blossoms has advanced significantly since the 1960s. After 2000 the record was broken repeatedly, and flowering at one point moved from late March to late February.

Before the 1990s, peonies in Heze, Shandong Province mainly bloomed in late April; around 2010, flowering moved forward to mid-April, and in recent years flowers can be observed in early April.

Rapeseed is also flowering significantly earlier. In Wuyuan, Jiangxi, the first rapeseed flowers appeared on February 22 this year, and peak flowering began on March 13. Thirty years ago, rapeseed generally bloomed in mid-March.

According to a report released by Kepios, as of April 2023 the number of social media users worldwide had reached 4.8 billion, accounting for 59.9% of the global population. On average, each person spends 2 hours and 24 minutes a day on social media applications, generating massive amounts of social network data that are expected to provide new insights for ecological research.

The SNS analysis technique proposed by the authors can fill gaps in public data, help researchers understand the varying degrees to which climate change affects flowering plants, and contribute to understanding the behavior of important pollinators such as bees and other insects.

Reference article:

[1]https://www.sciencedirect.com/science/article/abs/pii/S0168192320303117

[2]https://link.springer.com/chapter/10.1007/978-4-431-66899-2_8

[3]http://sh.cma.gov.cn/sh/qxkp/qhbh/zhykp/202304/t20230425_5464832.html

[4]https://datareportal.com/social-media-users

Editor: Yu Tengkai

Proofreading: Cheng Anle


Origin blog.csdn.net/tMb8Z9Vdm66wH68VX1/article/details/131777730