Deep-learning RFI detection network: Deep residual detection of Radio Frequency Interference for FAST

Deep residual detection of Radio Frequency Interference for FAST [MNRAS 2020]


MNRAS: Monthly Notices of the Royal Astronomical Society


Contents

Abstract

1  Introduction

2  RFI Detection Methods

2.1 Classical methods: SumThreshold as an example

2.2 Detecting RFI with AI

2.3  Analysis of previous methods

3  RFI-Net for RFI Detection at FAST

3.1 Architecture of RFI-Net

3.2 Residuals Unit

4  Basic Framework of Data and Experiments

4.1 Data used in the experiments

4.2 Experimental framework


Abstract

Radio frequency interference (RFI) detection and excision is one of the key steps in the data processing pipeline of the Five-hundred-meter Aperture Spherical radio Telescope (FAST). The FAST telescope, due to its high sensitivity and large data rate, requires more accurate and efficient RFI flagging methods than its counterparts. In the last decades, approaches based upon artificial intelligence (AI), such as codes using Convolutional Neural Network (CNN), have been proposed to identify RFI more reliably and efficiently. However, RFI flagging of FAST data with such methods has often proved to be erroneous, with further manual inspections required. In addition, network construction as well as training dataset preparation for effective RFI flagging has imposed significant additional workloads. Therefore, rapid deployment and adjustment of AI approaches for different observations is impractical to implement with existing algorithms.

To overcome such problems, we propose a model named RFI-Net. With the input of raw data without any processing, RFI-Net can detect RFI automatically, producing corresponding masks without any alteration of the original data.

Experiments with RFI-Net using simulated astronomical data show that our model outperforms existing methods in terms of both precision and recall. Besides, compared with other models, our method can obtain the same relative accuracy with less training data, thus saving the effort and time required to prepare the training set. Further, the training process of RFI-Net can be accelerated, with overfitting being minimised, compared with other CNN codes. The performance of RFI-Net has also been evaluated with observing data obtained by FAST and the Bleien Observatory. Our results demonstrate the ability of RFI-Net to accurately identify RFI with fine-grained, high-precision masks that require no further modification.


1  Introduction

Any undesired signal received by radio telescopes can be referred to as radio frequency interference (RFI, see Mosiane et al. (2017)). The spectral coverage of radio astronomical observations often overlaps with radio emissions originating from modern civilisation. These sources of RFI negatively impact on the data analysis. They can be roughly classified into sporadic emissions, which may erupt occasionally with pulse-like structures, as well as persisting ones (e.g., see Offringa et al. (2010a) and An et al. (2017)). Construction equipment with electric motors, digital cameras, and other similar electronic devices all generate RFI of the former type, usually contaminating multiple channels in the frequency domain (wide band). TV, mobile phone infrastructure as well as radio stations on the other hand usually operate in designated bands. Their emissions, along with harmonics of their local oscillator frequencies, can give rise to the latter persisting narrow-band RFI. All RFI can either originate from devices directly related to radio telescopes themselves (including on-site electronics, network systems, data processing computers, etc.), or come from mobile or fixed sources outside the observatory. Besides, natural radio emitters like the ground, lightning, and the sun can also give off extra RFI (Indermuehle et al. 2016).

Summary: the harms, types, and sources of RFI.


As can be seen in Fig. 1, typical RFI can be much brighter than background noise or astronomical signals. Therefore, radio observations with a wide band coverage are easily affected by interference caused by human activities. Some weaker RFI may not be easily distinguished from celestial sources.

Also, as occasionally seen in observed data of FAST, RFI may also occur in the form of fluctuations in baselines, appearing randomly at any time or frequency channel, as shown in Fig. 2. It should be noted that with higher amplitudes (although much lower than typical RFI), such fluctuations should not be mistaken as standing waves (Briggs et al. 1997) with quasi-stable periodic structures in the frequency domain (Popping & Braun 2008). Generally speaking, random fluctuations observed by FAST usually exhibit widths of the same order of magnitude as extragalactic HI lines, although they may show broad single-peaked structures, in contrast with HI line’s double-horned ones. Therefore, although such random fluctuations may not be simply classified as RFI, we still need to mitigate the effects of such weak signals in radio astronomical data.

Summary: how RFI appears in the spectrogram.


One of the most effective approaches to minimising the impact of RFI is to establish a Radio Quiet Zone (RQZ) around a telescope, within which the operation of RFI-emitting devices is regulated (An et al. 2017). For example, the Australian Communications and Media Authority (ACMA) has arranged a radio quiet zone in Western Australia (Wilson et al. 2016) for the safe operation of radio astronomical instruments, including the low-frequency facility of the planned Square Kilometre Array (SKA). Similarly, with a government order named 'Regulations for Protection of Electromagnetic Wave Quiet Zone', Guizhou Province in China has also established an RQZ for the FAST telescope (see Haiyan Zhang et al. (2013), Guizhou Provincial People's Government (2019), and Fig. 3), protecting its observing band of 70−3000 MHz (Nan et al. 2011).

Summary: the Radio Quiet Zone, a way of suppressing RFI by regulating the environment rather than relying on algorithms or hardware.


Nevertheless, since emissions from sources like artificial satellites cannot be minimised by ground-based RQZs, and strong signals originating outside such areas can also be detected by radio telescopes, RFI detection still poses a challenge to radio observations, and the ability to detect RFI is an important issue for radio astronomical data reduction. The correctness and completeness of RFI flagging operations greatly affect the scientific output of each telescope. Moreover, the increase of human activities has boosted the complexity and occurrence of RFI, and complicated the task of RFI detection. Therefore, the aim of our study is to develop a suitable model to process the data acquired with FAST. Compared with other telescopes, such as Arecibo or Effelsberg, FAST has a superior sensitivity, which renders it extremely vulnerable to RFI. As a result, FAST needs a more accurate RFI detection algorithm. Considering this telescope's various scientific goals, such as neutral hydrogen sky surveys and pulsar observations, the requirements of rapid adjustment and deployment should also be considered.

Significance of the work: an RQZ cannot eliminate all RFI, so more comprehensive and more accurate RFI detection methods are essential.


Traditional RFI detection methods are mainly based on threshold algorithms (Offringa et al. 2010b; Baan et al. 2004), as well as on the physical characteristics of the RFI in the time-frequency domain (e.g. linear detection, see Wolfaardt (2016)). Related image-processing techniques are subsequently applied to improve the appearance of RFI detection edges. However, in actual applications, manual interventions are often required to refine or confirm the locations and ranges of interference, as well as to specify related parameters of the algorithm, thus greatly reducing the processing efficiency.


Currently, artificial intelligence (AI) technology represented by deep learning techniques has been used to detect RFI. Burd et al. (2018) have applied a Recurrent Neural Network (RNN) to flag RFI, while Akeret et al. (2017b) have performed similar tasks with a Convolutional Neural Network (CNN). Czech et al. (2018) combined RNN and CNN to classify transient RFI sources. The application of AI technology has greatly reduced the workloads of astronomers, thus increasing the efficiency of data reduction. Yet, compared with traditional methods, the detection accuracies of existing deep learning algorithms still show very little advantage. Such methods usually have relatively low robustness, and are thus unable to identify complicated RFI effectively. For example, it can be clearly seen that the data shown in Fig. 4 require further processing. On the other hand, improving the accuracy blindly often leads to problems of overfitting. In addition, AI approaches can also be time consuming, and considerable effort is required to prepare the large amount of data needed to train the neural network. Thus, efficient and fast adaptation would be impractical for such algorithms.

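Summary: three problems with current deep-learning-based RFI detection: 1) low robustness, so complicated RFI is not identified effectively; 2) overfitting when accuracy is pushed blindly; 3) the considerable effort needed to prepare large amounts of training data.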


Inspired by these pioneering works, in this paper, we propose a new model which can improve the detection robustness without introducing artificial artefacts. Two types of residual learning (He et al. 2016b; Arsalan et al. 2019) units have been adopted for down- and up-sampling processes in our CNN-based model, thus improving the accuracy without the requirements for large amounts of data. In addition, our model obviates the need for pre-treatments or further polishing, thereby increasing the efficiency and reliability of the process.


Thus, the work presented in this paper can be summarized as:

• A network architecture named RFI-Net providing higher accuracy and minimal false-positives for RFI detection.

• Two types of residual units are utilised which can lead to equally or more accurate detections with less training data.

• A standalone method without the need for additional operations to achieve high efficiency and reliability is proposed.


2  RFI Detection Methods

Generally speaking, RFI detection algorithms search for possible interference with specific signatures in observed data, and produce RFI masks marking positions of detected interference. Currently, traditional ways for RFI flagging include linear algorithms (i.e. SVD (Offringa et al. 2010b) and PCA (Wolfaardt 2016)), as well as threshold-based methods (SumThreshold (Offringa et al. 2010b) and CUMSUM (Baan et al. 2004)). Also, with the development of artificial intelligence in image recognition (Gómez-Ríos et al. 2019) and natural language processing (Evans et al. 2019), AI-related algorithms have been invoked by various branches of astronomy, from classifications of variable stars with enhanced performance in light-curve classification benchmarks (Aguirre et al. 2018), to pulsar candidate identifications (Zhu et al. 2014). Also, in the radio band, efforts have been made to apply CNN (Akeret et al. 2017b) and RNN (Czech et al. 2018) for RFI detection.


2.1 Classical methods: SumThreshold as an example

Since the signal strength of RFI is usually much stronger than that of typical astronomical signals, classical algorithms are based on physical characteristics of RFI. One of the notable approaches is SumThreshold, which is one of the most widely used algorithms (Akeret et al. 2017b). Introduced by Offringa et al. (2010a), the SumThreshold method has been proved to yield the highest accuracy among classical detection algorithms. SumThreshold can also be applied in combination with other algorithms, such as curvature fitting, to achieve better results (Offringa et al. 2010a).


However, the original SumThreshold method could mistake many 'good' samples as RFI if no additional rules are applied. Taking the dataset [0, 0, 5, 6, 0, 0] from Offringa et al. (2010b) as an example, the data points with values of 5 and 6 contain strong interference. Since SumThreshold adopts a series of thresholds for the average values of different-sized pixel combinations to identify RFI, we adopt a decreasing sequence $\chi_1 = 7, \chi_2 = 5, \chi_3 = 4, \cdots, \chi_6 = 1.8$ as thresholds of averaged values for 1, 2, 3, ..., 6 pixels, respectively, as with Offringa et al. (2010b). Because the sum of all six samples, 11, is larger than $6\chi_6 = 10.8$ (that is, their average exceeds $\chi_6$), all six samples in the example dataset would be marked as RFI. In order to avoid such mislabelling, it is common practice to inspect pixel combinations with increasing sizes in the dataset. If a threshold $\chi_n$ for n-pixel combinations classifies a certain area as RFI, the readings of the marked pixels should be replaced by $\chi_n$ when larger combinations are inspected. In our example the samples 5 and 6 are already marked by $\chi_2$ (their average, 5.5, exceeds 5), so each is counted as $\chi_6$ in the six-pixel combination, leading the final average to be $\frac{2\chi_6}{6} = 0.6 < \chi_6$ (Offringa et al. 2010b). However, the exact values of each threshold also need to be finely tuned, thus increasing the need for manual intervention with this method. It is also difficult to flag weak interference or unwanted baseline fluctuations with thresholding algorithms, since they may show flux levels similar to astronomical signals.
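The worked example above can be reproduced with a minimal one-dimensional sketch. It is an illustration under stated assumptions, not the reference implementation of Offringa et al.: the function name `sum_threshold_1d` and the `replace_flagged` switch are hypothetical, and only the thresholds quoted in the text (for 1, 2, 3, and 6 pixels) are used, because the intermediate values are not given.

```python
def sum_threshold_1d(samples, thresholds, replace_flagged=True):
    """Flag every n-sample window whose sum exceeds n * chi_n.

    thresholds maps a window size n to its threshold chi_n.
    With replace_flagged=True, samples already flagged at a smaller
    window size are counted as chi_n when larger windows are summed.
    """
    flags = [False] * len(samples)
    for n in sorted(thresholds):  # inspect windows of increasing size
        chi = thresholds[n]
        if replace_flagged:
            work = [chi if f else s for s, f in zip(samples, flags)]
        else:
            work = list(samples)
        new_flags = list(flags)
        for start in range(len(samples) - n + 1):
            if sum(work[start:start + n]) > n * chi:
                for i in range(start, start + n):
                    new_flags[i] = True
        flags = new_flags
    return flags


data = [0, 0, 5, 6, 0, 0]
chi = {1: 7, 2: 5, 3: 4, 6: 1.8}  # only the values quoted in the text

naive = sum_threshold_1d(data, chi, replace_flagged=False)
guarded = sum_threshold_1d(data, chi, replace_flagged=True)
print(naive)    # every sample flagged: the six-sample sum 11 exceeds 6 * 1.8
print(guarded)  # only the samples with values 5 and 6 stay flagged
```

Without the replacement rule the six-sample window drags the four clean samples into the mask; with it, the mask stays confined to the two contaminated samples.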


2.2 Detecting RFI with AI

RFI flagging using CNN-based models has been studied in recent years (e.g., see Akeret et al. (2017b)). In terms of accuracy, the best variation of the CNN model is currently the U-Net model described in Akeret et al. (2017b). Here we present an overview of CNN, along with a description of the U-Net model.


Computer vision adopts CNN as its main network scheme. The basic operation of CNN is convolution, in which the convolutional kernel is multiplied element-wise with all sample values within the area it covers, and the products are summed. Shallow convolutional operations extract textural information of images, while deep ones can integrate features obtained with shallow networks to get image semantics (Ronneberger et al. 2015). Thus, each layer of CNN performs convolution on one or more planes, extracting information from them, and applies pooling to reduce the volume of information. In this way, with multiple convolutions and pooling, specific information about certain areas of the image can be obtained. CNNs are suitable for making identifications in structural data, that is, data associated with spatially adjacent counterparts (e.g., images).
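The two operations described above, convolution and pooling, can be sketched with NumPy on a toy time-frequency plane. This is only an illustration: the helper names `conv2d_valid` and `max_pool2x2`, the 6 × 6 input, and the hand-written kernel are assumptions made for the example, whereas a real CNN learns its kernels during training.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide the kernel over the image; at each position multiply
    element-wise and sum (no kernel flip, as in CNN frameworks)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def max_pool2x2(feature):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h = feature.shape[0] // 2 * 2
    w = feature.shape[1] // 2 * 2
    blocks = feature[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


# Toy plane: one bright row (index 2) on a zero background.
plane = np.zeros((6, 6))
plane[2, :] = 10.0

# Hypothetical kernel that responds to a bright horizontal line.
kernel = np.array([[-1.0, -1.0, -1.0],
                   [2.0, 2.0, 2.0],
                   [-1.0, -1.0, -1.0]])

features = conv2d_valid(plane, kernel)  # shape (4, 4)
pooled = max_pool2x2(features)          # shape (2, 2)
print(features[:, 0])  # response peaks where the kernel is centred on the bright row
print(pooled)
```

The feature map is strongest where the kernel lines up with the bright row, and pooling halves each dimension while keeping that peak response.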


The U-Net model was originally proposed for a challenge held as part of a workshop prior to the IEEE International Symposium on Biomedical Imaging (ISBI) 2012, where it achieved notable results (Ronneberger et al. 2015). It can be considered as a CNN with an extended architecture, and has been adjusted for RFI detection (Akeret et al. 2017b). It utilises down- and up-sampling operations to extract required information (such as RFI) from original data, enabling the network to actually 'learn' about the extracted characteristics. In this model, shallow-layered convolutions identify fine features in data, e.g., RFI intersections, while deeper networks are used to splice such features into more abstract forms (Akeret et al. 2017b). Meanwhile, features extracted by pooling operations during down-sampling are passed as copies to the right side after several steps of up-sampling, thus completing a U-shaped structure. With intensive tests, a structure with 3 layers and 64 feature maps has achieved a good balance between flagging accuracy and computational cost. A structure like this can be seen in Figure 1 of Akeret et al. (2017b).
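The U-shaped data flow described above (down-sample, up-sample, then join with a copy kept from the left side) can be sketched at the level of array shapes. This is a deliberately reduced illustration and an assumption throughout: the helpers `down` and `up` are hypothetical, the learned convolutions of the real U-Net are omitted, and nearest-neighbour repetition stands in for whatever up-sampling the actual network uses.

```python
import numpy as np


def down(x):
    """2x2 max pooling on a (channels, height, width) array."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def up(x):
    """Nearest-neighbour up-sampling by a factor of 2 in height and width."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))   # 4 feature maps of 8 x 8 samples

skip = x                          # copy kept from the left (down-sampling) side
bottom = down(x)                  # (4, 4, 4): coarser, more abstract level
restored = up(bottom)             # (4, 8, 8): back to the input resolution
merged = np.concatenate([skip, restored], axis=0)  # (8, 8, 8)

print(bottom.shape, restored.shape, merged.shape)
```

The concatenation is what lets the right side combine fine detail from the copied features with the coarser information recovered by up-sampling.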


Akeret et al. (2017b) have also tested the U-Net model with data acquired by the Bleien Observatory (ETHZurich 2016). The visual inspections have been compared with results from U-Net, demonstrating the advantages of the latter technique with graphs as well as index scores. It has been proven that the U-Net shows an advantage over traditional RFI flagging algorithms.


2.3  Analysis of previous methods

RFI with repetitive temporal or spectral behaviours, such as radar or radio beacon emissions, can be best identified using linear detection-based algorithms, although such methods are not suitable for detecting stochastic RFI, including pulse-like signals, and RFI with frequency drifts (Akeret et al. 2017b). In contrast, thresholding algorithms are more effective if the observed background is relatively stable and the RFI is distributed discretely. Thresholding, with advantages of relatively fast execution speed, easy implementation, as well as high efficiency and robustness (with properly adjusted parameters), is best suited to flagging strong RFI, and algorithms incorporating the SumThreshold method are especially popular in radio astronomy (Akeret et al. 2017b). However, in the presence of weak interference or baseline fluctuations with fluxes comparable to celestial sources, or of broad-band signals or an extremely large amount of RFI (which means that most channels are RFI-contaminated), thresholding methods become less effective, if not useless.


On the other hand, methods invoking machine learning/deep learning techniques could greatly reduce manual involvements, thus enhancing the automation level of RFI detections. However, in terms of practical applications, these algorithms have not achieved satisfactory detection accuracy. As shown in Fig. 4, the U-Net model proposed by Akeret et al. (2017b) has identified too many time domain structures as RFI; however, interference in the frequency domain, as well as point-like RFI, have been largely ignored. Furthermore, a lot of noise (also described as false-positive) exists in the flagging results, especially at the edges of regions identified as RFI. Apparently, such performance needs to be further refined, which may decrease the efficiency of the algorithm as a result. Moreover, training neural networks like this often requires large amounts of pre-labelled data. While RFI flagging with raw data is a time-consuming and tedious task, considering the requirements for subsequent training, it would not be easy to make adjustments with existing approaches to improve their suitability for RFI detection in different observing configurations (Aguirre et al. 2018).


3  RFI-Net for RFI Detection at FAST

To overcome the shortcomings of algorithms discussed so far, we propose a new model combining U-Net with a residual network. In this model, the corresponding layers are connected with short cuts, which are shown in Fig. 5 as horizontal lines connecting the left and right sides. Two types of residual units have been designed and constructed. In addition, two other hyperparameters have been introduced to further enhance the performance.


3.1 Architecture of RFI-Net

We have constructed our model with two steps, the first one to add more layers to U-Net, as shown in Fig. 5(a). That is because, for deep learning algorithms, the depth of a certain neural network can be of great importance. Regardless of filter sizes or chosen widths, a deeper network usually can achieve more optimal results, compared with shallower ones with roughly the same temporal complexity (He & Sun 2015), since an increase in depth can lead to more extracted information. Furthermore, compared with a shallower network, a deeper one can significantly enhance the capability to perform tasks that are more computationally demanding. For the second step of our model construction, the residual units have been built into the network, as shown in Fig. 5(b). Identity mappings serve as short cuts to connect three convolutional layers. Two types of residual units have been designed in this work. The unit marked with orange dotted lines in Fig. 5(b) is used for down-sampling, thereby doubling the number of channels; whereas the unit indicated by blue dotted lines halves the number of channels for up-sampling. To support this structure, the network adopts batch normalisation to normalise the input data, and is equipped with a fast and stable optimiser. Detailed descriptions of the network structure are provided in Table 1.


3.2 Residuals Unit

The residual units have been added to prevent network degeneration. When a deep network no longer diverges despite implementations of various optimisations, it will begin to degrade: with an increasing network depth, the detection accuracy would finally meet a ’bottleneck’, with rapidly growing training errors afterwards. It should be noted that such errors introduced by network degradation are due to more biased calculations caused by a larger number of layers, rather than overfitting (He et al. 2016b).


Therefore, inspired by the residual network (He et al. 2016b), we introduce short cuts to our model, which are represented by the dotted boxes in Fig. 5. Denoting the convolutional operations of a unit by F, a unit without a short cut computes

z = F(x) (1)

where x and z denote the input and output, respectively. Identity mapping in most residual networks can be represented as follows

z = F(x) + x (2)

Equation 2 expresses that the input should be directly added to the result of the convolutional operations. However, since identity mapping does not have the flexibility to resize, especially with respect to channel numbers, RFI-Net chooses to perform convolutional operations with a kernel size of 1 × 1. With the assistance of batch normalisation, the short cut can be expressed as

z = F(x) + H(x) (3)

where H(x) denotes the combination of a 1 × 1 convolution and batch normalisation. Our network design adopts two units, corresponding to the processes of down- and up-sampling, to make the adjustment of channel numbers possible. Details of the unit hyperparameters are presented in Tables 2 and 3.
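Equation 3 can be sketched with NumPy to show why the 1 × 1 convolution is needed when a unit changes the number of channels. This is a simplified illustration under stated assumptions, not the RFI-Net implementation: the helper names are hypothetical, the placeholder `f` is a single 1 × 1 convolution with ReLU rather than the three convolutional layers of the real unit, the weights are random, and the batch normalisation here works on one sample without learned scale or shift.

```python
import numpy as np


def conv1x1(x, weight):
    """1x1 convolution: mix channels independently at every pixel.
    x has shape (C_in, H, W); weight has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


def batch_norm(x, eps=1e-5):
    """Normalise each channel to zero mean and unit variance
    (single-sample simplification, no learned parameters)."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def residual_unit(x, f, shortcut_weight):
    """z = F(x) + H(x), where H is a 1x1 convolution followed by batch norm."""
    return f(x) + batch_norm(conv1x1(x, shortcut_weight))


rng = np.random.default_rng(0)
x = rng.normal(size=(16, 32, 32))   # 16 input channels
w_f = rng.normal(size=(32, 16))     # placeholder weights for F
w_h = rng.normal(size=(32, 16))     # short-cut weights for H


def f(t):
    # Hypothetical stand-in for F: doubles the channels, as in down-sampling.
    return np.maximum(conv1x1(t, w_f), 0.0)


z = residual_unit(x, f, w_h)
print(z.shape)  # 32 output channels at the same spatial size

# The identity short cut of Equation 2 cannot be used here:
# f(x) has 32 channels while x has 16, so f(x) + x raises a ValueError.
```

The short cut H reshapes x to the channel count of F(x), which is what makes the addition possible in units that double or halve the number of channels.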


By connecting three layers as a unit, the short cuts can stabilise the update process, and slow down the gradient disappearance resulting from the inability of the model’s middle layers to update the parameters effectively. Moreover, it is expected that models with short cuts should not only see improvements of their performance, but also achieve a faster convergence speed (Drozdzal et al. 2016). In addition, the long connecting structures used in the U-Net model could give rise to a slower learning rate with unstable parameter updates (Drozdzal et al. 2016). In contrast, a network equipped with short cuts makes a larger initial learning rate possible, thus accomplishing a faster convergence.


4  Basic Framework of Data and Experiments

4.1 Data used in the experiments

We first adopted an astronomical simulator to simulate data captured by a radio telescope. The FAST telescope can be used to conduct observations of neutral hydrogen (HI) through the 21-cm line from the hyperfine transition of neutral hydrogen, emitted at a rest frequency of ∼1420.4 MHz. Many software packages are available for HI simulation and data processing. The simulator we used for this study is HIDE (the HI Data Emulator), which was also used in the studies of Akeret et al. (2017b) to simulate the training dataset.


The simulated data are comprised of astronomical data and RFI. Since HIDE can produce both ’pure’ RFI and simulated astronomical data (that already contain RFI), as shown in the top left panel of Fig. 6, it is possible to label all the interference precisely as fundamental references (that is, the ground truth) for our experiments. By comparing RFI detected by various algorithms with the ground truth, the accuracy of each method can be evaluated. The simulated RFI is displayed in the top centre panel of Fig. 6, with a corresponding RFI mask shown in the top right.


Experiments have also been conducted with manually labelled observed datasets captured by FAST in September 2018, as well as open access data (ETHZurich 2016) acquired by the Bleien Observatory on March 21st, 2016 (Akeret et al. 2017a).


4.2 Experimental framework

For the purpose of comparison, our dataset has been processed by several existing RFI flagging methods, including U-Net, KNN (K-Nearest Neighbour (Guo et al. 2003), one of the classification algorithms used in machine learning), as well as SumThreshold. Similar to Akeret et al. (2017b), we have found after several cycles of tests that U-Net with three layers and 64 feature maps can achieve a good balance between accuracy and speed. Thus, in this study, we have adopted the same structure and hyperparameters for RFI-Net as Akeret et al. (2017b). In addition, this study applied Scikit-learn (Pedregosa et al. 2011), the machine-learning library for Python, to assist with the implementation of the KNN algorithm.


Specifically, the following experiments have been conducted:

1) the results obtained from RFI-Net for simulated data were compared with those from previous methods;

2) a detailed analysis of algorithm accuracies was carried out;

3) the results obtained from RFI-Net for observed data were studied to determine whether additional operations were required;

4) RFI-Net was trained with less training data (i.e. 25%, 50%, and 75% of the complete data) to validate its ability to achieve comparable performance on smaller datasets;

5) the ability of RFI-Net to overcome the overfitting problem was demonstrated;

6) the training speed was investigated.


All these methods have been tested under Ubuntu 16.04 on a Dell PowerEdge T630 with 32 GB RAM. The deep learning models have been executed on an NVIDIA Tesla K40c with 12 GB RAM.


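The indicators we adopted for experimental evaluations are precision, recall, and the F1 score (see Akeret et al. (2017b) and Davis & Goadrich (2006)). In the equations below, TP, FP, and FN denote the numbers of true-positive, false-positive, and false-negative samples, respectively.

Precision, which is the fraction of genuine RFI among all flagged instances, can be considered as the accuracy metric of the algorithm:

$\mathrm{Precision} = \frac{TP}{TP + FP}$ (8)

Recall indicates the fraction of RFI that has been identified among all RFI, showing the comprehensiveness of the detection:

$\mathrm{Recall} = \frac{TP}{TP + FN}$ (9)

F1, the harmonic mean of precision and recall, can be considered as the overall model performance:

$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$ (10)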

It should be noted that, since our observed data were labelled manually, with no guarantee of completeness of RFI flags, the related performances cannot be evaluated exactly with indicators listed above. They rely on visual inspections, and can only provide a general approximation of algorithm characteristics. 
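The three indicators can be computed directly from a ground-truth mask and a predicted mask. The sketch below is illustrative only: the function name `rfi_scores` and the toy masks are assumptions made for the example, and a two-dimensional time-frequency mask would first need to be flattened into a sequence.

```python
def rfi_scores(truth, predicted):
    """Return (precision, recall, F1) for two flattened binary RFI masks."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy masks: 1 marks a sample labelled (or flagged) as RFI.
truth = [0, 0, 1, 1, 1, 0, 0, 1]
predicted = [0, 1, 1, 1, 0, 0, 0, 0]

precision, recall, f1 = rfi_scores(truth, predicted)
print(precision, recall, f1)
```

Here two of the three flagged samples are genuine RFI and two of the four RFI samples are found, so precision is 2/3, recall is 1/2, and F1 is 4/7.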



Reposted from blog.csdn.net/u014546828/article/details/120660361