How to add random noise to the original feature matrix and use a deep learning model's results to compare the performance of two noise schemes

Author: Zen and the Art of Computer Programming

1. Introduction

When we train a machine learning model, we usually feed it a large number of features, which generally help it perform the classification task. However, if some feature values are anomalous (missing, incorrect, or duplicated values, for example), the accuracy and stability of the model may suffer. A common way to address this problem is to introduce noise, that is, to perturb or replace the original features.

Generally speaking, the methods of introducing noise can be divided into two categories:

  • Adding random noise to the original features: In the original feature matrix, one feature value of each sample is replaced with a random number drawn, for example, from a normal or a uniform distribution. The aim is to simulate samples that rarely occur in real data, so that the model adapts better to new data sets. However, this method is prone to causing the model to overfit.
  • Using noisy labels: Another approach is to assign specific labels to some samples in place of their real labels. For example, suppose we have 10 samples, 9 of which are labeled normal and only the 10th is labeled malicious. In a first step we set the labels of all samples to normal, and in a second step we set the label of the 10th sample back to malicious. The aim is to have the model learn the characteristics of normal samples while still being able to distinguish them from malicious ones. Because the proportion of noisy labels is extremely low, this method is unlikely to cause overfitting.
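
The two categories above can be sketched in NumPy as follows. This is a minimal illustration under assumptions, not the article's own implementation: the helper names `replace_one_feature_per_sample` and `flip_binary_labels` are hypothetical, the replacement values are drawn from each column's own mean/standard deviation or min/max range, and label noise is modeled as flipping binary 0/1 labels at a given rate.

```python
import numpy as np

def replace_one_feature_per_sample(X, scheme="normal", rng=None):
    """Return a copy of X in which one randomly chosen feature of every
    sample is replaced by a random draw (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    X_noisy = np.array(X, dtype=float, copy=True)
    n_samples, n_features = X_noisy.shape
    # Pick one column index per row.
    cols = rng.integers(0, n_features, size=n_samples)
    if scheme == "normal":
        # Assumption: draw from N(mean, std) of the column being replaced.
        draws = rng.normal(X_noisy.mean(axis=0)[cols], X_noisy.std(axis=0)[cols])
    elif scheme == "uniform":
        # Assumption: draw uniformly between the column's observed min and max.
        draws = rng.uniform(X_noisy.min(axis=0)[cols], X_noisy.max(axis=0)[cols])
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    X_noisy[np.arange(n_samples), cols] = draws
    return X_noisy

def flip_binary_labels(y, rate, rng=None):
    """Return a copy of the 0/1 label vector y with roughly a fraction
    `rate` of the labels flipped (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    y_noisy = np.array(y, copy=True)
    flip = rng.random(y_noisy.shape[0]) < rate  # True where a label is flipped
    y_noisy[flip] = 1 - y_noisy[flip]
    return y_noisy
```

Both helpers return copies, so the original feature matrix and labels stay available for a clean baseline.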

This article focuses on adding random noise to the original feature matrix and compares two noise schemes by their effect on the performance of a deep learning model.
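
As a rough sketch of what such a comparison could look like, the following trains the same small network on two noisy copies of a training set and scores each on clean test data. Everything here is an assumption made for illustration: the synthetic data, the additive Gaussian and uniform schemes, the noise scale, and the tiny one-hidden-layer network `train_mlp`, which stands in for a real deep learning model.

```python
import numpy as np

def train_mlp(X, y, hidden=16, epochs=300, lr=0.5, seed=0):
    """Train a one-hidden-layer MLP for binary classification with
    full-batch gradient descent and return a predict function."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                    # hidden activations
        p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))    # predicted P(y = 1)
        g = (p - y) / len(y)                        # cross-entropy gradient w.r.t. logits
        gH = np.outer(g, W2) * (1.0 - H ** 2)       # backpropagate through tanh
        W2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gH)
        b1 -= lr * gH.sum(axis=0)

    def predict(X_new):
        H_new = np.tanh(X_new @ W1 + b1)
        return (H_new @ W2 + b2 > 0).astype(int)

    return predict

# Synthetic data (assumption): the label depends on the first two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

std = 0.3                      # common noise standard deviation (assumption)
half_width = std * np.sqrt(3)  # uniform(-a, a) has standard deviation a / sqrt(3)
schemes = {
    "gaussian": lambda A: A + rng.normal(0.0, std, size=A.shape),
    "uniform": lambda A: A + rng.uniform(-half_width, half_width, size=A.shape),
}

results = {}
for name, add_noise in schemes.items():
    predict = train_mlp(add_noise(X_train), y_train)
    results[name] = float((predict(X_test) == y_test).mean())  # accuracy on clean test data
    print(f"{name}: test accuracy = {results[name]:.3f}")
```

Matching the standard deviation of the two schemes keeps the comparison about the shape of the noise distribution rather than its magnitude. A single run like this is only indicative; repeating it over several seeds would give a fairer picture.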

2. Background knowledge

2.1. Overview of deep learning models

First, you need to know what a deep learning model is. Deep learning models usually include feature extractors (feature extractor
