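I have recently been studying papers on adversarial learning for text classification. Adversarial training has proven effective and efficient at improving the robustness of deep neural networks for image classification. For text classification, however, the discrete nature of the text input space makes it hard to adapt gradient-based adversarial methods from the image domain. Moreover, existing text attack methods are effective but not efficient enough to be used in practical text adversarial training. The paper proposes the Fast Gradient Projection Method (FGPM), which generates text adversarial examples by synonym substitution; each substitution is scored by the product of the gradient magnitude and the projected distance between the original word and the candidate word in the gradient direction. Empirical evaluation shows that FGPM achieves attack performance and transferability similar to competitive attack baselines while being about 20 times faster than the current fastest text attack method. This speed makes it possible to combine FGPM with adversarial training as an effective defense that scales to large neural networks and datasets. Experiments show that adversarial training with FGPM (ATF) significantly improves model robustness and blocks the transferability of adversarial examples, without any decay in model generalization.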
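
To make the scoring rule concrete, here is a minimal sketch in plain Python. It assumes the gradient is that of the loss with respect to the original word's embedding, and that every synonym candidate has an embedding of the same dimension. The function name, variable names and toy vectors are hypothetical; this is not the code in the repository's FGPM.py. Note that the projected distance times the gradient magnitude reduces to a plain dot product between the embedding difference and the gradient.

```python
import math


def fgpm_scores(orig_vec, candidate_vecs, grad):
    """Score each candidate as (projected distance along grad) * ||grad||.

    orig_vec:       embedding of the original word (list of floats)
    candidate_vecs: embeddings of the synonym candidates
    grad:           assumed gradient of the loss w.r.t. orig_vec
    """
    grad_norm = math.sqrt(sum(g * g for g in grad))
    if grad_norm == 0.0:
        # No gradient signal: no candidate is preferred.
        return [0.0] * len(candidate_vecs)
    scores = []
    for cand in candidate_vecs:
        diff = [c - o for c, o in zip(cand, orig_vec)]
        # Signed length of diff projected onto the gradient direction.
        projected = sum(d * g for d, g in zip(diff, grad)) / grad_norm
        scores.append(projected * grad_norm)
    return scores


# Toy 2-d example with made-up vectors.
orig = [1.0, 0.0]
candidates = [[2.0, 0.0], [1.0, 3.0], [0.0, 0.0]]
grad = [3.0, 4.0]

scores = fgpm_scores(orig, candidates, grad)
best = max(range(len(scores)), key=scores.__getitem__)  # index 1 here
print(scores, best)
```

The candidate with the largest score is the one expected to increase the loss the most under a first-order approximation, which is why no extra forward pass per candidate is needed in this sketch.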

1. Paper link

Paper: "Fast Gradient Projection Method for Text Adversary Generation and Adversarial Training" (AAAI 2021)

https://arxiv.org/abs/2008.03709

2. Code download

GitHub - JHL-HUST/FGPM: Adversarial Training with Fast Gradient Projection Method against Synonym Substitution based Text Attacks

3. Code implementation

Requirements

python 3.6.5
numpy 1.15.2
tensorflow-gpu 1.12.0
keras 2.2.0

3.1 Environment setup

(1) Create a virtual environment

conda create -n FGPM python==3.6.5

(2) The numpy version installed by default is not the required one, so uninstall it first and then install the required version:

pip uninstall numpy

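pip install numpy==1.15.2 -i https://pypi.tuna.tsinghua.edu.cn/simple  # the Tsinghua mirror downloads faster

(3) Install tensorflow-gpu

pip install tensorflow-gpu==1.12.0 -i https://pypi.tuna.tsinghua.edu.cn/simple  # the Tsinghua mirror downloads faster

(4) Install tensorflow

pip install tensorflow==1.12.0 -i https://pypi.tuna.tsinghua.edu.cn/simple  # the Tsinghua mirror downloads faster

(5) The keras version installed by default is not the required one, so uninstall it first and then install the required version: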

pip uninstall keras

pip install keras==2.2.0 -i https://pypi.tuna.tsinghua.edu.cn/simple  # the Tsinghua mirror downloads faster
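
After installing, it can help to confirm that the environment really matches the Requirements list. The following is a small sketch, not part of the repository: the helper names are hypothetical, and it assumes each package exposes a `__version__` attribute (numpy, tensorflow and keras do; tensorflow-gpu is imported under the module name `tensorflow`).

```python
import importlib

# Versions taken from the Requirements list above.
REQUIRED = {"numpy": "1.15.2", "tensorflow": "1.12.0", "keras": "2.2.0"}


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_mismatches(installed, required=REQUIRED):
    """Map each mismatching package to a (wanted, found) pair."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in required.items()
        if installed.get(name) != wanted
    }
```

Inside the FGPM environment, calling `version_mismatches({name: installed_version(name) for name in REQUIRED})` should return an empty dict when every package is at the required version.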

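3.2 Datasets

Three datasets are used in this experiment. Download them and place the files of each dataset in its own directory: /data/ag_news, /data/dbpedia and /data/yahoo_answers.

The project also has two dependencies. Download glove.840B.300d.txt and counter-fitted-vectors.txt and place them in the /data/ directory.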
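
Before running anything, a quick check that the data directory is laid out as described can save a failed run. This is a sketch under the assumption that the three dataset directories and the two embedding files sit directly under the directory later passed as --data_dir; the function name is hypothetical and not part of the repository.

```python
from pathlib import Path

# Names taken from the dataset and dependency description above.
REQUIRED_DIRS = ["ag_news", "dbpedia", "yahoo_answers"]
REQUIRED_FILES = ["glove.840B.300d.txt", "counter-fitted-vectors.txt"]


def missing_data_items(data_dir="./data"):
    """Return the names of required directories and files not found in data_dir."""
    root = Path(data_dir)
    missing = [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    missing += [name for name in REQUIRED_FILES if not (root / name).is_file()]
    return missing
```

An empty list from `missing_data_items("./data")` means every expected directory and file is present. If only ag_news is used, the other two dataset directories can be ignored.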

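3.3 File descriptions

textcnn.py, textrnn.py, textbirnn.py: the CNN, LSTM and Bi-LSTM models.

train.py: trains a model normally or adversarially.

utils.py: helper functions for building the dictionary, loading data, processing the embedding matrix, and so on.

build_embeddings.py: generates the dictionary, the embedding matrix and the distance matrix.

FGPM.py: the Fast Gradient Projection Method.

attack.py: attacks a model with FGPM.

Config.py: settings for the datasets, models and attacks.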

4. Experimental procedure

(1) Generate the dictionary, embedding matrix and distance matrix:

python build_embeddings.py --data ag_news --data_dir ./data/

Alternatively, you can download aux_files to use the authors' pre-generated data and place it in the /data/ directory.

(2) Train the model normally:

python train.py  --data ag_news -nn_type textcnn --train_type org --num_epochs=2 --num_checkpoints=2 --data_dir ./data/ --model_dir ./model/

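(You will get a directory with a name like 1583313019_ag_news_org under /model/runs_textcnn.)

You can also download runs_textcnn, runs_textrnn and runs_textbirnn to use the authors' trained models, and place them in the /model/ directory.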
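
The timestamp in the run directory name is needed later, because attack.py selects the model through --time. The sketch below looks up the newest run for a given dataset and training type. It assumes directory names follow the `<timestamp>_<data>_<train_type>` pattern seen above, with train_type being org or adv; the function name is hypothetical and not part of the repository.

```python
import re
from pathlib import Path

# Matches names such as 1583313019_ag_news_org or 1583313121_ag_news_adv.
RUN_DIR_PATTERN = re.compile(r"^(\d+)_(.+)_(org|adv)$")


def latest_run_timestamp(runs_dir, data, train_type):
    """Return the largest timestamp among matching run directories, or None."""
    root = Path(runs_dir)
    if not root.is_dir():
        return None
    best = None
    for entry in root.iterdir():
        match = RUN_DIR_PATTERN.match(entry.name)
        if not entry.is_dir() or match is None:
            continue
        timestamp, run_data, run_type = match.groups()
        if run_data == data and run_type == train_type:
            value = int(timestamp)
            if best is None or value > best:
                best = value
    return best
```

For example, `latest_run_timestamp("./model/runs_textcnn", "ag_news", "org")` gives the value to pass to --time when attacking the most recent normally trained TextCNN.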

Output:

Using TensorFlow backend.
reading path: ./data/ag_news/train
reading path: ./data/ag_news/test
Dataset  ag_news  loaded!
Saving model to /media/hao/D49CF3CC9CF3A760/Demo/FGPM-main/model/runs_textcnn/1684111781_ag_news_org

Epoch 1
---------------------------
Training normal accuracy is 0.9026083333333333
Training adversarial accuracy is 0.9026083333333333
Validation accuracy is 0.9121052622795105
---------------------------
Epoch 2
---------------------------
Training normal accuracy is 0.9600833333333333
Training adversarial accuracy is 0.9600833333333333
Validation accuracy is 0.9193421100315294
---------------------------

(3) Attack the normally trained model with FGPM:

python attack.py --nn_type textcnn --data ag_news --train_type org --time 1583313019 --step 2 --grad_upd_interval=1 --max_iter=30  --data_dir ./data/ --model_dir ./model/

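(Note that you may get a different timestamp. Check the directory name under /model/runs_textcnn and either pass that timestamp to --time or, as done here, rename the directory to 1583313019_ag_news_org.)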

Using TensorFlow backend.
./model/runs_textcnn/1583313019_ag_news_org/checkpoints/model_2
The dictionary has 50001 words.
Enable stop words.
reading path: ./data/ag_news/test
Model accuracy on test set:
6987 / 7600 = 0.9193421052631578
Sample  200 samples to attack...
FGPM Attack: Computation graph created!
ITEM                            VALUE
Total Time For Attack:          10.62996506690979
Model Accuracy of Test Set:     0.9193421052631578
Model Accuracy Before Attack:   0.895
Attack Success Rate:            0.6089385474860335
Model Accuracy After Attack:    0.35
Average Substitution Ratio:     0.12125525319247654

(4) Train the model with ATFL to improve robustness:

python train.py  --data ag_news -nn_type textcnn --train_type adv --num_epochs=10 --num_checkpoints=10 --grad_upd_interval=1 --max_iter=30 --data_dir ./data/ --model_dir ./model/

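(You will get a directory with a name like 1583313121_ag_news_adv under /model/runs_textcnn.)

You can also download runs_textcnn, runs_textrnn and runs_textbirnn to use the authors' trained models, and place them in the /model/ directory.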

Epoch 1
---------------------------
Training normal accuracy is 0.8702416666666667
Training adversarial accuracy is 0.7223416666666667
Validation accuracy is 0.9042105266922399
---------------------------
Epoch 2
---------------------------
Training normal accuracy is 0.9140166666666667
Training adversarial accuracy is 0.870325
Validation accuracy is 0.9103947391635493
---------------------------
Epoch 3
---------------------------
Training normal accuracy is 0.9300416666666667
Training adversarial accuracy is 0.8945416666666667
Validation accuracy is 0.9147368449913827
---------------------------
Epoch 4
---------------------------
Training normal accuracy is 0.9410916666666667
Training adversarial accuracy is 0.909925
Validation accuracy is 0.9176315759357653
---------------------------
Epoch 5
---------------------------
Training normal accuracy is 0.9507416666666667
Training adversarial accuracy is 0.9202166666666667
Validation accuracy is 0.9205263162914076
---------------------------
Epoch 6
---------------------------
Training normal accuracy is 0.9579583333333334
Training adversarial accuracy is 0.9294916666666667
Validation accuracy is 0.9200000041409543
---------------------------
Epoch 7
---------------------------
Training normal accuracy is 0.9653166666666667
Training adversarial accuracy is 0.9373583333333333
Validation accuracy is 0.9184210535727049
---------------------------
Epoch 8
---------------------------
Training normal accuracy is 0.9693916666666667
Training adversarial accuracy is 0.9442
Validation accuracy is 0.9167105257511139
---------------------------
Epoch 9
---------------------------
Training normal accuracy is 0.9738666666666667
Training adversarial accuracy is 0.9494333333333334
Validation accuracy is 0.9147368387172097
---------------------------
Epoch 10
---------------------------
Training normal accuracy is 0.977325
Training adversarial accuracy is 0.9547416666666667
Validation accuracy is 0.9163157908540023
---------------------------

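(5) Attack the adversarially trained model with FGPM:

python attack.py --nn_type textcnn --data ag_news --train_type adv --time 1583313121 --step 3 --grad_upd_interval=1 --save_to_file=True --max_iter=30 --data_dir ./data/ --model_dir ./model/

(Note that you may get a different timestamp. Check the directory name under /model/runs_textcnn and either pass that timestamp to --time or, as done here, rename the directory to 1583313121_ag_news_adv. Also, the original command passed --save=True; that flag had to be changed to --save_to_file, as in the command above.)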

Using TensorFlow backend.
./model/runs_textcnn/1583313121_ag_news_adv/checkpoints/model_3
The dictionary has 50001 words.
Enable stop words.
reading path: ./data/ag_news/test
Model accuracy on test set:
6952 / 7600 = 0.9147368421052632
Sample  200 samples to attack...
FGPM Attack: Computation graph created!
ITEM                            VALUE
Total Time For Attack:          10.898909568786621
Model Accuracy of Test Set:     0.9147368421052632
Model Accuracy Before Attack:   0.87
Attack Success Rate:            0.022988505747126436
Model Accuracy After Attack:    0.85
Average Substitution Ratio:     0.05550612254621688
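
The two attack logs above are consistent with defining the attack success rate as the fraction of initially correct samples that the attack flips. The sketch below reproduces both reported values from the 200 sampled examples. The formula is inferred from the logged numbers rather than read from attack.py, and the function name is hypothetical.

```python
def attack_success_rate(n_samples, acc_before, acc_after):
    """Fraction of initially correct samples that are misclassified after the attack."""
    correct_before = round(n_samples * acc_before)
    correct_after = round(n_samples * acc_after)
    if correct_before == 0:
        raise ValueError("no correctly classified samples to attack")
    return (correct_before - correct_after) / correct_before


# Normally trained model: 179 correct before the attack, 70 after.
rate_org = attack_success_rate(200, 0.895, 0.35)
# Adversarially trained model: 174 correct before the attack, 170 after.
rate_adv = attack_success_rate(200, 0.87, 0.85)
print(rate_org, rate_adv)
```

On these samples the success rate drops from about 61% against the normally trained model to about 2% against the adversarially trained one, while test accuracy changes only from 0.9193 to 0.9147.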
