Weekly report, November 25 to December 1, 2023 (continuing to debug the OpenFWI code)

Table of contents

1. Introduction

2. Learning situation

2.1 Understanding train.py

2.11 Define data set

2.12 Define loss function and optimizer

2.13 Load model

2.14 Start training

2.2 Understanding test.py

2.21 Define test set

2.22 Load model

2.23 Start test

3. Some problems encountered and their solutions

3.1 module 'torchvision' has no attribute '__version__'

3.2 Understanding __call__ in Python

4. Related references

4.1 Usage of keyword global

4.2 Python creates folders with mkdir() and makedirs()

4.3 os.environ module environment variable explanation

4.4 Use of Tensorboard ---- SummaryWriter class

4.5 The role of torch.backends.cudnn.benchmark = True

4.6 MinMaxNormalize normalization algorithm

4.7 Introduction and usage of symbolic function np.sign()

4.8 Operation of ndarray multi-dimensional array in numpy library: np.abs()

4.9 Introduction to np.log1p for data smoothing 

4.10 Introduction to DistributedSampler()

4.11 Understanding RandomSampler()

4.12 Introduction to argparse module

4.13 Introduction to os.path.join() function

4.14 Compose() function

5. Summary

5.1 Existing doubts

5.2 Schedule for next week


1. Introduction

        Last week, I copied the InversionNet network part of the OpenFWI code. This week, I continued with the remaining part of that copying task and did a preliminary read of the OpenFWI paper.

2. Learning situation

2.1 Understanding train.py

2.11 Define data set

  • Create the output directory and initialize distributed mode:
utils.mkdir(args.output_path)
utils.init_distributed_mode(args)
  • Select the training device:
device = torch.device(args.device)
  • Determine how data and labels are normalized:
# Normalize data and labels to [-1, 1]
transform_data = Compose([
    # Smooth the data with a log transform
    T.LogTransform(k=args.k),
    # Normalization makes features of different dimensions numerically comparable, which can greatly improve accuracy
    T.MinMaxNormalize(T.log_transform(ctx['data_min'], k=args.k), T.log_transform(ctx['data_max'], k=args.k))
])
transform_label = Compose([
    T.MinMaxNormalize(ctx['label_min'], ctx['label_max'])
])
  •  Initialize the training set and validation set:
# Check whether the annotation file is a txt file
if args.train_anno[-3:] == 'txt':
    # Load the data
    dataset_train = FWIDataset(
        args.train_anno, # path to the annotation file
        preload=True, # whether to load the whole dataset into memory
        sample_ratio=args.sample_temporal, # temporal down-sampling rate of the seismic data
        file_size=ctx['file_size'], # number of samples in each npy file
        transform_data=transform_data, # data transform
        transform_label=transform_label # label transform
    )
else:
    dataset_train = torch.load(args.train_anno)

print('Loading validation data')
if args.val_anno[-3:] == 'txt':
    dataset_valid = FWIDataset(
        args.val_anno,
        preload=True,
        sample_ratio=args.sample_temporal,
        file_size=ctx['file_size'],
        transform_data=transform_data,
        transform_label=transform_label
    )
else:
    dataset_valid = torch.load(args.val_anno)
  •  Build the samplers and data loaders:
if args.distributed:
    train_sampler = DistributedSampler(dataset_train, shuffle=True) # distributed training
    valid_sampler = DistributedSampler(dataset_valid, shuffle=True)
else:
    train_sampler = RandomSampler(dataset_train) # RandomSampler draws samples randomly and returns indices into the dataset
    valid_sampler = RandomSampler(dataset_valid)

# Wrap the datasets and samplers into DataLoaders for training
dataloader_train = DataLoader(
    dataset_train, # dataset to load from
    batch_size=args.batch_size, # number of samples per batch
    sampler=train_sampler, # custom sampling strategy; when a sampler is given, shuffle must be False
    # num_workers=args.workers,
    num_workers=0, # number of worker processes
    pin_memory=True, # copy tensors into CUDA pinned memory before returning them
    drop_last=True, # drop the last incomplete batch
    collate_fn=default_collate # function that merges a list of samples into a mini-batch
)

dataloader_valid = DataLoader(
    dataset_valid, batch_size=args.batch_size,
    sampler=valid_sampler,
    # num_workers=args.workers,
    num_workers=0,
    pin_memory=True, collate_fn=default_collate)

2.12 Define loss function and optimizer

  • Loss functions:
l1loss = nn.L1Loss() # mean absolute error (MAE)
l2loss = nn.MSELoss() # mean squared error (MSE)
  • Learning rate and schedule:
lr = args.lr * args.world_size
warmup_iters = args.lr_warmup_epochs * len(dataloader_train)
lr_milestones = [len(dataloader_train) * m for m in args.lr_milestones]
# Learning-rate schedule with warm-up:
# training starts with a very small learning rate so the network gets used to the data,
# the rate then grows to the configured initial value, and after a number of iterations it is decayed again
# Learning-rate curve: rise - plateau - decay
lr_scheduler = WarmupMultiStepLR(
    optimizer, milestones=lr_milestones, gamma=args.lr_gamma,
    warmup_iters=warmup_iters, warmup_factor=1e-5)
  • Optimizer (created before the scheduler above, since WarmupMultiStepLR takes the optimizer as an argument):
optimizer = torch.optim.AdamW(model.parameters(), lr=lr, betas=(0.9, 0.999), weight_decay=args.weight_decay)
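
        The warm-up behaviour described in the comments can be sketched as a simple function of the iteration number. This is only a generic illustration of the linear-warm-up plus multi-step-decay idea, not the actual WarmupMultiStepLR implementation; the function name and the example numbers below are made up:

from bisect import bisect_right

def lr_at_iter(it, base_lr, warmup_iters, warmup_factor, milestones, gamma):
    # Linear warm-up: the scale grows from warmup_factor up to 1 over warmup_iters iterations
    if it < warmup_iters:
        alpha = it / warmup_iters
        scale = warmup_factor * (1 - alpha) + alpha
    else:
        scale = 1.0
    # Multi-step decay: multiply by gamma once for every milestone already passed
    return base_lr * scale * gamma ** bisect_right(milestones, it)

# The learning rate rises during warm-up, stays flat, then drops at each milestone
print(lr_at_iter(0,    1e-3, 100, 1e-5, [1000, 2000], 0.1))  # ~1e-8 (start of warm-up)
print(lr_at_iter(500,  1e-3, 100, 1e-5, [1000, 2000], 0.1))  # 1e-3  (plateau)
print(lr_at_iter(1500, 1e-3, 100, 1e-5, [1000, 2000], 0.1))  # 1e-4  (after the first milestone)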

2.13 Load model

model_without_ddp = model
# Use distributed training if requested
if args.distributed:
    # DistributedDataParallel (DDP) supports single-node multi-GPU and multi-node multi-GPU training
    model = DistributedDataParallel(model, device_ids=[args.local_rank])
    model_without_ddp = model.module

# Resume from a checkpoint if one is given
if args.resume:
    checkpoint = torch.load(args.resume, map_location='cpu')
    model_without_ddp.load_state_dict(network.replace_legacy(checkpoint['model']))
    optimizer.load_state_dict(checkpoint['optimizer'])
    lr_scheduler.load_state_dict(checkpoint['lr_scheduler'])
    args.start_epoch = checkpoint['epoch'] + 1
    step = checkpoint['step']
    lr_scheduler.milestones = lr_milestones

2.14 Start training

  • Train one epoch:
# This function computes the loss, back-propagates, updates the model,
# and logs the L1, L2, and combined losses to the Tensorboard writer
train_one_epoch(model, criterion, optimizer, lr_scheduler, dataloader_train, device, epoch, args.print_freq, train_writer)

# The core of one training step looks like this:
optimizer.zero_grad()
data, label = data.to(device), label.to(device)
output = model(data)
loss, loss_g1v, loss_g2v = criterion(output, label)
loss.backward()
optimizer.step()
  • Evaluate the model:
loss = evaluate(model, criterion, dataloader_valid, device, val_writer)
  • Save the trained model:
# Contents saved in the checkpoint file: build a dictionary
# Note: for a model that inherits from nn.Module, the 'model' entry should be given by model.state_dict()
checkpoint = {
    'model': model_without_ddp.state_dict(),
    'optimizer': optimizer.state_dict(),
    'lr_scheduler': lr_scheduler.state_dict(),
    'epoch': epoch,
    'step': step,
    'args': args}

# Save a checkpoint when the validation loss improves
# A pth file stores the model parameters as an ordered dictionary
# To resume training from (or test) a given stage, the saved parameters can be loaded back later
if loss < best_loss:
    utils.save_on_master(
        checkpoint,
        os.path.join(args.output_path, 'checkpoint.pth'))
    print('saving checkpoint at epoch: ', epoch)
    chp = epoch
    best_loss = loss
# Also save a checkpoint at the end of every epoch block
print('current best loss: ', best_loss)
print('current best epoch: ', chp)
if args.output_path and (epoch + 1) % args.epoch_block == 0:
    utils.save_on_master(
        checkpoint,
        os.path.join(args.output_path, 'model_{}.pth'.format(epoch + 1)))

Note: a plain Python dict historically made no ordering guarantee, while OrderedDict preserves insertion order and handles reordering operations better than dict, which is why the model parameters are stored as an OrderedDict.

2.2 Understanding test.py

2.21 Define test set

  • Create the output directory, select the device, and enable the cuDNN auto-tuner to improve runtime efficiency:
utils.mkdir(args.output_path) # create the output folder (relative path)
device = torch.device(args.device) # select the device
torch.backends.cudnn.benchmark = True # let cuDNN pick the fastest convolution algorithms
  • Determine how data and labels are preprocessed:
transform_valid_data = Compose([
    T.LogTransform(k=args.k),
    T.MinMaxNormalize(log_data_min, log_data_max),
])

transform_valid_label = Compose([
    T.MinMaxNormalize(ctx['label_min'], ctx['label_max'])
])
  • Initialize the test set:
if args.val_anno[-3:] == 'txt':
    dataset_valid = FWIDataset(
        args.val_anno,
        sample_ratio=args.sample_temporal,
        file_size=ctx['file_size'],
        transform_data=transform_valid_data,
        transform_label=transform_valid_label
    )
else:
    dataset_valid = torch.load(args.val_anno)
  • Build the data loader:
valid_sampler = SequentialSampler(dataset_valid)
dataloader_valid = torch.utils.data.DataLoader(
    dataset_valid, batch_size=args.batch_size,
    sampler=valid_sampler, num_workers=args.workers,
    pin_memory=True, collate_fn=default_collate
)

2.22 Load model

model = network.model_dict[args.model](upsample_mode=args.up_mode,
                                       sample_spatial=args.sample_spatial,
                                       sample_temporal=args.sample_temporal,
                                       norm=args.norm).to(device)
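
        Before evaluation, test.py also has to load the trained weights into this model. A minimal sketch following the same pattern as the resume code in train.py (the exact argument name for the checkpoint path is an assumption):

# Load the weights saved by train.py (the argument name args.resume is an assumption)
checkpoint = torch.load(args.resume, map_location='cpu')
model.load_state_dict(network.replace_legacy(checkpoint['model']))
model.eval()  # switch to evaluation mode before testing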

2.23 Start test

evaluate(model, criterions, dataloader_valid, device, args.k, ctx,
         vis_path, args.vis_batch, args.vis_sample, args.missing, args.std)

3. Some problems encountered and their solutions

3.1 module 'torchvision' has no attribute '__version__'

        Problem description: module 'torchvision' has no attribute '__version__'. This is caused by the installed torchvision version; older versions may not expose the '__version__' attribute.

        Reference: AttributeError: module 'torchvision' has no attribute 'version' - CSDN

        Solution: upgrade the torchvision library to the latest version with the following command:

pip install --upgrade torchvision -i https://pypi.tuna.tsinghua.edu.cn/simple

3.2 Understanding __call__ in Python

       While debugging the code, my understanding of how class instances are called turned out to be insufficient, which made the code harder to follow. The call chain below shows how the transforms defined with Compose end up being invoked during training.

transform_data = Compose([
    # Smooth the data with a log transform
    T.LogTransform(k=args.k),
    # Normalize the data
    # Create a MinMaxNormalize object
    T.MinMaxNormalize(T.log_transform(ctx['data_min'], k=args.k), T.log_transform(ctx['data_max'], k=args.k))
])

# Load the data
dataset_train = FWIDataset(
        args.train_anno,
        preload=True,
        sample_ratio=args.sample_temporal,
        file_size=ctx['file_size'],
        transform_data=transform_data,
        transform_label=transform_label
)

# Build the data loader
dataloader_train = DataLoader(
    dataset_train, # dataset to load from
    batch_size=args.batch_size, # number of samples per batch
    sampler=train_sampler, # custom sampling strategy; when a sampler is given, shuffle must be False
    num_workers=args.workers, # number of worker processes
    pin_memory=True, # copy tensors into CUDA pinned memory before returning them
    drop_last=True, # drop the last incomplete batch
    collate_fn=default_collate # function that merges a list of samples into a mini-batch
)

# Start training
train_one_epoch(model, criterion, optimizer, lr_scheduler, dataloader_train, device, epoch, args.print_freq, train_writer)

# Train the model for one epoch - this is where the data transforms are actually applied
def train_one_epoch(model, criterion, optimizer, lr_scheduler, dataloader, device, epoch, print_freq, writer):

    for data, label in metric_logger.log_every(dataloader, print_freq, header):

def log_every(self, iterable, print_freq, header=None):
    for obj in iterable:
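
        What makes this chain work is that the transform objects are callable: t(img) inside Compose invokes the object's __call__ method. A minimal sketch of a callable transform class (a toy illustration, not the OpenFWI implementation):

class MinMaxNormalize:
    # Calling an instance of this class invokes __call__
    def __init__(self, vmin, vmax):
        self.vmin = vmin
        self.vmax = vmax

    def __call__(self, x):
        # Scale x into [-1, 1], the same idea as the formula in section 4.6
        return ((x - self.vmin) / (self.vmax - self.vmin) - 0.5) * 2

norm = MinMaxNormalize(0.0, 10.0)
print(norm(7.5))  # 0.5 -> the instance is called like a function, which runs __call__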

        Reference:python special function __call__(self) - Zhihu (zhihu.com)

4. Related references

4.1 Usage of keyword global

        global is the keyword for declaring global variables in Python. If a function needs to modify a variable defined outside it, that variable must be declared global inside the function.

        In the add() function below, global is not added before a, so a = 3 only creates a local variable: the function cannot modify the outer a, and the value of a outside the function does not change. b, on the other hand, is declared global, so b = 4 changes the outer b.

Notice:

  1. Variables are divided into global variables (created by an object or a function, or anywhere else in the program; once created, they can be referenced by all objects and functions in the program) and local variables (internal variables created by an object or a function, which can only be referenced internally and not by other objects or functions);
  2. global must be declared inside a function, and a single global statement can declare several global variables;
  3. A local variable is only valid in its own local scope and cannot be used as a global variable;
  4. Reference: Usage of keyword global in python_python global_ Marks' blog - CSDN blog
a = 1
b = 2

def add():
    a = 3
    global b
    b = 4
    print("② a + b =", a, "+", b,  "=", a + b)

print("① a + b =", a, "+", b,  "=", a + b)
add()
print("③ a + b =", a, "+", b,  "=", a + b)

4.2 Python creates folders with mkdir() and makedirs()

        The mkdir() function creates a single directory; makedirs() creates multi-level (nested) directories.

        Reference:Python creates folders with mkdir() and makedirs()_mkdir python_Xunluo Ting Zi Xiaotingzi’s Blog-CSDN Blog

        The code for creating the directories is as follows:

import os

output_path1 = "output1/"  # path for saving the model
os.mkdir(output_path1)  # creates only a single folder
output_path2 = "output2/output2"  # path for saving the model
os.makedirs(output_path2)  # creates more than one (nested) folder
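
        One detail worth noting: both calls raise FileExistsError if the target folder already exists. makedirs() accepts exist_ok=True to silently skip creation in that case, which is convenient for output directories that may already exist:

os.makedirs("output2/output2", exist_ok=True)  # does nothing if the folders are already there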

4.3 os.environ module environment variable explanation

        os.environ is a dictionary of environment variables. Environment variables are one way for a program to communicate with the operating system, and in Python various system information can be obtained through os.environ.

        Reference:Detailed explanation of os.environ module environment variables-CSDN Blog

import os
print(os.environ.keys()) # print all keys of os.environ
print(os.environ.get("HOME")) # get an environment variable: returns its value if the key exists, otherwise None
print(os.environ.get("HOME", "default")) # if HOME does not exist, return "default"

# Set an environment variable
os.environ['VAR_NAME'] = 'value' # both key and value must be strings
os.putenv('VAR_NAME', 'value')
os.environ.setdefault('VAR_NAME', 'value')

# Modify an environment variable
os.environ['VAR_NAME'] = 'new value'

# Get an environment variable
os.environ['VAR_NAME']
os.getenv('VAR_NAME')
os.environ.get('VAR_NAME', 'default') # the default is optional; it is returned if the variable does not exist

# Delete an environment variable
del os.environ['VAR_NAME']
del(os.environ['VAR_NAME'])

# Check whether an environment variable exists
'VAR_NAME' in os.environ # returns True if the variable exists, otherwise False

4.4 Use of Tensorboard ---- SummaryWriter class

       SummaryWriter creates an event file in the given directory and adds summaries and events to it. The class updates the file contents asynchronously, so the trainer can call its methods to add data to the file directly from the training loop without slowing down training.

from torch.utils.tensorboard import SummaryWriter

        Reference:Usage of Tensorboard ---- SummaryWriter class (pytorch version)_chuanauc's blog-CSDN blog

        In train.py, two writer instances, train_writer and val_writer, are built with the SummaryWriter class. If both are enabled, two corresponding folders appear after the run; they contain event files that TensorBoard can read.

         Visual display: in the PyCharm terminal, run the command tensorboard --logdir=XXXX (where XXXX is the directory the event files were written to).
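
        A minimal sketch of how such writers are typically created and used (the folder names and the loss values below are illustrative, not the ones used in train.py):

from torch.utils.tensorboard import SummaryWriter

train_writer = SummaryWriter(log_dir='runs/train')  # creates the event-file folder
val_writer = SummaryWriter(log_dir='runs/val')

for step in range(100):
    loss = 1.0 / (step + 1)                          # dummy value standing in for the real loss
    train_writer.add_scalar('loss/l1', loss, step)   # arguments: tag, scalar value, global step

train_writer.close()
val_writer.close()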

4.5 The role of torch.backends.cudnn.benchmark = True

        In many training scripts there is a line like this:

torch.backends.cudnn.benchmark = True

        This setting can improve training speed: at the start, the program spends a little extra time letting cuDNN benchmark, for each convolutional layer of the network, which convolution implementation is fastest, and then uses those algorithms to accelerate the network. The search introduces some randomness, so the results may differ slightly between runs.

        Things to be aware of:

  1. If the input dimensions and types of the network do not change much, this setting improves running efficiency;
  2. If the network input changes at every iteration, cuDNN searches for the optimal configuration every time, which reduces efficiency;
  3. If cuDNN's non-deterministic algorithms are a problem, cuDNN can be disabled entirely with torch.backends.cudnn.enabled = False.

        Reference:The role of torch.backends.cudnn.benchmark = true-CSDN Blog

4.6 MinMaxNormalize normalization algorithm

        Data normalization means scaling (converting) the data by some ratio so that it falls into a small, fixed interval, typically (0, 1) or (-1, 1), turning a dimensional quantity into a dimensionless one. In a multi-indicator evaluation system, different indicators have different properties (units, orders of magnitude, etc.). If the raw data are analyzed directly, indicators with large values dominate the analysis and indicators with small values are relatively weakened. After normalization, all indicators are on the same order of magnitude, which makes a comprehensive comparative analysis possible.

         Reference:How to understand normalization? - Zhihu (zhihu.com)

         The formula used here is:

v' = \left( \frac{v - v_{\min}}{v_{\max} - v_{\min}} - 0.5 \right) \times 2

where v_{\min} is the minimum value of the seismic data / velocity model and v_{\max} is the maximum value.
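
        A small NumPy example of this formula (the velocity values are made up):

import numpy as np

v = np.array([1500., 2500., 3500., 4500.])  # e.g. velocities in m/s
vmin, vmax = v.min(), v.max()

v_norm = ((v - vmin) / (vmax - vmin) - 0.5) * 2
print(v_norm)  # [-1.         -0.33333333  0.33333333  1.        ]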

4.7 Introduction and usage of symbolic function np.sign()

        np.sign() is the sign function in Python's NumPy library: it returns the sign of each number (element-wise):

        Reference:[Python] Introduction and usage of the symbolic function sign() of the Numpy library_python sign-CSDN Blog 

\mathrm{sign}(x) = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0 \end{cases}

import numpy as np # import the numpy library

data = [-0.8, -1.1, 0, 2.3, 4.5]
print("Input data:", data)

# Use numpy's sign(x) to get the sign of every input element
signResult = np.sign(data)
print("Signs from np.sign:", signResult)

4.8 Operation of ndarray multi-dimensional array in numpy library: np.abs()

        np.abs() computes the absolute value of every element in an array (the elements are processed in parallel) and returns an ndarray.

        Reference:Don’t be confused between Python numpy.abs and abs functions_Little Bear Loves Milk’s Blog-CSDN Blog

        Note that np.abs() and the built-in abs() should not be confused: abs() handles a single value, and its return type matches the input (e.g. int or float).

4.9 Introduction to np.log1p for data smoothing 

        During data preprocessing, the log1p function can be applied to heavily skewed data to make it closer to a Gaussian distribution, which helps obtain better results later. log1p compresses the data into a smaller range, similar in spirit to normalization.

        Note: since log1p compresses the data, the smoothed values (e.g. the final predictions) must eventually be restored with expm1, the inverse operation of log1p.

log1p(x) = log(x + 1), i.e. ln(x + 1)

expm1(x) = exp(x) - 1

        Reference:Data smoothing - log1p() and exmp1()-CSDN Blog
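
        The three functions from sections 4.7 to 4.9 combine naturally for smoothing signed seismic amplitudes: take the sign with np.sign(), compress the magnitude with np.log1p(np.abs(...)), and invert with np.expm1(). The sketch below illustrates this combination; it is an assumption about what T.LogTransform does, not the verified OpenFWI code:

import numpy as np

def log_transform(x, k=1.0):
    # Assumed form: compress the magnitude with log1p while preserving the sign
    return np.sign(x) * np.log1p(np.abs(k * x))

def inverse_log_transform(y, k=1.0):
    # expm1 undoes log1p; divide by k and restore the sign
    return np.sign(y) * np.expm1(np.abs(y)) / k

x = np.array([-100., -1., 0., 1., 100.])
y = log_transform(x)
print(np.allclose(inverse_log_transform(y), x))  # True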

4.10 Introduction to DistributedSampler()

         DistributedSampler() lives in torch.utils.data and is typically used for distributed training on a single machine with multiple GPUs (or multiple machines with multiple GPUs); for testing it also ensures that the data set is loaded in a fixed order. To use it, first instantiate a DistributedSampler and pass the object to the sampler parameter of torch.utils.data.DataLoader(); the DataLoader then has distributed sampling capability. Taking single-machine multi-GPU training as an example, if there are N GPUs in the environment, the whole data set is split into N shares and each GPU gets its own share: samples per GPU per epoch = total samples / num_gpus, and iterations per GPU per epoch = samples per GPU / batch_size.

from torch.utils.data import DistributedSampler,DataLoader
from torchvision import datasets

dataset = datasets.ImageFolder(data_path, transform)
sampler = DistributedSampler(dataset)
loader = DataLoader(
                    dataset = dataset,
                    sampler = sampler,
                   )
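
        One usage detail the snippet above does not show: when shuffle=True, DistributedSampler needs set_epoch() to be called at the start of every epoch so that each epoch uses a different shuffling order:

for epoch in range(num_epochs):
    sampler.set_epoch(epoch)  # reseeds the shuffle so every epoch sees a different order
    for batch in loader:
        ...  # training step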

         The constructor of DistributedSampler is as follows:

def __init__(self, dataset: Dataset, num_replicas: Optional[int] = None,
                 rank: Optional[int] = None, shuffle: bool = True,
                 seed: int = 0, drop_last: bool = False) -> None:
        if num_replicas is None:
            if not dist.is_available():
                raise RuntimeError("Requires distributed package to be available")
            num_replicas = dist.get_world_size()
        if rank is None:
            if not dist.is_available():
                raise RuntimeError("Requires distributed package to be available")
            rank = dist.get_rank()
        if rank >= num_replicas or rank < 0:
            raise ValueError(
                "Invalid rank {}, rank should be in the interval"
                " [0, {}]".format(rank, num_replicas - 1))
        self.dataset = dataset
        self.num_replicas = num_replicas
        self.rank = rank
        self.epoch = 0
        self.drop_last = drop_last
        # If the dataset length is evenly divisible by # of replicas, then there
        # is no need to drop any data, since the dataset will be split equally.
        if self.drop_last and len(self.dataset) % self.num_replicas != 0:  # type: ignore[arg-type]
            # Split to nearest available length that is evenly divisible.
            # This is to ensure each rank receives the same amount of data when
            # using this Sampler.
            self.num_samples = math.ceil(
                (len(self.dataset) - self.num_replicas) / self.num_replicas  # type: ignore[arg-type]
            )
        else:
            self.num_samples = math.ceil(len(self.dataset) / self.num_replicas)  # type: ignore[arg-type]
        self.total_size = self.num_samples * self.num_replicas
        self.shuffle = shuffle
        self.seed = seed

where:

  1. dataset: of type torch.utils.data.Dataset, the object the sampler works on;
  2. num_replicas: how many shares to split the data set into; the default is None and is resolved later in the constructor;
  3. rank: the rank that this sampler serves; in a single-machine multi-GPU environment it is the index of the GPU. The default is None and is resolved later in the constructor;
  4. shuffle: whether to shuffle the order of the data;
  5. seed: the random seed used for shuffling;
  6. drop_last: whether to drop the last, incomplete share of data;

        Note: for the test set the DistributedSampler should use shuffle=False, while for the training set it should keep shuffle=True (the default). In the DataLoader itself, shuffle must remain False (the default) for both the training and the test data set, because the sampler already handles the sampling; setting shuffle=True together with a sampler conflicts and raises an error. Without DDP, the shuffle parameter of the training DataLoader should be True and that of the test DataLoader should be False.

         Reference (to be understood in depth):Pytorch - DistributedSampler source code analysis (1) - Zhihu (zhihu.com)

4.11 Understanding RandomSampler()

        RandomSampler() means randomly sampling data.

torch.utils.data.RandomSampler(data_source, replacement=False, num_samples=None, generator=None)

 where:

  1. data_source: the data set to sample from;
  2. replacement: the sampling strategy. If True, sampling is done with replacement, so a sample can be drawn more than once; if False, sampling is without replacement, so each sample is drawn at most once;
  3. num_samples: the number of samples to draw; by default all samples are used. The number can only be specified (modified) when replacement is True; when replacement is False this parameter cannot be changed;
  4. generator: the generator used during sampling;
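
        A small example shows that iterating over a RandomSampler yields a random permutation of the dataset indices (a toy dataset is used here):

import torch
from torch.utils.data import RandomSampler, TensorDataset

dataset = TensorDataset(torch.arange(10))
sampler = RandomSampler(dataset)  # replacement=False: every index is drawn exactly once
print(list(sampler))              # e.g. [3, 7, 0, 9, 2, 5, 1, 8, 6, 4]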

        Reference:PyTorch study notes: data.RandomSampler - data random sampling_pytorch random sampling_visual cute blog-CSDN blog

4.12 Introduction to argparse module

        argparse is a Python module: a parser for command-line options, arguments, and sub-commands.

        Usage:

        ① Create a parser: an ArgumentParser object holds all the information needed to parse the command line into Python data types;

parser = argparse.ArgumentParser(description='Process some integers.')
class argparse.ArgumentParser(prog=None, usage=None, description=None, epilog=None, parents=[], formatter_class=argparse.HelpFormatter, prefix_chars='-', fromfile_prefix_chars=None, argument_default=None, conflict_handler='error', add_help=True, allow_abbrev=True)

 where:

  • prog: the program name (default: sys.argv[0]);
  • usage: a string describing the program usage (default: generated from the arguments added to the parser);
  • description: text displayed before the argument help (default: None), describing what the program does; it is shown between the command-line usage string and the help messages for the individual arguments;
  • epilog: text displayed after the argument help (default: None);
  • parents: a list of ArgumentParser objects whose arguments should also be included;
  • formatter_class: a class for customizing the help output format;
  • prefix_chars: the set of characters that prefix optional arguments (default: '-');
  • fromfile_prefix_chars: the set of characters that prefix files from which additional arguments should be read (default: None);
  • argument_default: the global default value for arguments (default: None);
  • conflict_handler: the strategy for resolving conflicting optionals (usually unnecessary);
  • add_help: add a -h/--help option to the parser (default: True);
  • allow_abbrev: allow long options to be abbreviated if the abbreviation is unambiguous (default: True).

        ② Add arguments: call the add_argument() method to add the program's argument definitions to the ArgumentParser;

# This effectively adds an 'integers' attribute; args.integers can be printed later
parser.add_argument('integers', metavar='N', type=int, nargs='+', help='an integer for the accumulator')
ArgumentParser.add_argument(name or flags...[, action][, nargs][, const][, default][, type][, choices][, required][, help][, metavar][, dest])

 where:

  • name or flags: a name or a list of option strings, such as foo or -f, --foo (a plain name defines a positional argument; adding a '-' or '--' prefix turns it into an optional argument);
  • action: the basic type of action to take when this argument is encountered on the command line;
  • nargs: the number of command-line arguments that should be consumed;
  • const: a constant required by some action and nargs selections;
  • default: the value used if the argument is absent from the command line;
  • type: the type the command-line argument should be converted to;
  • choices: a container of the allowable values for the argument;
  • required: whether the command-line option may be omitted (optional arguments only);
  • help: a brief description of what the argument does;
  • metavar: a name for the argument as shown in usage messages;
  • dest: the attribute name added to the object returned by parse_args();

        ③ Collect the parameters into an args instance: everything registered with add_argument() on the parser is returned as attributes of the args namespace instance, so the attributes added to the parser can be used directly through args.

args = parser.parse_args()

        ④ Parse the arguments: the ArgumentParser parses them with the parse_args() method;

>>> parser.parse_args(['--sum', '7', '-1', '42'])
Namespace(accumulate=<built-in function sum>, integers=[7, -1, 42])
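
        Putting the four steps together gives a complete, runnable script (this is the standard accumulator example from the argparse documentation):

import argparse

parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')

args = parser.parse_args(['--sum', '7', '-1', '42'])
print(args.integers)                   # [7, -1, 42]
print(args.accumulate(args.integers))  # 48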

        Reference: ① Python lecture room parse_args() detailed explanation_parser.parse_args - CSDN blog; ② argparse.ArgumentParser() usage analysis - CSDN blog; ③ Python's parser.add_argument() usage - command line options, arguments and sub-command parser - CSDN Blog

4.13 Introduction to os.path.join() function

        The os.path.join() function joins path components into a file path; multiple paths can be passed in.

  • If none of the arguments begins with "/", the function joins them and inserts the separators automatically;
  • If an argument begins with "/", joining starts from the last such argument and all earlier arguments are discarded;
  • If there are arguments beginning with both "\" and "/", the "/" arguments take priority: joining starts from the last argument beginning with "/" and earlier arguments are discarded;
  • If there are only arguments beginning with "./", joining starts from the argument before the "./" argument (see the examples below);
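
        A few concrete cases on a POSIX system illustrate these rules (the paths are only examples):

import os.path

print(os.path.join("output", "model", "ckpt.pth"))  # output/model/ckpt.pth - separators are added automatically
print(os.path.join("a", "/b", "c"))                 # /b/c - everything before the absolute component is dropped
print(os.path.join("a", "/b", "/c", "d"))           # /c/d - joining restarts at the last absolute component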

        Reference:Detailed explanation of the usage of os.path.join() function_os.path.join function-CSDN blog

4.14 Compose() function

        In PyTorch, Compose is a class in the torchvision.transforms module that groups several image-preprocessing operations together so that they are applied in sequence in a deep-learning task. It is typically used to preprocess input data, such as images, before they are fed to a neural network model.

import torch
from torchvision.transforms import Compose

# Compose builds an image-preprocessing pipeline; its argument is a list of the transform operations to apply
# (T below refers to the project's own transforms module providing LogTransform and MinMaxNormalize)
transform_data = Compose([
    # Smooth the data with a log transform
    T.LogTransform(k=args.k),
    # Normalize the data
    T.MinMaxNormalize(T.log_transform(ctx['data_min'], k=args.k), T.log_transform(ctx['data_max'], k=args.k))
    ])

        The Compose class iterates over the transform operations in its transforms list. Part of its implementation looks like this:

def __call__(self, img):
    for t in self.transforms:   
        img = t(img)
    return img

        Reference:【pytorch】Transforms.Compose() usage - Zhihu (zhihu.com) 

5. Summary

5.1 Existing doubts

  1. I was really confused by how the various functions call one another.
  2. The principle behind the warm-up strategy
  3. Understanding Python classes
  4. Understanding SmoothedValue and MetricLogger
  5. Content loss and style transfer: Neural network style transfer Pytorch_image.to(device) - CSDN Blog

5.2 Schedule for next week

  1. Try out the TensorBoard visualization
  2. Learn and understand: [Python] In-depth understanding of self, cls, __call__, __new__, __init__, __del__, __str__, __class__, __doc__, etc._python cls - CSDN Blog
  3. Finish reading the OpenFWI paper
  4. Finish running the OpenFWI code

(A frustrating moment this week: pressing Ctrl+Z took my draft back to the version from the night before. I did not realize I could restore it from the history, and I felt pretty silly about it.)
