torch distributed.init out of memory

Enterprise Development 2022-05-13 00:01:02

Set the GPUs that are visible to the process:

import os

import torch

# Set this before the first CUDA call so the restriction takes effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "1,2,3"

local_rank = 0
torch.cuda.set_device(local_rank)

By default, cuda(0) refers to physical GPU 0.

Once CUDA_VISIBLE_DEVICES is set, however, cuda(0) refers to the first GPU listed in CUDA_VISIBLE_DEVICES.
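
The remapping described above can be modeled in a few lines of plain Python. This is only an illustrative sketch: `physical_gpu_for` is a hypothetical helper, not part of PyTorch or CUDA, and it assumes the variable holds a comma-separated list of numeric GPU indices (the real variable also accepts GPU UUIDs, which this sketch ignores).

```python
from typing import Optional


def physical_gpu_for(logical_index: int, visible_devices: Optional[str]) -> int:
    """Model which physical GPU a logical device index refers to.

    Hypothetical helper for illustration only; it mimics the remapping
    applied by CUDA_VISIBLE_DEVICES for numeric GPU indices.
    """
    if visible_devices is None:
        # Variable not set: logical and physical indices coincide.
        return logical_index
    # Split the list and tolerate stray spaces around each entry.
    ids = [int(part.strip()) for part in visible_devices.split(",") if part.strip()]
    return ids[logical_index]


# With the setting used above, logical device 0 is physical GPU 1.
first = physical_gpu_for(0, "1,2,3")
last = physical_gpu_for(2, "1,2,3")
```

Under this model, `torch.cuda.set_device(0)` with CUDA_VISIBLE_DEVICES set to "1,2,3" selects physical GPU 1, and physical GPU 0 is not reachable from the process at all.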

distributed.init fails with an out-of-memory error
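
One arrangement that is often suggested for this situation is to pin each process to its own GPU before the process group is created, so that the processes do not all end up allocating memory on the same device. The sketch below assumes the script is started by a launcher such as torchrun that sets LOCAL_RANK, RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT. `resolve_local_rank` and `setup_distributed` are hypothetical names, and the original post does not confirm that this removes the error in every setup.

```python
import os


def resolve_local_rank(env=None) -> int:
    """Read LOCAL_RANK from the environment, defaulting to 0 (hypothetical helper)."""
    env = os.environ if env is None else env
    return int(env.get("LOCAL_RANK", 0))


def setup_distributed(backend: str = "nccl") -> int:
    """Pin this process to one GPU, then create the process group.

    Assumes the launcher has exported the rendezvous variables that
    init_method="env://" reads (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE).
    """
    # Imported here so the helper above can be used without PyTorch installed.
    import torch
    import torch.distributed as dist

    local_rank = resolve_local_rank()
    # Select the device first; the index is relative to CUDA_VISIBLE_DEVICES.
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend=backend, init_method="env://")
    return local_rank
```

A training script would call `setup_distributed()` once at startup and then move its model to `torch.device("cuda", local_rank)`. The script from the post follows.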

import argparse
import logging
import os
import time

import torch
import torch.distributed as dist
import torch.nn.functional as F
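import torch.utils.data.distributed


def main(args):
    try:
        # Started by a distributed launcher: read the rendezvous settings from the environment.
        world_size = int(os.environ['WORLD_SIZE'])
        rank = int(os.environ['RANK'])
        dist_url = "tcp://{}:{}".format(os.environ["MASTER_ADDR"], os.environ["MASTER_PORT"])
    except KeyError:
        # A variable is missing: fall back to a single local process.
        world_size = 1
        rank = 0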
        dist_url = "tcp://127.0.0.1

Reposted from blog.csdn.net/jacke121/article/details/124748293
