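We moved into a new office a couple of days ago and the network is flaky, so I'm making do.
I can't reach cnblogs (博客园), and now GitHub is unreachable as well,
so my productivity has dropped quite a bit.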
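Today a colleague ran into a problem with a chatbot project built on rasa.
The machine has a Tesla K80 setup exposing four GPU devices,
so performance should be decent, yet running rasa train, rasa run, or rasa shell
would fail at unpredictable moments with the following error: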
2019-11-06 07:08:28 WARNING root - Sequence length will auto set at 95% of sequence length
2019-11-06 07:08:37.627154: F tensorflow/stream_executor/cuda/cuda_driver.cc:175] Check failed: err == cudaSuccess || err == cudaErrorInvalidValue Unexpected CUDA error: out of memory
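Checking with nvidia-smi afterwards showed it was a classic problem: when many projects all run on the same GPU device,
that device easily runs out of memory.
So I changed the GPU configuration used by the kashgari library, and that fixed it.
The change below makes the TensorFlow instance that kashgari runs on use a specified GPU, the fourth one (index 3), instead of the default one.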
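The nvidia-smi check can also be scripted. Below is a minimal sketch, assuming nvidia-smi is on the PATH and using its CSV query mode. The helper names (parse_gpu_memory, pick_least_used, query_gpus) are made up for illustration and are not part of rasa or kashgari.

```python
import subprocess

# nvidia-smi query that prints one "index, used, total" row per GPU (values in MiB).
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' rows into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        gpus.append({"index": int(index), "used": int(used), "total": int(total)})
    return gpus


def pick_least_used(gpus):
    """Return the index of the GPU with the most free memory."""
    return max(gpus, key=lambda g: g["total"] - g["used"])["index"]


def query_gpus():
    """Run nvidia-smi and parse its output (needs an NVIDIA driver installed)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


# Hypothetical readings: three crowded devices and one nearly idle device.
sample = "0, 11000, 11441\n1, 10500, 11441\n2, 9000, 11441\n3, 12, 11441\n"
print(pick_least_used(parse_gpu_memory(sample)))
```

On a real machine, pick_least_used(query_gpus()) would give a candidate index to pin the process to. The sample numbers above are invented and are not readings from the machine in this post.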
site-packages/kashgari/__init__.py
# encoding: utf-8
"""
@author: BrikerMan
@contact: [email protected]
@blog: https://eliyar.biz
@version: 1.0
@license: Apache Licence
@file: __init__.py
@time: 2019-05-17 11:15
"""
import os
os.environ['TF_KERAS'] = '1'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
# Added line: expose only the fourth GPU (index 3) to TensorFlow
os.environ['CUDA_VISIBLE_DEVICES'] = "3"

import tensorflow as tf
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)

import keras_bert
from kashgari.macros import TaskType, config

custom_objects = keras_bert.get_custom_objects()
CLASSIFICATION = TaskType.CLASSIFICATION
LABELING = TaskType.LABELING

from kashgari.version import __version__
from kashgari import layers
from kashgari import corpus
from kashgari import embeddings
from kashgari import macros
from kashgari import processors
from kashgari import tasks
from kashgari import utils
from kashgari import callbacks
from kashgari import migeration

migeration.show_migration_guide()
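Editing a file under site-packages works, but the change is lost when the package is reinstalled. As an alternative, here is a minimal sketch that sets the same variable from your own entry script. It is not what I actually did above, and pin_gpu is a hypothetical helper name. The key assumption is that it runs before tensorflow (or kashgari) is imported for the first time, since CUDA reads the variable when it initializes.

```python
import os


def pin_gpu(device_index, env=None):
    """Restrict CUDA to a single physical GPU by setting CUDA_VISIBLE_DEVICES.

    Call this before importing tensorflow or kashgari; setting the variable
    after CUDA has been initialized has no effect on the running process.
    """
    env = os.environ if env is None else env
    # Number devices in PCI bus order so the index matches what nvidia-smi shows.
    env.setdefault("CUDA_DEVICE_ORDER", "PCI_BUS_ID")
    env["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return env["CUDA_VISIBLE_DEVICES"]


pin_gpu(3)  # the fourth GPU, same index as in the patched __init__.py

# Only now import the GPU libraries, for example:
# import kashgari
```

Inside the process the selected card then shows up as device 0, because it is the only one visible. The variable can also be set in the shell that launches rasa, which avoids touching any Python code at all.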