Recording a TensorFlow problem (continued). My initial guess is that this is a CUDA version compatibility problem; I will work through it slowly when I have time.

Background:

Operating system: Windows 10
Graphics card: NVIDIA RTX 3070
CUDA: cuda_10.0.130_411.31_win10
cuDNN: cudnn-10.0-windows10-x64-v7.4.2.24
Anaconda: Anaconda3-5.3.1-Windows-x86_64 (bundles Python 3.7)
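
The logs further down report compute capability 8.6 for the RTX 3070, while the toolkit listed here is CUDA 10.0. According to NVIDIA's release notes, native support for compute capability 8.6 first appeared in CUDA 11.1, which is consistent with the version-compatibility guess. The following is a minimal sketch of that check; the table and the helper names `parse_cuda_version` and `toolkit_supports` are illustrative assumptions, not part of any library.

```python
# Minimum CUDA toolkit release that can target each compute capability.
# Illustrative table only (assumption: values taken from NVIDIA release notes).
MIN_CUDA_FOR_CAPABILITY = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere (A100)
    (8, 6): (11, 1),  # Ampere (GeForce RTX 30 series)
}


def parse_cuda_version(installer_name):
    """Extract (major, minor) from a name like 'cuda_10.0.130_411.31_win10'."""
    version_field = installer_name.split("_")[1]  # e.g. '10.0.130'
    major, minor = version_field.split(".")[:2]
    return int(major), int(minor)


def toolkit_supports(capability, cuda_version):
    """Return True if the toolkit version is new enough for the capability."""
    required = MIN_CUDA_FOR_CAPABILITY.get(capability)
    if required is None:
        raise KeyError("capability %r is not in the illustrative table" % (capability,))
    return cuda_version >= required


if __name__ == "__main__":
    cuda = parse_cuda_version("cuda_10.0.130_411.31_win10")
    print(cuda, toolkit_supports((8, 6), cuda))
```

With the installer name from the list above, the check reports that CUDA 10.0 is older than the release required for an 8.6 device.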

Installation succeeded.
Test applet 1:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

The output is as follows:

C:\Users\86157\Anaconda3\python.exe D:/tensorflow安装测试2.py
2021-07-14 11:07:25.266124: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-07-14 11:07:25.270135: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2021-07-14 11:07:25.294260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce RTX 3070 major: 8 minor: 6 memoryClockRate(GHz): 1.725
pciBusID: 0000:01:00.0
2021-07-14 11:07:25.294448: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2021-07-14 11:07:25.294624: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 5368899334305722603
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 6878960640
locality {
  bus_id: 1
  links {
  }
}
incarnation: 6692056173094645463
physical_device_desc: "device: 0, name: GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6"
]
2021-07-14 11:08:14.609377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-14 11:08:14.609521: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2021-07-14 11:08:14.609599: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2021-07-14 11:08:14.609829: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 6560 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6)

Process finished with exit code 0
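
The key fact in this output is the `compute capability: 8.6` field of the device description. When comparing several machines or installs, it can help to pull that field out of the log automatically. This is a small sketch using only the standard library; the function name `parse_device_desc` and the returned dictionary keys are assumptions made for illustration.

```python
import re

# Matches the "name: ..., pci bus id: ..., compute capability: X.Y" part
# of a TensorFlow device description line.
DEVICE_DESC = re.compile(
    r"name: (?P<name>[^,]+), pci bus id: (?P<bus>[0-9a-fA-F:.]+), "
    r"compute capability: (?P<major>\d+)\.(?P<minor>\d+)"
)


def parse_device_desc(line):
    """Return name, PCI bus id and compute capability, or None if absent."""
    match = DEVICE_DESC.search(line)
    if match is None:
        return None
    return {
        "name": match.group("name").strip(),
        "pci_bus_id": match.group("bus"),
        "compute_capability": (int(match.group("major")), int(match.group("minor"))),
    }


if __name__ == "__main__":
    line = ('physical_device_desc: "device: 0, name: GeForce RTX 3070, '
            'pci bus id: 0000:01:00.0, compute capability: 8.6"')
    print(parse_device_desc(line))
```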

Test applet 2:

import os
# Set the log level before importing TensorFlow so it applies from the first log message
os.environ["TF_CPP_MIN_LOG_LEVEL"]="99"
from tensorflow.python.client import device_lib

if __name__=="__main__":
    print(device_lib.list_local_devices())

The output is as follows: [screenshot]

Program 1 (test succeeded)

import tensorflow as tf
#import os
#os.environ['CUDA_VISIBLE_DEVICES']='0' # use GPU 0

v=tf.Variable([1,2,3])
print(v)

with tf.Session() as sess:
    sess.run(v.initializer)
    print(sess.run(v))

The output is as follows:

C:\Users\86157\Anaconda3\python.exe C:/Users/86157/AppData/Roaming/JetBrains/PyCharmCE2020.2/scratches/scratch_2.py
<tf.Variable 'Variable:0' shape=(3,) dtype=int32_ref>
WARNING:tensorflow:From C:/Users/86157/AppData/Roaming/JetBrains/PyCharmCE2020.2/scratches/scratch_2.py:10: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2021-07-14 10:44:39.300239: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2021-07-14 10:44:39.330485: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce RTX 3070 major: 8 minor: 6 memoryClockRate(GHz): 1.725
pciBusID: 0000:01:00.0
2021-07-14 10:44:39.330672: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2021-07-14 10:44:39.330832: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2021-07-14 10:44:39.331129: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-07-14 10:44:39.335371: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce RTX 3070 major: 8 minor: 6 memoryClockRate(GHz): 1.725
pciBusID: 0000:01:00.0
2021-07-14 10:44:39.335580: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2021-07-14 10:44:39.335727: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
[1 2 3]
2021-07-14 10:45:08.862092: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-14 10:45:08.862229: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2021-07-14 10:45:08.862307: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2021-07-14 10:45:08.862507: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6684 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6)

Process finished with exit code 0

Program 2 (test succeeded)


import tensorflow as tf  # used to build the neural network
import numpy as np  # used to build data structures and process data
from time import time

# Define a layer
def add_layer(inputs, in_size, out_size, activation_function=None):
    # inputs is the layer input, in_size is the number of neurons in the previous layer,
    # out_size is the number of neurons in this layer, activation_function is the activation
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    # Random initial weights work better; [in_size, out_size] is the weight shape
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    # Bias
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    # matmul is matrix multiplication
    if activation_function is None:
        outputs = Wx_plus_b  # no activation function: pass the values through unchanged
    else:
        outputs = activation_function(Wx_plus_b)
        # otherwise apply the activation function
    return outputs  # return the layer output


# Build some samples to train the network
x_data = np.linspace(-1, 1, 300)[:, np.newaxis]
# 300 evenly spaced values in [-1, 1]
noise = np.random.normal(0, 0.05, x_data.shape)
# Noise makes the data more realistic: normal with mean 0 and standard deviation 0.05, same shape as x_data
y_data = np.square(x_data) - 0.5 + noise
# y = x^2 - 0.5 + noise


# Placeholders feed data into the network; the 1 means a single feature, i.e. one-dimensional data
xs = tf.placeholder(tf.float32, [None, 1])
ys = tf.placeholder(tf.float32, [None, 1])
# add hidden layer
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# add output layer
prediction = add_layer(l1, 10, 1, activation_function=None)

loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),
                                    reduction_indices=[1]))
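# Cost function: reduce_mean takes the mean, reduce_sum takes the sum, reduction_indices is the axis to reduce over
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
# Pass the cost to gradient descent with learning rate 0.1; running this step updates the weights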
startTime1 = time()
# important step
# tf.initialize_all_variables() is no longer valid from
# 2017-03-02 if using tensorflow >= 0.12
# Variable initialization (compare major and minor version, not the minor alone)
if tuple(int(p) for p in tf.__version__.split('.')[:2]) < (0, 12):
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess = tf.Session()  # open a TensorFlow session
sess.run(init)  # run the variable initialization

for i in range(1000):  # 1000 gradient-descent iterations
    # training
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    # run one gradient-descent step, feeding the samples to the loss
    if i % 50 == 0:
        # print the cost every 50 iterations
        print(sess.run(loss, feed_dict={xs: x_data, ys: y_data}))

t1 = time() - startTime1
print('Time spent using the GPU:', t1)

The output is as follows:

2021-07-14 10:47:53.270411: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2021-07-14 10:47:53.301139: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce RTX 3070 major: 8 minor: 6 memoryClockRate(GHz): 1.725
pciBusID: 0000:01:00.0
2021-07-14 10:47:53.301329: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2021-07-14 10:47:53.301489: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2021-07-14 10:47:53.301789: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-07-14 10:47:53.305852: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce RTX 3070 major: 8 minor: 6 memoryClockRate(GHz): 1.725
pciBusID: 0000:01:00.0
2021-07-14 10:47:53.306040: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2021-07-14 10:47:53.306191: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2021-07-14 10:48:19.191315: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-14 10:48:19.191456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2021-07-14 10:48:19.191535: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2021-07-14 10:48:19.191729: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6684 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6)
0.21629128
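
In the script above, `startTime1` is taken before `tf.Session()` is created, so the printed time covers session startup and variable initialization as well as the 1000 training steps. In this log roughly 26 seconds pass between the "Adding visible gpu devices" line (10:47:53) and the "Created TensorFlow device" line (10:48:19), before the first loss value is printed. A sketch of timing the phases separately follows; the `timed` helper is a hypothetical name, and the `sleep` calls only stand in for the real TensorFlow work.

```python
from contextlib import contextmanager
from time import perf_counter, sleep


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the with-block in results[label]."""
    start = perf_counter()
    try:
        yield
    finally:
        results[label] = perf_counter() - start


if __name__ == "__main__":
    timings = {}
    with timed("startup", timings):
        sleep(0.05)  # stands in for tf.Session() and sess.run(init)
    with timed("training", timings):
        sleep(0.01)  # stands in for the 1000 sess.run(train_step, ...) calls
    for label, seconds in timings.items():
        print("%s: %.3f s" % (label, seconds))
```

Reporting the two phases separately makes it clear whether a slow run comes from device initialization or from the training loop itself.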

Program 3
(1)

import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()
# datetime is needed for timing
from datetime import datetime

argv = ("gpu", 300)
device_name = argv[0]  # Choose device from cmd line. Options: gpu or cpu
# device_name = "gpu"
shape = (int(argv[1]), int(argv[1]))
if device_name == "gpu":
    device_name = "/gpu:0"
else:
    device_name = "/cpu:0"

with tf.device(device_name):
    random_matrix = tf.random_uniform(shape=shape, minval=0, maxval=1)
    dot_operation = tf.matmul(random_matrix, tf.transpose(random_matrix))
    sum_operation = tf.reduce_sum(dot_operation)

startTime = datetime.now()
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as session:
    result = session.run(sum_operation)
    print(result)

# It can be hard to see the results on the terminal with lots of output -- add some newlines to improve readability.
print("\n" * 5)
print("Shape:", shape, "Device:", device_name)
print("Time taken:", datetime.now() - startTime)
print("\n" * 5)

The output is as follows:

C:\Users\86157\Anaconda3\python.exe C:/Users/86157/AppData/Roaming/JetBrains/PyCharmCE2020.2/scratches/scratch_1.py
WARNING:tensorflow:From C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\compat\v2_compat.py:61: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2021-07-14 10:52:17.451647: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-07-14 10:52:17.456305: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2021-07-14 10:52:17.480601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce RTX 3070 major: 8 minor: 6 memoryClockRate(GHz): 1.725
pciBusID: 0000:01:00.0
2021-07-14 10:52:17.480787: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2021-07-14 10:52:17.480947: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2021-07-14 10:53:23.121947: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-14 10:53:23.122130: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2021-07-14 10:53:23.122211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2021-07-14 10:53:23.122402: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6684 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6
random_uniform/RandomUniform: (RandomUniform): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform/sub: (Sub): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform/mul: (Mul): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform: (Add): /job:localhost/replica:0/task:0/device:GPU:0
transpose: (Transpose): /job:localhost/replica:0/task:0/device:GPU:0
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
Sum: (Sum): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform/shape: (Const): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform/min: (Const): /job:localhost/replica:0/task:0/device:GPU:0
random_uniform/max: (Const): /job:localhost/replica:0/task:0/device:GPU:0
transpose/perm: (Const): /job:localhost/replica:0/task:0/device:GPU:0
Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.126859: I tensorflow/core/common_runtime/direct_session.cc:296] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6

2021-07-14 10:53:23.127914: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/RandomUniform: (RandomUniform)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.128127: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/sub: (Sub)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.128311: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/mul: (Mul)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.128495: I tensorflow/core/common_runtime/placer.cc:54] random_uniform: (Add)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.128676: I tensorflow/core/common_runtime/placer.cc:54] transpose: (Transpose)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.128850: I tensorflow/core/common_runtime/placer.cc:54] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.129018: I tensorflow/core/common_runtime/placer.cc:54] Sum: (Sum)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.129196: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/shape: (Const)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.129387: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/min: (Const)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.129577: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/max: (Const)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.129764: I tensorflow/core/common_runtime/placer.cc:54] transpose/perm: (Const)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.129938: I tensorflow/core/common_runtime/placer.cc:54] Const: (Const)/job:localhost/replica:0/task:0/device:GPU:0
2021-07-14 10:53:23.580540: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
    return fn(*args)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: Blas GEMM launch failed : a.shape=(300, 300), b.shape=(300, 300), m=300, n=300, k=300
	 [[{{node ArithmeticOptimizer/FoldTransposeIntoMatMul_MatMul}}]]
  (1) Internal: Blas GEMM launch failed : a.shape=(300, 300), b.shape=(300, 300), m=300, n=300, k=300
	 [[{{node ArithmeticOptimizer/FoldTransposeIntoMatMul_MatMul}}]]
	 [[Sum/_1]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/86157/AppData/Roaming/JetBrains/PyCharmCE2020.2/scratches/scratch_1.py", line 25, in <module>
    result = session.run(sum_operation)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
    run_metadata_ptr)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
    run_metadata)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: Blas GEMM launch failed : a.shape=(300, 300), b.shape=(300, 300), m=300, n=300, k=300
	 [[{{node ArithmeticOptimizer/FoldTransposeIntoMatMul_MatMul}}]]
  (1) Internal: Blas GEMM launch failed : a.shape=(300, 300), b.shape=(300, 300), m=300, n=300, k=300
	 [[{{node ArithmeticOptimizer/FoldTransposeIntoMatMul_MatMul}}]]
	 [[Sum/_1]]
0 successful operations.
0 derived errors ignored.

Process finished with exit code 1
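
Until the GPU setup is fixed, one way to keep scripts usable is to try the GPU first and fall back to the CPU when the computation raises. The sketch below is framework-agnostic and only illustrates the control flow; `run_with_fallback` and `fake_matmul_sum` are hypothetical names. With TensorFlow 1.x one would presumably pass a function that builds a fresh graph under `tf.device(device)` and runs it in a new session, with `retry_on=(tf.errors.InternalError,)`; that wiring is an assumption and is not shown here.

```python
def run_with_fallback(compute, devices=("/gpu:0", "/cpu:0"), retry_on=(Exception,)):
    """Call compute(device) for each device in order.

    Returns (device, result) for the first call that succeeds and raises
    RuntimeError if every device fails with one of the retry_on exceptions.
    """
    errors = []
    for device in devices:
        try:
            return device, compute(device)
        except retry_on as exc:
            errors.append((device, exc))
    raise RuntimeError("all devices failed: %r" % (errors,))


if __name__ == "__main__":
    def fake_matmul_sum(device):
        # Hypothetical stand-in that fails on the GPU the way Program 3 does.
        if device.startswith("/gpu"):
            raise RuntimeError("Blas GEMM launch failed")
        return 6761302.0

    print(run_with_fallback(fake_matmul_sum))
```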

(2) Changing line 9 to cpu makes the program run normally:

C:\Users\86157\Anaconda3\python.exe C:/Users/86157/AppData/Roaming/JetBrains/PyCharmCE2020.2/scratches/scratch_1.py
WARNING:tensorflow:From C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\compat\v2_compat.py:61: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2021-07-14 10:55:26.948308: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-07-14 10:55:26.952414: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2021-07-14 10:55:26.976676: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce RTX 3070 major: 8 minor: 6 memoryClockRate(GHz): 1.725
pciBusID: 0000:01:00.0
2021-07-14 10:55:26.976862: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2021-07-14 10:55:26.977020: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2021-07-14 10:56:13.517681: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-14 10:56:13.517825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2021-07-14 10:56:13.517904: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2021-07-14 10:56:13.518094: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6684 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6
random_uniform/RandomUniform: (RandomUniform): /job:localhost/replica:0/task:0/device:CPU:0
random_uniform/sub: (Sub): /job:localhost/replica:0/task:0/device:CPU:0
random_uniform/mul: (Mul): /job:localhost/replica:0/task:0/device:CPU:0
random_uniform: (Add): /job:localhost/replica:0/task:0/device:CPU:0
transpose: (Transpose): /job:localhost/replica:0/task:0/device:CPU:0
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:CPU:0
Sum: (Sum): /job:localhost/replica:0/task:0/device:CPU:0
random_uniform/shape: (Const): /job:localhost/replica:0/task:0/device:CPU:0
random_uniform/min: (Const): /job:localhost/replica:0/task:0/device:CPU:0
random_uniform/max: (Const): /job:localhost/replica:0/task:0/device:CPU:0
transpose/perm: (Const): /job:localhost/replica:0/task:0/device:CPU:0
Const: (Const): /job:localhost/replica:0/task:0/device:CPU:0
6761302.0





2021-07-14 10:56:13.522171: I tensorflow/core/common_runtime/direct_session.cc:296] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6

2021-07-14 10:56:13.523042: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/RandomUniform: (RandomUniform)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.523255: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/sub: (Sub)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.523449: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/mul: (Mul)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.523644: I tensorflow/core/common_runtime/placer.cc:54] random_uniform: (Add)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.523832: I tensorflow/core/common_runtime/placer.cc:54] transpose: (Transpose)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.524015: I tensorflow/core/common_runtime/placer.cc:54] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.524187: I tensorflow/core/common_runtime/placer.cc:54] Sum: (Sum)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.524372: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/shape: (Const)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.524573: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/min: (Const)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.524769: I tensorflow/core/common_runtime/placer.cc:54] random_uniform/max: (Const)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.524963: I tensorflow/core/common_runtime/placer.cc:54] transpose/perm: (Const)/job:localhost/replica:0/task:0/device:CPU:0
2021-07-14 10:56:13.525146: I tensorflow/core/common_runtime/placer.cc:54] Const: (Const)/job:localhost/replica:0/task:0/device:CPU:0

Shape: (300, 300) Device: /cpu:0
Time taken: 0:00:46.600914
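
The reported 46.6 seconds is not the cost of a 300 x 300 matrix product on the CPU. In the log for this run, the "Adding visible gpu devices" line is stamped 10:55:26.977020 and the next line, "Device interconnect StreamExecutor ...", is stamped 10:56:13.517681, so almost all of the measured time passes while the session is still setting up the GPU device. One possible explanation, offered only as a guess, is that a TensorFlow build made for CUDA 10.0 has no precompiled kernels for compute capability 8.6 and the driver has to compile them at startup. The gap can be computed from the log lines with a short standard-library sketch; `log_time` and `gap_seconds` are illustrative names.

```python
from datetime import datetime


def log_time(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.ffffff' stamp of a TensorFlow log line."""
    return datetime.strptime(line[:26], "%Y-%m-%d %H:%M:%S.%f")


def gap_seconds(earlier_line, later_line):
    """Seconds elapsed between two timestamped log lines."""
    return (log_time(later_line) - log_time(earlier_line)).total_seconds()


if __name__ == "__main__":
    before = ("2021-07-14 10:55:26.977020: I tensorflow/core/common_runtime/gpu/"
              "gpu_device.cc:1763] Adding visible gpu devices: 0")
    after = ("2021-07-14 10:56:13.517681: I tensorflow/core/common_runtime/gpu/"
             "gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:")
    print("device setup gap: %.3f s" % gap_seconds(before, after))
```

The same calculation on the failing GPU run (10:52:17.480947 to 10:53:23.121947) gives a gap of about 65.6 seconds before the cuBLAS error appears.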

Program 4 (run failed): running experiment 7 in Jupyter

The output is as follows:

---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
~\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
   1355     try:
-> 1356       return fn(*args)
   1357     except errors.OpError as e:

~\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1340       return self._call_tf_sessionrun(
-> 1341           options, feed_dict, fetch_list, target_list, run_metadata)
   1342 

~\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1428         self._session, options, feed_dict, fetch_list, target_list,
-> 1429         run_metadata)
   1430 

InternalError: Blas GEMM launch failed : a.shape=(128, 784), b.shape=(128, 10), m=784, n=10, k=128
	 [[{{node gradients/MatMul_grad/MatMul_1}}]]

During handling of the above exception, another exception occurred:

InternalError                             Traceback (most recent call last)
<ipython-input-10-9fc11b71816d> in <module>()
     14         for _ in range(n_batches):
     15             X_batch, Y_batch = mnist.train.next_batch(batch_size)
---> 16             _, loss_batch = sess.run([optimizer, loss], feed_dict={X: X_batch, Y:Y_batch})
     17             total_loss += loss_batch
     18         print('Average loss epoch {0}: {1}'.format(i, total_loss/n_batches))

~\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
    948     try:
    949       result = self._run(None, fetches, feed_dict, options_ptr,
--> 950                          run_metadata_ptr)
    951       if run_metadata:
    952         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1171     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1172       results = self._do_run(handle, final_targets, final_fetches,
-> 1173                              feed_dict_tensor, options, run_metadata)
   1174     else:
   1175       results = []

~\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1348     if handle is None:
   1349       return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1350                            run_metadata)
   1351     else:
   1352       return self._do_call(_prun_fn, handle, feeds, fetches)

~\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
   1368           pass
   1369       message = error_interpolation.interpolate(message, self._graph)
-> 1370       raise type(e)(node_def, op, message)
   1371 
   1372   def _extend_graph(self):

InternalError: Blas GEMM launch failed : a.shape=(128, 784), b.shape=(128, 10), m=784, n=10, k=128
	 [[node gradients/MatMul_grad/MatMul_1 (defined at <ipython-input-9-baf368680732>:2) ]]

Errors may have originated from an input operation.
Input Source operations connected to node gradients/MatMul_grad/MatMul_1:
 X_placeholder (defined at <ipython-input-5-12a76c69b468>:2)

Original stack trace for 'gradients/MatMul_grad/MatMul_1':
  File "C:\Users\86157\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\86157\Anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\86157\Anaconda3\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "C:\Users\86157\Anaconda3\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
    app.start()
  File "C:\Users\86157\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 499, in start
    self.io_loop.start()
  File "C:\Users\86157\Anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 132, in start
    self.asyncio_loop.run_forever()
  File "C:\Users\86157\Anaconda3\lib\asyncio\base_events.py", line 523, in run_forever
    self._run_once()
  File "C:\Users\86157\Anaconda3\lib\asyncio\base_events.py", line 1758, in _run_once
    handle._run()
  File "C:\Users\86157\Anaconda3\lib\asyncio\events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 122, in _handle_events
    handler_func(fileobj, events)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tornado\stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\Users\86157\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 450, in _handle_events
    self._handle_recv()
  File "C:\Users\86157\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 480, in _handle_recv
    self._run_callback(callback, msg)
  File "C:\Users\86157\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 432, in _run_callback
    callback(*args, **kwargs)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tornado\stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\Users\86157\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "C:\Users\86157\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 233, in dispatch_shell
    handler(stream, idents, msg)
  File "C:\Users\86157\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "C:\Users\86157\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 208, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "C:\Users\86157\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 537, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "C:\Users\86157\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2662, in run_cell
    raw_cell, store_history, silent, shell_futures)
  File "C:\Users\86157\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2785, in _run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "C:\Users\86157\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2901, in run_ast_nodes
    if self.run_code(code, result):
  File "C:\Users\86157\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2961, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-9-baf368680732>", line 2, in <module>
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\training\optimizer.py", line 403, in minimize
    grad_loss=grad_loss)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\training\optimizer.py", line 512, in compute_gradients
    colocate_gradients_with_ops=colocate_gradients_with_ops)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py", line 158, in gradients
    unconnected_gradients)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_util.py", line 731, in _GradientsHelper
    lambda: grad_fn(op, *out_grads))
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_util.py", line 403, in _MaybeCompile
    return grad_fn()  # Exit early
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_util.py", line 731, in <lambda>
    lambda: grad_fn(op, *out_grads))
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\ops\math_grad.py", line 1388, in _MatMulGrad
    grad_b = gen_math_ops.mat_mul(a, grad, transpose_a=True)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6295, in mat_mul
    name=name)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()

...which was originally created as op 'MatMul', defined at:
  File "C:\Users\86157\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
[elided 22 identical lines from previous traceback]
  File "C:\Users\86157\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2961, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-9252bf743de0>", line 1, in <module>
    logits = tf.matmul(X, w) + b
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 2647, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6295, in mat_mul
    name=name)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "C:\Users\86157\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()
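
The shapes in the error message are worth a quick sanity check, because they rule out a bug in the model itself. The traceback shows the failing op is `gen_math_ops.mat_mul(a, grad, transpose_a=True)`, and the message reports a.shape=(128, 784), b.shape=(128, 10), m=784, n=10, k=128. The sketch below derives (m, n, k) from the operand shapes and transpose flags; `gemm_dims` is a hypothetical helper, not a TensorFlow or cuBLAS function.

```python
def gemm_dims(a_shape, b_shape, transpose_a=False, transpose_b=False):
    """Return (m, n, k) for op(a) @ op(b), where op optionally transposes a 2-D operand."""
    a_rows, a_cols = (a_shape[1], a_shape[0]) if transpose_a else a_shape
    b_rows, b_cols = (b_shape[1], b_shape[0]) if transpose_b else b_shape
    if a_cols != b_rows:
        raise ValueError("inner dimensions differ: %d vs %d" % (a_cols, b_rows))
    return a_rows, b_cols, a_cols


if __name__ == "__main__":
    # Gradient matmul from the Jupyter traceback: a is transposed.
    print(gemm_dims((128, 784), (128, 10), transpose_a=True))
    # Program 3: random_matrix times its own transpose.
    print(gemm_dims((300, 300), (300, 300), transpose_b=True))
```

Both cases have matching inner dimensions, so the multiplication is well formed and the failure happens when the GEMM is launched on the GPU, which fits the CUDA compatibility guess rather than a shape error in the notebook.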
