Keras FAQ: Frequently asked questions

How to cite Keras?

If Keras contributes to your research, please cite it in your publications. Here is an example BibTeX entry:

@misc{chollet2015keras,
  title={Keras},
  author={Chollet, Fran\c{c}ois and others},
  year={2015},
  publisher={GitHub},
  howpublished={\url{https://github.com/keras-team/keras}},
}

How to run Keras on GPU?

If you are running on the TensorFlow or CNTK backend, your code will automatically run on a GPU as long as one is detected.
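
To verify that a GPU is actually visible to the backend, you can query TensorFlow directly. This is a minimal sketch assuming a TensorFlow 1.x installation, where tf.test.is_gpu_available() is available:

import tensorflow as tf

# Prints True if TensorFlow can see at least one GPU (TensorFlow 1.x API).
print(tf.test.is_gpu_available())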

If you are running on the Theano backend, you can use one of the following methods:

Method 1: Use Theano flags.

THEANO_FLAGS=device=gpu,floatX=float32 python my_keras_script.py

"gpu" may need to be changed according to your device identifier (eg gpu0, gpu1, etc.).

Method 2: Create a .theanorc configuration file (see the Theano configuration instructions).
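
The .theanorc file lives in your home directory. A minimal sketch of such a file, assuming you simply want the same settings as the flags above, might look like this:

[global]
device = gpu
floatX = float32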

Method 3: Manually set theano.config.device and theano.config.floatX at the beginning of your code:

import theano

# Select the GPU device and use float32 as the default float type.
theano.config.device = 'gpu'
theano.config.floatX = 'float32'

How to run Keras model on multiple GPUs?

We recommend using the TensorFlow backend for this. There are two ways to run a single model on multiple GPUs: data parallelism and device parallelism.

In most cases, what you need is data parallelism.

Data parallelism

Data parallelism consists of replicating the target model once on each device and using each replica to process a different part of the input data. Keras has a built-in utility, keras.utils.multi_gpu_model, which can produce a data-parallel version of any model and achieves quasi-linear speedup on up to 8 GPUs.

For more information, see the documentation for multi_gpu_model. Here is a quick example:

from keras.utils import multi_gpu_model

# Replicate `model` onto 8 GPUs.
# This assumes your machine has 8 available GPUs.
parallel_model = multi_gpu_model(model, gpus=8)
parallel_model.compile(loss='categorical_crossentropy',
                       optimizer='rmsprop')

# This `fit` call will be distributed across the 8 GPUs.
# Since the batch size is 256, each GPU will process 32 samples.
parallel_model.fit(x, y, epochs=20, batch_size=256)
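
One practical note: according to the multi_gpu_model documentation, you should call save() on the template model (the one you passed to multi_gpu_model) rather than on the parallel model, so that the saved weights can be reloaded regardless of the number of GPUs. The file name below is just an example:

# Save via the template model, not the parallel model returned by multi_gpu_model.
model.save('my_model.h5')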

Device parallelism

Device parallelism consists of running different parts of the same model on different devices. It works best for models that have a parallel architecture, such as a model with two branches.

This parallelism can be achieved by using TensorFlow device scopes. Here is a simple example:

import keras
import tensorflow as tf

# A shared LSTM is used to encode two different sequences in parallel
input_a = keras.Input(shape=(140, 256))
input_b = keras.Input(shape=(140, 256))

shared_lstm = keras.layers.LSTM(64)

# Process the first sequence on one GPU
with tf.device('/gpu:0'):
    encoded_a = shared_lstm(input_a)
# Process the next sequence on another GPU
with tf.device('/gpu:1'):
    encoded_b = shared_lstm(input_b)

# Concatenate the results on the CPU
with tf.device('/cpu:0'):
    merged_vector = keras.layers.concatenate([encoded_a, encoded_b],
                                             axis=-1)
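
To train on this merged representation, you would build a keras.Model over both inputs as usual. The output layer and loss below are illustrative assumptions, not part of the original example:

# Hypothetical classification head on top of the merged encoding.
predictions = keras.layers.Dense(1, activation='sigmoid')(merged_vector)

model = keras.Model(inputs=[input_a, input_b], outputs=predictions)
model.compile(optimizer='rmsprop', loss='binary_crossentropy')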