I am building a neural network in Keras that includes multiple LSTM, Permute and Dense layers.
It seems LSTM is GPU-unfriendly, so after some research I placed it on the CPU like this:
    with tf.device('/cpu:0'):
        out = LSTM(cells)(inp)
But as I understand it, a with statement is essentially a try...finally block that ensures clean-up code is executed. I don't know whether the following code, which mixes CPU and GPU, works. Will it speed up training?
    with tf.device('/cpu:0'):
        out = LSTM(cells)(inp)
    with tf.device('/gpu:0'):
        out = Permute(some_shape)(out)
    with tf.device('/cpu:0'):
        out = LSTM(cells)(out)
    with tf.device('/gpu:0'):
        out = Dense(output_size)(out)
Answer
tf.device is a context manager that switches the default device to the one passed as its argument, for the duration of the context (block) it creates. So this code should run everything placed under '/cpu:0' on the CPU and the rest on the GPU.
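To make the link between with, try...finally and a "default device" concrete, here is a toy sketch in plain Python. It is not TensorFlow's implementation: the names device, current_device and make_layer are hypothetical, and the initial default of '/gpu:0' is an assumption chosen to mirror the situation described above.

```python
import contextlib

# Hypothetical stand-in for the framework's state; '/gpu:0' is an assumed default.
_device_stack = ["/gpu:0"]


def current_device():
    """Return the device that newly created layers would be placed on."""
    return _device_stack[-1]


@contextlib.contextmanager
def device(name):
    """Toy device scope: make `name` the default inside the with block."""
    _device_stack.append(name)
    try:
        yield
    finally:
        # Runs even if the block raises, restoring the previous default.
        _device_stack.pop()


def make_layer(kind):
    # A "layer" here only records the default device at creation time.
    return (kind, current_device())


with device("/cpu:0"):
    lstm = make_layer("LSTM")   # created inside the scope
dense = make_layer("Dense")     # created after the scope has exited

print(lstm)
print(dense)
```

The try...finally part only restores the previous default on exit. The useful effect of the block is the temporary switch that happens on entry.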
Whether this will speed up your training is really hard to answer because it depends on the machine you use. I don't expect the computations to be faster, though, because each change of device forces data to be copied between GPU RAM and machine RAM. This could even slow down your computations.
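As a rough way to see how many such copies a layout implies, the sketch below counts the boundaries where two consecutive layers sit on different devices. The helper count_device_switches is hypothetical, and the model is deliberately crude: it ignores the copy of the input batch and the gradients that have to cross the same boundaries during the backward pass.

```python
def count_device_switches(placements):
    """Count boundaries where consecutive layers sit on different devices."""
    return sum(
        1 for prev, cur in zip(placements, placements[1:]) if prev != cur
    )


# Layout from the question: LSTM, Permute, LSTM, Dense.
mixed = ["/cpu:0", "/gpu:0", "/cpu:0", "/gpu:0"]
# Same four layers, all left on one device.
single = ["/gpu:0"] * 4

print(count_device_switches(mixed))
print(count_device_switches(single))
```

The alternating layout crosses a device boundary between every pair of layers, while the single-device layout crosses none. The count says nothing about how expensive each copy is on a given machine, so timing both variants is the only reliable check.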