Question 1

What Is Keras?

Accepted Answer

Keras is a deep learning API that is based on the TensorFlow platform. It was designed to allow fast experimentation and easy model building with multiple graphics processing units (GPUs). Keras is broadly supported and can be used with TensorFlow, CNTK, Theano, MXNet and PlaidML.

Question 2

What is Distributed Training with GPUs?

Accepted Answer

Keras enables you to distribute your model training tasks over multiple resources, performing training tasks in parallel. Distributed training is an essential part of deep learning. It enables you to leverage multiple CPUs or GPUs and can drastically reduce the amount of time needed to train models.

When using distributed training, there are two implementation methods you can choose from—model parallelism and data parallelism. These implementations can be used individually or in combination, depending on your model requirements.

Model parallelism

Model parallelism segments your model into parts that can then be run in parallel. Parts are trained individually and the results of each part are rejoined with the whole.

This method enables you to run each segment on a different resource using the same data. This limits the amount of communication that is needed between workers to only that required for synchronization of shared parameters. You can also use this method with multiple GPUs in a single server.
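
The idea can be illustrated with a minimal, framework-free sketch. It is not how Keras implements model parallelism; it only shows the pattern described above. A single dense layer is split into two segments, each segment receives the same input, and the partial outputs are rejoined. The weights, the input vector, and the use of threads as stand-ins for separate devices are all illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def dense_forward(weights, inputs):
    """Compute one output per weight row as the dot product with the inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# Hypothetical 4-unit layer with 3 inputs; the values are illustrative only.
full_weights = [
    [0.5, -1.0, 2.0],
    [1.5, 0.0, -0.5],
    [-2.0, 1.0, 1.0],
    [0.25, 0.75, -1.25],
]
inputs = [1.0, 2.0, 3.0]

# Segment the layer: each "worker" owns half of the output units.
segments = [full_weights[:2], full_weights[2:]]

# Every worker receives the same input and computes only its own segment.
with ThreadPoolExecutor(max_workers=2) as pool:
    partial_outputs = list(
        pool.map(lambda segment: dense_forward(segment, inputs), segments)
    )

# Rejoin the partial results into the output of the whole layer.
parallel_output = [value for part in partial_outputs for value in part]

# The unsplit layer gives the same result.
reference_output = dense_forward(full_weights, inputs)
print(parallel_output == reference_output)
```

In a real deployment each segment would live on its own GPU, and the rejoin step is where communication between workers takes place.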

Data parallelism

Data parallelism segments your training data into parts that can be run in parallel. Using copies of your model, you run each subset on a different resource. This is the most commonly used type of distributed training.

This method requires that you synchronize model parameters while the subsets are being trained. If you do not, the model copies drift apart, because each copy is updated using only the prediction errors from its own subset. Because of this, data parallelism implementations require communication between workers so that changes can be synced.
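
The synchronization step can be sketched in plain Python, under simplifying assumptions. The one-parameter model y = weight * x, the data, the learning rate, and the two "workers" are all hypothetical; real frameworks average gradients with an all-reduce across devices rather than a Python loop.

```python
def mse_gradient(weight, batch):
    """Gradient of the mean squared error for the model y = weight * x."""
    return sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)


# Illustrative data generated from y = 3x.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]

# Split the data into equally sized shards, one per worker.
num_workers = 2
shard_size = len(data) // num_workers
shards = [data[i * shard_size:(i + 1) * shard_size] for i in range(num_workers)]

weight = 0.0
learning_rate = 0.05
for step in range(50):
    # Each worker computes a gradient on its own shard with the same weight.
    local_gradients = [mse_gradient(weight, shard) for shard in shards]
    # Synchronization: average the gradients so every copy applies the same update.
    synced_gradient = sum(local_gradients) / num_workers
    weight -= learning_rate * synced_gradient

print(weight)
```

Because the shards are the same size, the averaged gradient matches the gradient computed on the full dataset, so every model copy stays identical after each step.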

Question 3

Tips for Managing the Limitations of Multi GPU Training with Keras

Accepted Answer

When using Keras, there are advantages and limitations to your ability to perform multi-GPU training. Below are a few limitations to be aware of and ways to handle them.

Keras Multi GPU training is not automatic

Using single GPU configurations with Keras and TensorFlow is straightforward. Provided you are using an NVIDIA GPU and you have the CUDA libraries installed, use of the GPU is automatic. However, this isn’t the case for scenarios with multiple GPUs.

To use multiple GPUs with Keras, you can use the multi_gpu_model method. This method copies your model across GPUs and automatically splits each input batch across them, aggregating the results afterwards. However, keep in mind that this method does not scale linearly with the number of GPUs due to the synchronization required. Note that multi_gpu_model has been deprecated and removed from recent TensorFlow releases, where tf.distribute.MirroredStrategy fills the same role.
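
As a minimal sketch of multi-GPU training in current TensorFlow releases, the example below uses tf.distribute.MirroredStrategy rather than multi_gpu_model, which newer versions no longer ship. It assumes TensorFlow 2.x is installed. The layer sizes, the random data, and the per-replica batch size of 16 are illustrative assumptions, not values from this guide.

```python
import numpy as np
import tensorflow as tf

# MirroredStrategy replicates the model on every visible GPU.
# With no GPU present it falls back to a single replica.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

# Variables must be created inside the strategy scope so they are mirrored.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(8,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# Illustrative random data; each global batch is split across the replicas.
features = np.random.rand(64, 8).astype("float32")
targets = np.random.rand(64, 1).astype("float32")
global_batch_size = 16 * strategy.num_replicas_in_sync

history = model.fit(
    features, targets, batch_size=global_batch_size, epochs=1, verbose=0
)
```

Scaling the global batch size with the number of replicas keeps the per-GPU batch constant, which is one common way to use the extra devices.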

Saving your parallel models

Once your training is finished, you may want to persist your trained weights. Unfortunately, you can’t just call the save() method on the parallel model because Keras does not support saving parallel models.

To get around this, you can either call save() on the original model reference or you can serialize your model. With the former, the saved weights are already up to date, because the original model shares its weights with the parallel copies. The latter requires some manual cleanup of synchronization connections.

GPU data bottlenecks

Often, preprocessing calculations are the most expensive aspect of training deep learning models. These calculations require data to be preprocessed in your CPUs and then fed to the GPUs. This goes smoothly as long as preprocessing is relatively simple and data isn’t bottlenecked in the CPU. If it is, your GPUs are left sitting idle while waiting for data to process.

While Keras can perform your preprocessing calculations in parallel, this is bottlenecked by Python’s Global Interpreter Lock (GIL), which prevents threads from running Python code truly in parallel. The easiest way to manage this is to simplify your preprocessing as much as possible.

You can typically do this using standard generators. However, if you need to use custom generators, try to offload some of the work to other libraries, like NumPy. These libraries can release the GIL and enable you to access a greater degree of parallelism.
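
A custom generator along these lines might look like the sketch below. It assumes NumPy is installed. The generator name batch_generator, the image shapes, and the divide-by-255 scaling are hypothetical choices for illustration. The point is that the per-batch work is a single vectorized NumPy operation instead of a Python loop over pixels or samples.

```python
import numpy as np


def batch_generator(images, labels, batch_size):
    """Yield (inputs, labels) batches, scaling pixels with vectorized NumPy ops."""
    for start in range(0, len(images), batch_size):
        batch = images[start:start + batch_size].astype(np.float32)
        # One vectorized operation per batch; the heavy lifting happens
        # in compiled NumPy code rather than in a Python loop.
        batch = batch / 255.0
        yield batch, labels[start:start + batch_size]


# Hypothetical data: ten 4x4 grayscale "images" with pixel values 0-255.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(10, 4, 4), dtype=np.uint8)
labels = np.arange(10)

batches = list(batch_generator(images, labels, batch_size=4))
print(len(batches), batches[0][0].shape, batches[0][0].dtype)
```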
