Question 1

What GPU Options are Offered on Google Cloud?

Accepted Answer

Google Cloud Platform (GCP) is the world’s third-largest cloud provider. Google offers a number of virtual machines (VMs) that provide graphics processing units (GPUs), including the NVIDIA Tesla K80, P4, T4, P100, and V100. You can use NVIDIA GPUs on GCP for large-scale cloud deep learning projects, analytics, physical object simulation, video transcoding, and molecular modeling. GCP also provides virtual NVIDIA GRID workstations, which let an organization’s employees run graphics-intensive workloads remotely.

Question 2

Google Cloud GPU Options

Accepted Answer

Google Cloud provides several GPU options. These GPUs can be selected as part of two Google instance types:

Accelerator-Optimized High-GPU, with 7 GB of RAM, 12–96 Cascade Lake vCPUs, and SSD storage
Accelerator-Optimized Mega-GPU, with 14 GB of RAM, 96 Cascade Lake vCPUs, and SSD storage

Question 3

Optimizing Google Cloud Platform GPU Performance

Accepted Answer

Here are two tips that can help you improve GPU performance in a Google Cloud VM.

Disabling Autoboost and Setting Maximum Clock Frequency
Autoboost is a feature of NVIDIA Tesla K80 series GPUs. It automatically adjusts the clock frequency to find the best rate for your particular application. However, constantly adjusting the clock frequency can also reduce GPU performance when running on Google infrastructure.

If you're running an NVIDIA Tesla K80 GPU on Compute Engine, disable autoboost with the following command (on Linux):

sudo nvidia-smi --auto-boost-default=DISABLED

When using a Tesla K80, you should also set the GPU application clocks to their highest frequencies. The following command sets the memory clock to 2505 MHz and the graphics clock to 875 MHz:

sudo nvidia-smi --applications-clocks=2505,875
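
As a minimal sketch, the two commands above could be wrapped in a small script that applies them only when the GPU is actually a K80. The function names tune_k80 and run, and the DRY_RUN variable, are hypothetical helpers introduced here for illustration; they are not part of nvidia-smi. By default the script only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch: apply the K80 settings from this section only on K80 GPUs.
# DRY_RUN is a hypothetical variable (not an nvidia-smi feature). It defaults
# to 1, so commands are printed rather than executed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it with sudo.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        sudo "$@"
    fi
}

# $1 is the GPU name, for example as reported by:
#   nvidia-smi --query-gpu=name --format=csv,noheader
tune_k80() {
    case "$1" in
        *K80*)
            # Disable autoboost, then pin memory,graphics clocks (in MHz).
            run nvidia-smi --auto-boost-default=DISABLED
            run nvidia-smi --applications-clocks=2505,875
            ;;
        *)
            echo "skipping: $1 is not a Tesla K80"
            ;;
    esac
}
```

To try it on a VM, you could call tune_k80 "$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n 1)" and review the printed commands before rerunning with DRY_RUN=0.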

Using Maximal Network Bandwidth—Up to 100 Gbps
To make distributed workloads run faster with NVIDIA Tesla T4 or V100, use the maximum network bandwidth of 100 Gbps, as follows:

Make sure you meet the minimum system requirements to use maximum network bandwidth (see documentation).
Create a VM instance with attached T4 or V100 GPUs. The image used to create the VM instance must support the Google Virtual NIC (gVNIC) network interface.
After creating the VM instance, measure the network bandwidth you actually achieve, using iperf or a similar tool. You’ll need at least two GPU-attached VM instances for this test.
See additional best practices from Google for using the maximum 100 Gbps bandwidth.
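
The bandwidth check in the steps above might look roughly like the following sketch, assuming iperf is installed on both VMs. The helper name iperf_client_cmd, the SERVER_IP variable, and the default stream count and duration are assumptions chosen for illustration, not values from Google's documentation. Several parallel streams are used because a single TCP stream rarely saturates a high-bandwidth link.

```shell
#!/usr/bin/env bash
# Sketch: measure throughput between two GPU VMs with iperf.
# SERVER_IP and the default values below are hypothetical; adjust them for
# your own setup.

# On the first VM (the server), listen for incoming test connections:
#   iperf -s

# On the second VM (the client), build the test command.
iperf_client_cmd() {
    local server_ip="$1"
    local streams="${2:-16}"    # number of parallel streams (-P)
    local duration="${3:-30}"   # test length in seconds (-t)
    echo "iperf -c ${server_ip} -P ${streams} -t ${duration}"
}

# Print the command for review before running it on the client VM.
iperf_client_cmd "${SERVER_IP:-10.128.0.2}"
```

Compare the aggregate throughput iperf reports against the bandwidth you expect for your machine type; a much lower figure may mean the VM is not using gVNIC or does not meet the requirements in the first step.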
