FPGA for Deep Learning

Build Your Own Accelerator

Field-programmable gate array (FPGA) chips enable you to reprogram logic gates. You can use FPGA technology to overwrite a chip’s configuration and create custom circuits. FPGA chips are especially useful for machine learning and deep learning: for example, using FPGAs for deep learning enables you to optimize throughput and adapt the hardware to the specific needs of different deep learning architectures.


What is FPGA?

A field-programmable gate array (FPGA) is a hardware circuit with reprogrammable logic gates. By overwriting the chip’s configuration, users can create custom circuits while the chip is deployed in the field, not only during the design or fabrication phase. This sets FPGAs apart from regular chips, whose circuitry is fixed at fabrication and cannot be reprogrammed.

With an FPGA chip you can create everything from simple, single-function logic gates to multi-core processors. Common uses for FPGAs include space exploration, defense systems, telecommunications, image processing, high performance computing (HPC), and networking.

What are FPGAs used for?

There is a wide range of FPGA applications. You can configure an FPGA with thousands of logic blocks and on-chip memory units, which enables the circuit to operate in a massively parallel computing model, similar to a GPU. With FPGAs, you gain access to an adaptable architecture that enables you to optimize throughput, so for suitable workloads an FPGA can meet or exceed the performance of a GPU.

Why use FPGA?

Compared to CPUs and GPUs, FPGAs are well suited to embedded applications and have lower power consumption. These circuits can be used with custom data types and are not constrained by a fixed architecture the way GPUs are. Also, the programmability of FPGAs makes it easier to adapt them to safety and security requirements. FPGAs have been used successfully in safety-critical, regulated environments such as ADAS (Advanced Driver Assistance Systems).

GPU vs FPGA for Machine Learning

When deciding between GPUs and FPGAs, you need to understand how the two compare. Below are some of the biggest differences between GPUs and FPGAs for machine learning and deep learning.

Compute power

According to research by Xilinx, FPGAs can deliver roughly the same or greater compute power than comparable GPUs. FPGAs also offer more on-chip memory, which increases effective compute capability: keeping data on chip reduces the bottlenecks caused by external memory access and lowers the cost and power required for high-memory-bandwidth solutions.

In computations, FPGAs can support a full range of data types, including FP32, INT8, binary, and custom formats. With FPGAs you can modify the hardware to support new types as needed, whereas GPUs require vendors to adapt their architectures to provide compatibility. This may mean pausing projects while vendors make changes.
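To make the data type point concrete, below is a minimal NumPy sketch of symmetric INT8 quantization, the kind of reduced-precision format an FPGA (or any accelerator) can implement natively. The function names and the per-tensor scaling scheme here are illustrative assumptions, not part of any FPGA toolchain:

    import numpy as np

    def quantize_int8(x):
        """Symmetric per-tensor quantization of float32 values to INT8."""
        # Map the largest magnitude to 127; guard against an all-zero tensor.
        scale = max(np.max(np.abs(x)), 1e-8) / 127.0
        q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        """Recover approximate float32 values from the INT8 codes."""
        return q.astype(np.float32) * scale

    # FP32 weights, as a deep learning framework would produce them.
    weights = np.random.randn(4, 4).astype(np.float32)
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    print("max quantization error:", np.max(np.abs(weights - restored)))

In hardware, the INT8 path replaces 32-bit floating-point multipliers with much smaller integer units, which is where the area and power savings come from.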

Efficiency and power

According to research by Microsoft, FPGAs can perform almost 10x better than GPUs in terms of power consumption. The reason is that GPUs require complex compute resources to enable software programmability, and those resources consume more power.

This doesn’t mean that all GPUs are less efficient. The NVIDIA V100 has been found to provide efficiency comparable to Xilinx FPGAs for deep learning tasks, thanks to its hardened Tensor Cores. For general-purpose workloads, however, that efficiency advantage does not hold. Learn more in our article about NVIDIA deep learning GPUs.

Functional safety

GPUs were designed for high-performance computing systems and graphics workloads, where functional safety was not a design concern. However, GPUs have since been used in applications, like ADAS, where functional safety matters. In these cases, GPUs must be redesigned to meet safety requirements, which can be time-consuming for vendors.

In contrast, the programmability of FPGAs enables you to design them to meet the specific safety requirements of your application. These circuits have been used successfully in automation, avionics, and defense without custom manufacturing requirements.

For more information about GPU types and vendors, check out our article about the best GPU for deep learning.

What Are the Pros and Cons of Using FPGA for Deep Learning?

You can use FPGAs to accelerate your deep learning workloads and gain significant benefits over GPUs. However, these circuits aren’t perfect and come with drawbacks you should be aware of. Understanding both the positives and negatives of FPGAs can help you implement the technology carefully and with the greatest ROI.

Advantages of FPGA technology include:

  • Flexibility—reprogrammability is the greatest benefit of FPGAs for deep learning, and it adds significant flexibility to operations. You can program individual blocks or your entire circuit to fit the requirements of your particular algorithm, and if the result doesn’t fit as well as you expected, you can modify it as needed.
  • Parallelism—with an FPGA you can switch between programs to adapt to changing workloads and handle multiple workloads without sacrificing performance. This enables you to work on different stages of a task concurrently, a form of pipelined parallelism that is difficult to achieve with GPUs.
  • Decreased latency—high on-chip memory bandwidth gives FPGAs lower latency than GPUs. This enables you to process significant amounts of data in real time, including streaming data. Additionally, FPGAs can provide deterministic, highly precise timing without sacrificing flexibility.
  • Energy efficiency—lower power requirements for FPGAs can help reduce overall power consumption for machine learning and deep learning implementations. This can reduce the overall costs of training and potentially extend the life of equipment.

Disadvantages of FPGA technology include:

  • Programming—programming FPGA circuits requires significant expertise that is not easy to obtain. For example, programmers must be familiar with a hardware description language (HDL) such as Verilog or VHDL. The shortage of experienced FPGA programmers can make it difficult to adopt FPGAs reliably.
  • Implementation complexity—implementing FPGAs for deep learning is relatively untested and may be too risky for conservative organizations. Limited support and minimal community knowledge mean that FPGAs are not yet widely accessible for deep learning applications.
  • Expense—the cost of the FPGAs themselves, combined with implementation and programming costs, makes the circuits a considerable investment. The technology is currently ill-suited to smaller projects and better fits larger, ongoing implementations that justify the investment.
  • Lack of libraries—currently, few if any deep learning libraries support FPGAs without modification. One exception in progress is LeFlow, a project by researchers from the University of British Columbia that is attempting to create compatibility between FPGAs and TensorFlow (see the sketch after this list).
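To give a sense of what that compatibility would mean, below is a minimal sketch of the kind of small TensorFlow kernel that XLA-based FPGA compiler flows such as LeFlow aim to translate to hardware. The code is ordinary TensorFlow with XLA compilation requested; nothing in it is LeFlow-specific, and LeFlow itself targeted earlier TensorFlow releases, so treat this purely as an illustration of the scale of kernel involved:

    import tensorflow as tf

    # A small dense layer with ReLU: the scale of computation that
    # XLA-based FPGA compiler projects are designed to handle.
    @tf.function(jit_compile=True)  # request XLA compilation of the graph
    def dense_relu(x, w, b):
        return tf.nn.relu(tf.matmul(x, w) + b)

    x = tf.random.normal((1, 8))
    w = tf.random.normal((8, 4))
    b = tf.zeros((4,))
    print(dense_relu(x, w, b))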

FPGA for Deep Learning With Run:AI

Run:AI automates resource management and workload orchestration for machine learning infrastructure. With Run:AI, you can automatically run as many compute-intensive experiments as needed.

Here are some of the capabilities you gain when using Run:AI:

  • Advanced visibility—create an efficient pipeline of resource sharing by pooling GPU compute resources.
  • No more bottlenecks—you can set up guaranteed quotas of GPU resources, to avoid bottlenecks and optimize billing.
  • A higher level of control—Run:AI enables you to dynamically change resource allocation, ensuring each job gets the resources it needs at any given time.

Run:AI simplifies machine learning infrastructure pipelines, helping data scientists accelerate their productivity and the quality of their models.

Learn more about the Run:AI GPU virtualization platform.