Question 1

How Do You Use Convolutional Neural Networks (CNN) in PyTorch?

Accepted Answer

PyTorch is a Python framework for deep learning that makes it easy to run research projects on CPU or GPU hardware. The basic logical unit in PyTorch is a tensor, a multidimensional array. PyTorch records the operations applied to tensors in a computational graph, which it uses to construct, train and run neural network architectures. A distinguishing feature of PyTorch is that these graphs are dynamic: they are defined directly in Python as the code executes, and can change from one run to the next.
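As a rough illustration of tensors and dynamic graphs, the sketch below builds a small tensor, lets ordinary Python control flow decide which operations are recorded, and then asks autograd for gradients. The values and variable names are made up for the example.

```python
import torch

# A tensor is a multidimensional array; requires_grad=True asks autograd to track it.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)

# The graph is recorded while plain Python runs, so a data-dependent branch
# decides which operations end up in it.
if x.sum() > 5:
    y = (x ** 2).sum()
else:
    y = x.sum()

y.backward()   # backpropagate through whichever branch was taken
grad = x.grad  # here the first branch ran, so the gradient is 2 * x
```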

Convolutional Neural Networks (CNN) are the basic architecture used in deep learning for computer vision. The torch.nn module provides built-in classes for all the building blocks of CNN architectures:

Convolution layers
Pooling layers
Padding layers
Activation functions
Loss functions
Fully connected layers
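The sketch below shows one torch.nn class for each building block in the list, chained on a random batch. The image size (32x32 RGB), channel counts and number of classes are assumptions chosen only so the shapes line up.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)  # convolution layer
pool = nn.MaxPool2d(kernel_size=2)                              # pooling layer
pad = nn.ZeroPad2d(1)                                           # padding layer
act = nn.ReLU()                                                 # activation function
loss_fn = nn.CrossEntropyLoss()                                 # loss function
fc = nn.Linear(8 * 16 * 16, 10)                                 # fully connected layer

images = torch.randn(4, 3, 32, 32)    # hypothetical batch: 4 RGB images, 32x32
labels = torch.tensor([0, 1, 2, 3])   # hypothetical class labels

# pad: 32 -> 34, 3x3 conv: 34 -> 32, 2x2 pool: 32 -> 16
features = pool(act(conv(pad(images))))
logits = fc(features.flatten(start_dim=1))
loss = loss_fn(logits, labels)
```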

Question 2

How Do CNNs Work?

Accepted Answer

A convolutional neural network (CNN for short) is a special type of neural network model primarily designed to process 2D image data, but which can also be used with 1D and 3D data.

At the core of a convolutional neural network are one or more convolutional layers, which perform a mathematical operation called a “convolution”. The convolution multiplies a set of weights with the inputs of the layer. However, unlike in a regular neural network, this multiplication happens using a small “window” of weights that passes over the image, called a filter or kernel. At each position of the filter, the weights are multiplied by the input values currently under the window.

The mathematical operation performed at each filter position is a “dot product”: the weights in the filter are multiplied element-wise with the input values under the filter, and the products are summed, giving a single value for each filter position. This operation is also called a “scalar product”.
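For a single filter position, the computation can be sketched as follows. The 3x3 patch and kernel values are invented for illustration.

```python
import torch

# The 3x3 region of the image currently under the filter (made-up values).
patch = torch.tensor([[1.0, 2.0, 0.0],
                      [0.0, 1.0, 3.0],
                      [2.0, 1.0, 1.0]])

# A 3x3 filter; this one responds to left-to-right intensity changes.
kernel = torch.tensor([[1.0, 0.0, -1.0],
                       [1.0, 0.0, -1.0],
                       [1.0, 0.0, -1.0]])

# Element-wise multiply, then sum: one number for this filter position.
value = (patch * kernel).sum()

# The same number, written as a dot product of the flattened arrays.
same_value = torch.dot(patch.flatten(), kernel.flatten())
```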

Because the filter is usually smaller than the input image, the same weights are applied to the input many times. Specifically, the filter is applied from left to right and from top to bottom to cover the entire image, with the objective of discovering important features in the image.

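Repeatedly applying the same filter across the whole image is a powerful idea. If the filter can identify a certain feature, it detects that feature wherever it occurs in the image. This is commonly called translation invariance: the CNN architecture is mainly interested in the presence of a feature, rather than its specific location. Strictly speaking, the convolution itself is translation-equivariant (shifting the input shifts the output by the same amount), and later pooling layers are what make the result less sensitive to position.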
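A small sketch of this behavior, using a hypothetical 6x6 image containing one bright square: moving the square moves the filter's response by the same amount, while the strongest response over the whole feature map stays the same.

```python
import torch
import torch.nn.functional as F

# A 2x2 filter of ones, shaped (out_channels, in_channels, height, width).
kernel = torch.ones(1, 1, 2, 2)

image = torch.zeros(1, 1, 6, 6)
image[0, 0, 1:3, 1:3] = 1.0   # a 2x2 bright square near the top-left

# The same square moved 2 pixels down and 2 pixels right (no wrap-around here).
shifted = torch.roll(image, shifts=(2, 2), dims=(2, 3))

fmap = F.conv2d(image, kernel)[0, 0]            # 5x5 feature map
fmap_shifted = F.conv2d(shifted, kernel)[0, 0]  # 5x5 feature map

# Location of the strongest response in each map (flattened index).
peak = fmap.flatten().argmax().item()
peak_shifted = fmap_shifted.flatten().argmax().item()
```

The peak moves from row 1, column 1 to row 3, column 3, but its value does not change, so a global max over the feature map gives the same answer for both images.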

The values obtained from the convolution operation for each filter position (one value for each filter position) create a two-dimensional matrix of output values, which represent the features extracted from the underlying image. This output matrix is called a “feature map”.
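The following sketch builds a feature map by hand, sliding a 2x2 filter over a 5x5 input, and compares it with torch.nn.functional.conv2d. The input and filter values are arbitrary. Note that PyTorch's conv2d does not flip the kernel (it computes a cross-correlation), which matches the sliding dot product described here.

```python
import torch
import torch.nn.functional as F

image = torch.arange(25, dtype=torch.float32).reshape(5, 5)  # values 0..24
kernel = torch.tensor([[1.0, 0.0],
                       [0.0, -1.0]])

kh, kw = kernel.shape
out_h = image.shape[0] - kh + 1   # no padding, stride 1: 5 - 2 + 1 = 4
out_w = image.shape[1] - kw + 1

feature_map = torch.zeros(out_h, out_w)
for i in range(out_h):            # top to bottom
    for j in range(out_w):        # left to right
        window = image[i:i + kh, j:j + kw]
        feature_map[i, j] = (window * kernel).sum()

# conv2d expects (batch, channels, height, width) and (out_ch, in_ch, kh, kw).
reference = F.conv2d(image[None, None], kernel[None, None])[0, 0]
```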

Once the feature map is ready, each of its values is passed through a nonlinear activation function (for example, ReLU) before being fed to the next convolutional layer. The output of the sequence of convolutional layers is passed to fully connected layers, which produce the final prediction, typically a label describing the image.
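As a minimal illustration of the activation step, ReLU is applied to every value of a feature map independently: negative values become zero and the rest pass through unchanged. The feature map values here are invented.

```python
import torch

feature_map = torch.tensor([[-6.0, 2.0],
                            [0.5, -1.0]])

# ReLU(x) = max(0, x), applied element-wise.
activated = torch.relu(feature_map)
```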

Question 3

Implementing CNNs Using PyTorch

Accepted Answer

We use a very simple CNN architecture, with only two convolutional layers to extract features from the image. Afterwards we’ll use a fully connected layer to classify the features into labels.

We use the Sequential container (torch.nn.Sequential) to define the layers of the model in order, from input to final prediction.
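One way such a model could look is sketched below. The input size (28x28 grayscale), channel counts, pooling layers and the 10 output classes are assumptions made for the example, not details given in the text, and the whole network is placed in a single Sequential container for brevity.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    # First convolutional layer: 1 input channel -> 4 feature maps.
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),   # 28x28 -> 14x14
    # Second convolutional layer: 4 -> 8 feature maps.
    nn.Conv2d(4, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),   # 14x14 -> 7x7
    # Fully connected layer that maps the features to class scores.
    nn.Flatten(),
    nn.Linear(8 * 7 * 7, 10),
)

batch = torch.randn(16, 1, 28, 28)    # hypothetical batch of 16 images
logits = model(batch)                 # one score per class for each image
predictions = logits.argmax(dim=1)    # predicted label per image
```

Because Sequential runs its layers in the order they are listed, the forward pass needs no extra code. A model with branching or skip connections would need a custom nn.Module instead.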
