TensorFlow CNN

Building Your First CNN with TensorFlow

What is TensorFlow CNN?

Convolutional Neural Networks (CNN), a key technique in deep learning for computer vision, are little-known to the wider public but are the driving force behind major innovations, from unlocking your phone with face recognition to safe driverless vehicles.

CNNs are used for a variety of tasks in computer vision, primarily image classification and object detection. The open source TensorFlow framework allows you to create highly flexible CNN architectures for computer vision tasks. In this article we explain the basics of CNN on TensorFlow and present a quick hands-on tutorial to get you started.

If you are interested in learning how to work with CNNs in PyTorch, which is another popular deep learning framework, see our guide to PyTorch CNN.

CNN on TensorFlow Concepts

Tensor

Tensors represent deep learning data. They are multidimensional arrays, and the number of axes an array has is called its rank. For example, a cube storing data across X, Y, and Z axes is represented as a 3-dimensional (rank-3) tensor, while a batch of images is usually stored as a 4-dimensional tensor of batch, height, width, and channels. Each axis can be very large: feature vectors with hundreds of entries are typical in deep learning applications.
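As a rough illustration of these ideas, the sketch below models tensors as nested Python lists and reports the number of axes (the rank) and the size of each axis (the shape). The shape_of helper is a hypothetical name introduced for this example, not a TensorFlow function.

```python
def shape_of(tensor):
    """Return the shape of a regularly nested list as a tuple of axis sizes."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        tensor = tensor[0]
    return tuple(shape)

# A 2 x 3 x 4 "cube" of zeros: three axes, so a rank-3 tensor
cube = [[[0.0 for _ in range(4)] for _ in range(3)] for _ in range(2)]

# Two 28 x 28 grayscale images with one channel each: batch, height, width, channels
batch = [[[[0.0] for _ in range(28)] for _ in range(28)] for _ in range(2)]

cube_shape = shape_of(cube)
batch_shape = shape_of(batch)
print(len(cube_shape), cube_shape)     # 3 (2, 3, 4)
print(len(batch_shape), batch_shape)   # 4 (2, 28, 28, 1)
```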

Computational graph

TensorFlow computational graphs represent the workflows that occur during deep learning model training. For a CNN model, the computational graph can be very complex. The image below shows what a simple graph looks like. You can use TensorBoard, built into TensorFlow, to display the computational graph of your model.

[Image: a simple TensorFlow computational graph]

Constant

In TensorFlow, a constant is used to store values that don’t change during the computation of the model. It is used for nodes that must remain the same during model training. A constant’s value is set when it is created; it is neither fed at runtime nor updated by training.

Placeholder

Placeholders are used to input training examples to your deep learning model. A placeholder has no value when the graph is built; its values are supplied at runtime, and they change from batch to batch as the model processes the training set.

Variable

Variables are used to add trainable nodes to the computation graph, such as weights and biases.

Related content: read our guide to deep convolutional neural networks.

Quick Tutorial: Building a Basic Convolutional Neural Network (CNN) in TensorFlow

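This quick tutorial can help you get started implementing CNN in TensorFlow. It is based on the Fashion-MNIST dataset, containing 28 x 28 grayscale images of 70,000 fashion products in 10 categories. As loaded here, there are 55,000 images in the training set, 5,000 in the validation set, and 10,000 images in the test set. The code uses the TensorFlow 1.x graph API (placeholders and tf.contrib), so it will not run unchanged on TensorFlow 2.x. Our code is based on the full tutorial by Aditya Sharma.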

1. Loading Data

First import all the necessary modules (NumPy, matplotlib and TensorFlow), then import the Fashion-MNIST data as follows:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf  # TensorFlow 1.x
from tensorflow.examples.tutorials.mnist import input_data
# Read the data/fashion directory; source_url points to the Fashion-MNIST Amazon S3 bucket used to retrieve the dataset
data = input_data.read_data_sets('data/fashion', one_hot=True,
                                 source_url='http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/')

2. CNN Architecture

We will use three convolutional layers, progressively adding more filters. All the filters are 3×3:

  1. Layer with 32 filters
  2. Layer with 64 filters
  3. Layer with 128 filters

In addition, each convolutional layer is followed by a 2×2 max-pooling layer, for three max-pooling layers in total.
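To see how this architecture shrinks the image, the sketch below traces the feature-map size through the three convolution and pooling stages. It assumes stride-1 convolutions, stride-2 pooling and 'SAME' padding, as set in the wrapper functions later in this tutorial; same_output_size is a hypothetical helper name.

```python
import math

def same_output_size(size, stride):
    """Spatial output size of a layer that uses 'SAME' padding."""
    return math.ceil(size / stride)

size = 28                      # Fashion-MNIST images are 28 x 28
filters = [32, 64, 128]        # filters in the three convolutional layers
trace = []
for n_filters in filters:
    size = same_output_size(size, stride=1)   # 3x3 convolution, stride 1: size unchanged
    size = same_output_size(size, stride=2)   # 2x2 max-pooling, stride 2: halved, rounded up
    trace.append((size, size, n_filters))

# Number of values handed to the fully connected layer after flattening
flattened = trace[-1][0] * trace[-1][1] * trace[-1][2]
print(trace)       # [(14, 14, 32), (7, 7, 64), (4, 4, 128)]
print(flattened)   # 2048
```

The final 4 x 4 x 128 size is where the 4*4*128 input dimension of the fully connected weights comes from.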

We’ll set basic hyperparameters of the CNN model:

training_iters = 10
learning_rate = 0.001
batch_size = 128

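The batch size sets how many images are processed in each training step, so every batch contains 128 images. training_iters is the number of training iterations (passes over the training set), and learning_rate is the step size used by the optimizer.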

3. Neural Network Parameters

The number of inputs to the CNN is 784, because each image has 28 x 28 = 784 pixels and is read as a 784-dimensional vector. We will rebuild this vector into a matrix of 28 x 28 x 1.

# Image width and height in pixels, and the number of classes for the predicted label at the end
n_input = 28
n_classes = 10
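The rebuild from a 784-element vector into a 28 x 28 x 1 matrix can be sketched in plain Python as follows. The to_image helper is hypothetical and assumes the pixels are stored row by row; with NumPy arrays the same step is typically a single reshape(-1, 28, 28, 1) call.

```python
def to_image(flat_pixels, height=28, width=28):
    """Rebuild a flat pixel vector into a height x width x 1 nested list (row-major)."""
    if len(flat_pixels) != height * width:
        raise ValueError("expected %d pixels, got %d" % (height * width, len(flat_pixels)))
    return [[[flat_pixels[row * width + col]] for col in range(width)]
            for row in range(height)]

flat = list(range(784))    # stand-in for one flattened image
image = to_image(flat)

print(len(image), len(image[0]), len(image[0][0]))   # 28 28 1
print(image[1][0][0])                                # 28, the first pixel of the second row
```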

Here we define an input placeholder x with dimensionality None x 28 x 28 x 1, which holds the rebuilt images. Similarly, we’ll define a placeholder y for the labels of the training images, which will be a None x 10 matrix.

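We set the first dimension (the “row”) to None so that the placeholders accept any number of examples; the actual row count is determined by the data fed in at runtime. During training, each batch supplies 128 rows, matching batch_size, and the same placeholders can later receive the 10,000 test images.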

# x is the input placeholder, rebuilding the image into 28x28x1 matrix
x = tf.placeholder("float", [None, 28,28,1])
# y is the label set, using the number of classes
y = tf.placeholder("float", [None, n_classes])

4. Wrapper Functions

Because we have several layers of the same type in the model, it’s useful to create a wrapper function for each type of layer, to avoid duplicating code. You can get functions like this out of the box with Keras, which is included with TensorFlow. However, in this tutorial we show you how to do things from scratch in TensorFlow without Keras helper functions.

Here is a function creating a 2-dimensional convolutional layer, with bias and ReLU activation. The arguments are the input images x, weights W, bias b, and the stride, meaning how many pixels the filter moves over the image at each step of the convolution.

def conv2d(x, W, b, strides=1):
   x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
   x = tf.nn.bias_add(x, b)
   return tf.nn.relu(x)
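To make the operation inside this wrapper more concrete, here is a minimal pure-Python sketch of a single-channel convolution with zero 'SAME' padding, a bias add and ReLU. It is an illustration under simplifying assumptions (one input channel, one filter, nested lists instead of tensors), not how TensorFlow implements tf.nn.conv2d; conv2d_same is a hypothetical name.

```python
import math

def conv2d_same(image, kernel, bias=0.0, stride=1):
    """Single-channel 2D convolution (cross-correlation) with zero 'SAME'
    padding, followed by a bias add and ReLU."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = math.ceil(h / stride), math.ceil(w / stride)
    # Total padding needed so that every output position has a full window
    pad_h = max((out_h - 1) * stride + kh - h, 0)
    pad_w = max((out_w - 1) * stride + kw - w, 0)
    top, left = pad_h // 2, pad_w // 2
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = bias
            for di in range(kh):
                for dj in range(kw):
                    r = i * stride + di - top
                    c = j * stride + dj - left
                    if 0 <= r < h and 0 <= c < w:   # cells outside the image count as zero
                        total += image[r][c] * kernel[di][dj]
            row.append(max(total, 0.0))             # ReLU
        out.append(row)
    return out

image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
box = [[1.0] * 3 for _ in range(3)]    # 3x3 filter that sums each neighbourhood

same_size = conv2d_same(image, box)            # stride 1 keeps the 3 x 3 size
strided = conv2d_same(image, box, stride=2)    # stride 2 gives a 2 x 2 output
print(same_size[1][1], len(strided), len(strided[0]))   # 45.0 2 2
```

With stride 1 the output keeps the input's spatial size, which is why only the pooling layers shrink the feature maps in this model.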

Here is another function creating a 2D max-pool layer. Here the parameters are the input x, and k, specifying the kernel/filter size, which is also used as the stride.

def maxpool2d(x, k=2):
   return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding='SAME')
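The effect of max-pooling with 'SAME' padding can be sketched in plain Python on a single feature map. This is a simplified illustration (one channel, nested lists, stride equal to the window size k), and maxpool2d_same is a hypothetical name rather than a TensorFlow function.

```python
import math

def maxpool2d_same(image, k=2):
    """2D max-pooling with a k x k window, stride k and 'SAME' padding."""
    h, w = len(image), len(image[0])
    out_h, out_w = math.ceil(h / k), math.ceil(w / k)
    top = (out_h * k - h) // 2     # padding rows placed before the image
    left = (out_w * k - w) // 2    # padding columns placed before the image
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            window = [image[r][c]
                      for r in range(i * k - top, i * k - top + k)
                      for c in range(j * k - left, j * k - left + k)
                      if 0 <= r < h and 0 <= c < w]   # padded cells never win the max
            row.append(max(window))
        out.append(row)
    return out

# A 7 x 7 feature map whose values increase left to right, top to bottom
feature_map = [[float(r * 7 + c) for c in range(7)] for r in range(7)]
pooled = maxpool2d_same(feature_map)

print(len(pooled), len(pooled[0]))   # 4 4
print(pooled[0][0], pooled[3][3])    # 8.0 48.0
```

Because of the padding, an odd-sized 7 x 7 map is pooled to 4 x 4 rather than 3 x 3.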

Now let’s define weights and biases.

weights = {
   'wc1': tf.get_variable('W0', shape=(3,3,1,32), initializer=tf.contrib.layers.xavier_initializer()),
   'wc2': tf.get_variable('W1', shape=(3,3,32,64), initializer=tf.contrib.layers.xavier_initializer()),
   'wc3': tf.get_variable('W2', shape=(3,3,64,128), initializer=tf.contrib.layers.xavier_initializer()),
   'wd1': tf.get_variable('W3', shape=(4*4*128,128), initializer=tf.contrib.layers.xavier_initializer()),
   'out': tf.get_variable('W6', shape=(128,n_classes), initializer=tf.contrib.layers.xavier_initializer()),
}
biases = {
   'bc1': tf.get_variable('B0', shape=(32,), initializer=tf.contrib.layers.xavier_initializer()),
   'bc2': tf.get_variable('B1', shape=(64,), initializer=tf.contrib.layers.xavier_initializer()),
   'bc3': tf.get_variable('B2', shape=(128,), initializer=tf.contrib.layers.xavier_initializer()),
   'bd1': tf.get_variable('B3', shape=(128,), initializer=tf.contrib.layers.xavier_initializer()),
   'out': tf.get_variable('B4', shape=(n_classes,), initializer=tf.contrib.layers.xavier_initializer()),
}

5. Building the CNN

Now we build the CNN by feeding the weights and biases into the wrapper functions.

def conv_net(x, weights, biases):
   # First convolutional layer: 32 3×3 filters and 32 biases, followed by a max-pool layer with kernel size 2
   conv1 = conv2d(x, weights['wc1'], biases['bc1'])
   conv1 = maxpool2d(conv1, k=2)

   # Second convolutional layer: 64 3×3 filters and 64 biases, followed by another max-pool layer
   conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
   conv2 = maxpool2d(conv2, k=2)

   # Third convolutional layer: 128 3×3 filters and 128 biases, followed by the last max-pool layer
   conv3 = conv2d(conv2, weights['wc3'], biases['bc3'])
   conv3 = maxpool2d(conv3, k=2)

   # Fully connected layer: reshape() adapts the pooling output to the input size the layer expects, then ReLU is applied
   fc1 = tf.reshape(conv3, [-1, weights['wd1'].get_shape().as_list()[0]])
   fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
   fc1 = tf.nn.relu(fc1)

   # Output layer: matrix multiplication with the output weights gives one score per class
   out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
   return out

6. Loss and Optimizer Nodes

First build the model using the conv_net() function we showed above, passing in x, weights, and biases:

pred = conv_net(x, weights, biases)

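This is a multi-class classification problem, so we will use the softmax activation function, which gives a probability between 0 and 1 for each class label (the label with the highest probability will be the prediction of the model). We’ll use cross-entropy as the loss function. Note that tf.nn.softmax_cross_entropy_with_logits applies softmax internally, which is why conv_net() returns raw scores (logits) rather than probabilities.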

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
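The computation behind this cost node can be sketched in plain Python. This is a minimal illustration of softmax followed by cross-entropy and a batch mean, assuming one-hot labels; the function names are hypothetical and the code is not TensorFlow's implementation.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]   # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, one_hot_label):
    """Cross-entropy between a one-hot label and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs) if t > 0)

def mean_loss(batch_logits, batch_labels):
    """Average the per-example losses over the batch."""
    losses = [cross_entropy(l, y) for l, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

label = [0, 0, 1]
uncertain = [0.0, 0.0, 0.0]    # equal scores: probability 1/3 per class
confident = [0.0, 0.0, 5.0]    # strongly favours the correct class

print(softmax(uncertain))
print(cross_entropy(uncertain, label), cross_entropy(confident, label))
print(mean_loss([uncertain, confident], [label, label]))
```

The loss is lowest when the highest score belongs to the correct class, which is what the optimizer pushes the weights toward.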

Finally, we’ll define the Adam optimizer with a learning rate of 0.001 as defined in the model hyperparameters above:

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

7. Evaluate the Model

To test the model, we first initialize weights and biases, and then define a correct_prediction and accuracy node that will evaluate model performance every time it is run.

init = tf.global_variables_initializer()
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
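The logic of these two nodes, comparing the index of the highest score with the index of the true class and averaging the matches, can be sketched in plain Python. The helper names and the sample scores below are made up for illustration.

```python
def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(batch_scores, batch_one_hot_labels):
    """Fraction of examples whose highest score is at the true class index."""
    correct = [argmax(s) == argmax(y)
               for s, y in zip(batch_scores, batch_one_hot_labels)]
    return sum(correct) / len(correct)

scores = [[0.1, 2.0, 0.3],
          [1.5, 0.2, 0.1],
          [0.0, 0.1, 0.9],    # predicts class 2, but the label is class 1
          [0.7, 0.2, 0.1]]
labels = [[0, 1, 0],
          [1, 0, 0],
          [0, 1, 0],
          [1, 0, 0]]

print(accuracy(scores, labels))   # 0.75
```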

Now you can start the computation graph, and run a training session as follows:

  • Create an outer for loop that runs for the number of training iterations specified above
  • Create an inner for loop over the batches, using the batch size specified above
  • Select the training images and labels for each batch using the variables batch_x and batch_y
  • Feed batch_x and batch_y to the x and y placeholders when running the optimizer
  • After each training iteration, run the loss function and check training accuracy
  • After running through all the images, test accuracy by processing the 10,000 test images
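The loop structure described above can be sketched without TensorFlow as follows. This only counts the training steps; in a real session each step would run the optimizer with the batch fed to the placeholders. The batch_ranges helper and the train_X and train_y names are hypothetical, and dropping the final partial batch is an assumption of this sketch.

```python
def batch_ranges(n_examples, batch_size):
    """Yield (start, end) index pairs that cover the data set in full batches."""
    for batch in range(n_examples // batch_size):
        start = batch * batch_size
        yield start, start + batch_size

training_iters = 10
batch_size = 128
n_train = 55000

steps = 0
for iteration in range(training_iters):                    # outer loop: training iterations
    for start, end in batch_ranges(n_train, batch_size):   # inner loop: one step per batch
        # Hypothetical: batch_x = train_X[start:end] and batch_y = train_y[start:end]
        # would be fed to the x and y placeholders here.
        steps += 1

print(n_train // batch_size, steps)   # 429 4290
```

With 55,000 training images and a batch size of 128, each iteration runs 429 full batches, and the remaining 88 images are skipped under this assumption.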

See the original tutorial for the complete code that you can use to run the CNN model.

TensorFlow CNN in Production with Run:AI

Run:AI automates resource management and workload orchestration for deep learning infrastructure. With Run:AI, you can automatically run as many CNN experiments as needed in TensorFlow and other deep learning frameworks.

Here are some of the capabilities you gain when using Run:AI:

  • Advanced visibility: create an efficient pipeline of resource sharing by pooling GPU compute resources.
  • No more bottlenecks: you can set up guaranteed quotas of GPU resources, to avoid bottlenecks and optimize billing.
  • A higher level of control: Run:AI enables you to dynamically change resource allocation, ensuring each job gets the resources it needs at any given time.

Run:AI simplifies machine learning infrastructure pipelines, helping data scientists accelerate their productivity and the quality of their models.

Learn more about the Run:AI GPU virtualization platform.