
TensorFlow 2 and Keras - Quick Start Guide

TL;DR Learn how to use Tensors, build a Linear Regression model and a simple Neural Network

TensorFlow 2.0 (final) was released at the end of September 2019. Oh boy, it looks much cooler than the 1.x series. Why is it so much better for you, the developer?

  • One high-level API for building models (that you know and love) - Keras. The good news is that most of your old Keras code should work automagically after changing a couple of imports.
  • Eager execution - all your code looks much more like normal Python programs. Old-timers might remember the horrible Session experiences. You shouldn’t need any of that in day-to-day use.

There are tons of other improvements, but the new developer experience is something that will make using TensorFlow 2 sweeter. What about PyTorch? PyTorch is still great and easy to use. But it seems like TensorFlow is catching up, or is it?

You’ll learn:

  • How to install TensorFlow 2
  • What is a Tensor
  • Doing Tensor math
  • Using probability distributions and sampling
  • Build a Simple Linear Regression model
  • Build a Simple Neural Network model
  • Save/restore a model

Run the complete code in your browser

Setup

Let’s install the GPU-supported version and set up the environment:

!pip install tensorflow-gpu

Check the installed version:

import tensorflow as tf

tf.__version__
2.0.0

And specify a random seed, so our results are reproducible:

RANDOM_SEED = 42

tf.random.set_seed(RANDOM_SEED)

Tensors

TensorFlow allows you to define and run operations on Tensors. Tensors are data-containers that can be of arbitrary dimension - scalars, vectors, matrices, etc. You can put numbers (floats and ints) and strings into Tensors.
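To make the point about numbers and strings concrete, here is a small sketch that puts a float, a list of ints and a string into Tensors and checks their dtypes. The variable names are made up for this illustration.

```python
import tensorflow as tf

# A float scalar, an integer vector and a string scalar
pi = tf.constant(3.14)
counts = tf.constant([1, 2, 3])
greeting = tf.constant("hello")

# dtype.name gives the type as a plain string
print(pi.dtype.name)
print(counts.dtype.name)
print(greeting.dtype.name)

# String Tensors come back from .numpy() as Python bytes
print(greeting.numpy())
```

Python floats become float32 and Python ints become int32 by default, which is why the examples in this guide show those dtypes.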

Let’s create a simple Tensor:

x = tf.constant(1)
print(x)
tf.Tensor(1, shape=(), dtype=int32)

It seems like our first Tensor contains the number 1 and is of type int32. Its shape is empty, (), which means it is a scalar (a Tensor with zero dimensions). To obtain the value we can do:

x.numpy()
1

Let’s create a simple matrix:

m = tf.constant([[1, 2, 1], [3, 4, 2]])
print(m)
tf.Tensor(
[[1 2 1]
 [3 4 2]], shape=(2, 3), dtype=int32)

This shape thingy seems to specify rows x columns. In general, the shape array shows how many elements are in every dimension of the Tensor.
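A short sketch of how shape relates to the number of dimensions (the rank), using a scalar, a vector and the matrix from above. The loop and the names are just for illustration.

```python
import tensorflow as tf

scalar = tf.constant(1)
vector = tf.constant([1, 2, 3])
matrix = tf.constant([[1, 2, 1], [3, 4, 2]])

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    # shape lists the size of each dimension; rank is the number of dimensions
    print(name, t.shape.as_list(), "rank:", tf.rank(t).numpy())

# The total number of elements is the product of the shape entries
print(tf.size(matrix).numpy())
```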

Helpers

TensorFlow offers a variety of helper functions for creating Tensors. Let’s create a matrix full of ones:

ones = tf.ones([3, 3])
print(ones)
tf.Tensor(
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]], shape=(3, 3), dtype=float32)

and zeros:

zeros = tf.zeros([2, 3])
print(zeros)
tf.Tensor(
[[0. 0. 0.]
 [0. 0. 0.]], shape=(2, 3), dtype=float32)

We have two rows and three columns. What if we want to turn it into three rows and two columns?

tf.reshape(zeros, [3, 2])
tf.Tensor(
[[0. 0.]
 [0. 0.]
 [0. 0.]], shape=(3, 2), dtype=float32)

You can use another helper function to swap rows and columns (transpose):

tf.transpose(zeros)
tf.Tensor(
[[0. 0.]
 [0. 0.]
 [0. 0.]], shape=(3, 2), dtype=float32)
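With a matrix of zeros, reshape and transpose produce identical-looking results, which hides the fact that they do different things. A small sketch with distinct values (made up for this example) shows the difference:

```python
import tensorflow as tf

m = tf.constant([[1, 2, 3],
                 [4, 5, 6]])

# reshape keeps the row-by-row order of the elements and only regroups them
reshaped = tf.reshape(m, [3, 2])

# transpose swaps the axes, so the element at [i, j] moves to [j, i]
transposed = tf.transpose(m)

print(reshaped.numpy())
print(transposed.numpy())
```

Both results have shape (3, 2), but only the transpose turns the original rows into columns.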

Tensor Math

Naturally, you would want to do something with your data. Let’s start with adding numbers:

a = tf.constant(1)
b = tf.constant(1)

tf.add(a, b).numpy()
2

That seems reasonable :) You can do the same thing using something more human friendly:

(a + b).numpy()

You can multiply Tensors like so:

c = a + b
c * c

And compute the dot product of matrices:

d1 = tf.constant([[1, 2], [1, 2]])
d2 = tf.constant([[3, 4], [3, 4]])

tf.tensordot(d1, d2, axes=1).numpy()
array([[ 9, 12],
       [ 9, 12]], dtype=int32)
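It is easy to mix up the * operator with the matrix product. The sketch below reuses the two matrices from above and compares element-wise multiplication with tf.matmul and the @ operator, which for 2-D Tensors should give the same result as tensordot with axes=1.

```python
import tensorflow as tf

d1 = tf.constant([[1, 2], [1, 2]])
d2 = tf.constant([[3, 4], [3, 4]])

# * multiplies matching entries
elementwise = d1 * d2

# matmul multiplies rows of d1 by columns of d2
matrix_product = tf.matmul(d1, d2)

# The @ operator is shorthand for the matrix product
with_operator = d1 @ d2

print(elementwise.numpy())
print(matrix_product.numpy())
```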

Sampling

You can also generate random numbers according to some famous probability distributions. Let’s start with Normal:

norm = tf.random.normal(shape=(1000, 1), mean=0., stddev=1.)

We can sample from the Uniform distribution in the same way:

unif = tf.random.uniform(shape=(1000, 1), minval=0, maxval=100)

Let’s have a look at something a tad more exotic - the Poisson distribution. It is popular for modeling the number of times an event occurs in some time interval. It is governed by a single parameter, λ (passed as lam), which is the expected number of occurrences.

pois = tf.random.poisson(shape=(1000, 1), lam=0.8)

The Gamma distribution is continuous. It has 2 parameters: the shape (alpha) and the rate (beta, the inverse of the scale), which defaults to 1 when omitted. It is used to model strictly positive continuous variables with skewed distributions.

gam = tf.random.gamma(shape=(1000, 1), alpha=0.8)
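One way to sanity-check samples is to compare their empirical mean with the theoretical one. The sketch below draws more samples than above (10,000, an arbitrary choice to reduce noise) and assumes the default rate of 1 for the Gamma distribution, so both the Poisson and the Gamma means should land near 0.8.

```python
import tensorflow as tf

tf.random.set_seed(42)
n = 10000  # arbitrary sample size for this illustration

pois = tf.random.poisson(shape=(n, 1), lam=0.8)
gam = tf.random.gamma(shape=(n, 1), alpha=0.8)  # beta (rate) left at its default
unif = tf.random.uniform(shape=(n, 1), minval=0, maxval=100)

# Theoretical means: Poisson -> lam, Gamma -> alpha / beta, Uniform -> (min + max) / 2
print("poisson mean:", tf.reduce_mean(pois).numpy())
print("gamma mean:  ", tf.reduce_mean(gam).numpy())
print("uniform mean:", tf.reduce_mean(unif).numpy())
```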

Simple Linear Regression Model

Let’s build a Simple Linear Regression model to predict the stopping distance of cars based on their speed. The data comes from here: https://vincentarelbundock.github.io/Rdatasets/datasets.html. It is given by this Tensor:

data = tf.constant([
  [4,2],
  [4,10],
  [7,4],
  [7,22],
  [8,16],
  [9,10],
  [10,18],
  [10,26],
  [10,34],
  [11,17],
  [11,28],
  [12,14],
  [12,20],
  [12,24],
  [12,28],
  [13,26],
  [13,34],
  [13,34],
  [13,46],
  [14,26],
  [14,36],
  [14,60],
  [14,80],
  [15,20],
  [15,26],
  [15,54],
  [16,32],
  [16,40],
  [17,32],
  [17,40],
  [17,50],
  [18,42],
  [18,56],
  [18,76],
  [18,84],
  [19,36],
  [19,46],
  [19,68],
  [20,32],
  [20,48],
  [20,52],
  [20,56],
  [20,64],
  [22,66],
  [23,54],
  [24,70],
  [24,92],
  [24,93],
  [24,120],
  [25,85]
])

We can extract the two columns using slicing:

speed = data[:, 0]
stopping_distance = data[:, 1]

Let’s have a look at the data:

It seems like a linear model can do a decent job of predicting the stopping distance. Simple Linear Regression finds a straight line that predicts the variable of interest based on a single predictor/feature.
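For a single feature, the best-fitting line can also be computed directly with the least-squares formulas. The sketch below does that on a tiny made-up dataset (not the car data) that lies exactly on y = 2x + 1. This is not what Keras does during training, since it uses gradient-based optimization, but it shows what "finding a straight line" means.

```python
import tensorflow as tf

# Hypothetical toy data that lies exactly on y = 2x + 1
x = tf.constant([1.0, 2.0, 3.0, 4.0])
y = tf.constant([3.0, 5.0, 7.0, 9.0])

x_mean = tf.reduce_mean(x)
y_mean = tf.reduce_mean(y)

# Least-squares estimates: slope = cov(x, y) / var(x)
slope = tf.reduce_sum((x - x_mean) * (y - y_mean)) / tf.reduce_sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

print(slope.numpy(), intercept.numpy())
```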

Time to build the model using the Keras API:

from tensorflow import keras
from tensorflow.keras import layers

lin_reg = keras.Sequential([
  layers.Dense(1, activation='linear', input_shape=[1]),
])

optimizer = tf.keras.optimizers.RMSprop(0.001)

lin_reg.compile(
  loss='mse',
  optimizer=optimizer,
  metrics=['mse']
)

We’re using the Sequential API with a single Dense layer: one unit (a weight and a bias) with linear activation. We’ll try to minimize the mean squared error (MSE) during training.
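To see that such a model really is just a line, the sketch below builds the same kind of layer, counts its parameters, and then sets them by hand. The values 2 and 1 are arbitrary choices for this illustration, not fitted to the car data.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(1, activation='linear', input_shape=[1]),
])

# One weight (the slope) plus one bias (the intercept)
print(model.count_params())

# Setting the parameters by hand turns the layer into y = 2x + 1
model.set_weights([np.array([[2.0]]), np.array([1.0])])

prediction = model(tf.constant([[3.0]]))
print(prediction.numpy())
```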

And for the training itself:

history = lin_reg.fit(
  x=speed,
  y=stopping_distance,
  shuffle=True,
  epochs=1000,
  validation_split=0.2,
  verbose=0
)

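We’re shuffling the training data on every epoch and reserving 20% for validation. Note that Keras takes the validation split from the last samples, before any shuffling. Our data is sorted by speed, so the validation set here contains the fastest cars. Let’s have a look at the training process: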
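The History object returned by fit keeps the per-epoch values you would plot. Here is a minimal sketch of reading them, using a tiny made-up dataset and only 5 epochs so it runs quickly. The data and the model name are assumptions for this example.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Made-up data on the line y = 3x, for illustration only
x = np.arange(20, dtype=np.float32).reshape(-1, 1)
y = 3.0 * x

model = keras.Sequential([layers.Dense(1, input_shape=[1])])
model.compile(loss='mse', optimizer=keras.optimizers.RMSprop(0.001))

history = model.fit(x, y, epochs=5, validation_split=0.2, verbose=0)

# history.history maps each tracked quantity to a list with one value per epoch
losses = history.history['loss']
val_losses = history.history['val_loss']
for epoch, (loss, val_loss) in enumerate(zip(losses, val_losses)):
    print(f"epoch {epoch}: loss={loss:.2f} val_loss={val_loss:.2f}")
```

Plotting these two lists against the epoch number gives the usual training curve.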

The model is steadily improving during training. That’s a good sign. What can we do with a more complex model?

Simple Neural Network Model

Keras (and TensorFlow) were designed as tools to build Neural Networks. Turns out, Neural Networks are good when a linear model isn’t enough. Let’s create one:

def build_neural_net():
  net = keras.Sequential([
    layers.Dense(32, activation='relu', input_shape=[1]),
    layers.Dense(16, activation='relu'),
    layers.Dense(1),
  ])

  optimizer = tf.keras.optimizers.RMSprop(0.001)

  # Accuracy is a classification metric, so we only track MSE for this regression task
  net.compile(loss='mse',
              optimizer=optimizer,
              metrics=['mse'])

  return net

Things look similar, except for the fact that we stack multiple layers on top of each other. We’re also using a different activation function - ReLU.
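ReLU itself is a very simple function: it keeps positive values and replaces negative ones with zero. A quick sketch with made-up inputs:

```python
import tensorflow as tf

x = tf.constant([-2.0, -0.5, 0.0, 1.5, 3.0])

# ReLU keeps positive values and replaces negative ones with zero
relu_out = tf.nn.relu(x)

# The same thing written out as an element-wise maximum
manual = tf.maximum(x, 0.0)

print(relu_out.numpy())
print(manual.numpy())
```

Without a non-linear activation like this between the layers, stacking Dense layers would still only produce a linear model.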

Training this model looks exactly the same:

net = build_neural_net()

history = net.fit(
  x=speed,
  y=stopping_distance,
  shuffle=True,
  epochs=1000,
  validation_split=0.2,
  verbose=0
)

Seems like we ain’t making much progress after epoch 200 or so. Can we not waste our time waiting for the whole training to complete?

Early Stopping

Sure, you can stop the training process manually at say epoch 200. But what if you train another model? What if you obtain more data?

You can use the built-in callback EarlyStopping to halt the training when some metric (e.g. the validation loss) stops improving. Let’s see how we can use it:

early_stop = keras.callbacks.EarlyStopping(
  monitor='val_loss',
  patience=10
)

We want to monitor the validation loss. With patience=10, training stops once the validation loss has gone 10 epochs in a row without improving. Pass the callback to fit:

net = build_neural_net()

history = net.fit(
  x=speed,
  y=stopping_distance,
  shuffle=True,
  epochs=1000,
  validation_split=0.2,
  verbose=0,
  callbacks=[early_stop]
)

Effectively, we’ve cut down the number of training epochs to ~120. Will this work that well every time? Not really. Using early stopping introduces yet another hyperparameter that you need to consider when training your model. Use it cautiously.
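To get a feel for how the patience setting changes the outcome, here is a simplified, framework-free sketch of the idea. It is not Keras's actual implementation (it ignores options such as a minimum improvement threshold), and the function name and loss values are made up.

```python
def stopping_epoch(val_losses, patience):
    """Return the index of the epoch at which training would stop,
    or None if the patience is never exhausted."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation losses: a plateau followed by a late improvement
losses = [5.0, 4.0, 3.0, 3.1, 3.2, 3.3, 2.0]

print(stopping_epoch(losses, patience=2))   # 4
print(stopping_epoch(losses, patience=10))  # None
```

With a patience of 2, training stops during the plateau and never sees the later improvement to 2.0. That is the trade-off you are tuning.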

Now your model is ready for the real world. How can you store it for later use?

Save/Restore Model

You can save the complete model (including weights) like this:

net.save('simple_net.h5')

And load it like this:

simple_net = keras.models.load_model('simple_net.h5')

You can use this mechanism to deploy your model and use it in production (for example).
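A quick way to convince yourself that the restored model is the same is to compare predictions before and after the round trip. The sketch below uses a small untrained network and a temporary directory. Both are assumptions for this example, and compile=False is passed only because this sketch never compiles the model.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# A small, untrained stand-in for the network built earlier
model = keras.Sequential([
    layers.Dense(4, activation='relu', input_shape=[1]),
    layers.Dense(1),
])

x = tf.constant([[4.0], [10.0], [25.0]])
before = model(x).numpy()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'simple_net.h5')
    model.save(path)
    # compile=False: there is no training configuration to restore here
    restored = keras.models.load_model(path, compile=False)

after = restored(x).numpy()

# The restored weights should reproduce the original predictions
print(np.allclose(before, after))
```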

Conclusion

You did it! You now know (a tiny bit) TensorFlow 2! Let’s recap what you’ve learned:

  • How to install TensorFlow 2
  • What is a Tensor
  • Doing Tensor math
  • Using probability distributions and sampling
  • Build a Simple Linear Regression model
  • Build a Simple Neural Network model
  • Save/restore a model


Stay tuned for more :)
