In this segment, you will learn how to build neural networks using Keras, a deep learning library that you will primarily be using throughout this course.
Keras is a high-level library designed to work on top of Theano or TensorFlow. The main advantage of using Keras is its easy-to-use, minimalistic API, which lets you build and deploy deep learning models quickly. Due to this simplicity, Keras’s syntax and model-building pipeline are easy for beginners to learn (it does not compromise on flexibility, though: you can do almost everything with Keras that you can with pure TensorFlow or Theano).
You will be surprised to see how only a few lines of Python code in Keras can build and train complex, deep neural networks. In this segment, you will learn about the typical model-building process in Keras.
Building neural networks in Keras
Download the notebook file attached below. It contains the code that will be used in this segment.
Keras Installation
To install Keras, open your Jupyter Notebook and run the following command:
!pip install keras
To check the version of Keras on your system, execute the following commands (note that you need to import Keras first):
import keras
print(keras.__version__)
Note that you need to have either Theano or TensorFlow on your system in order to use Keras, so install one of the two before importing Keras; otherwise, it might throw an error. To install TensorFlow, if it is not already installed, execute the following command:
!pip install tensorflow
By default, Keras uses TensorFlow as the backend. If you wish to use Keras with Theano as the backend, you need to find the ‘keras.json’ file and change "backend": "tensorflow" to "backend": "theano" in the .json file.
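For reference, a typical ‘keras.json’ file (usually located at ~/.keras/keras.json) looks roughly like the sketch below; the exact fields may vary with your Keras version, and only the "backend" value needs changing:

```json
{
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano"
}
```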
Now that you know how to install Keras and TensorFlow, let’s take a look at the steps implemented by Keras for building a model.
There are six main steps involved in building a model using Keras:
1. Load the data:
We load the data using the following function, as defined in the notebook:
load_data()
Note that the shape of the matrices changes in the Keras implementation with respect to the NumPy implementation. The following lines of code take the transpose of each of the matrices read by the load_data() function and assign the matrices to the corresponding variables:
train_set_x = train_set_x.T
train_set_y = train_set_y.T
test_set_x = test_set_x.T
test_set_y = test_set_y.T
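To see what the transpose does, here is a minimal sketch assuming the data is originally stored with one column per example (features × samples), while Keras expects one row per example (samples × features); the shapes used are hypothetical stand-ins:

```python
import numpy as np

# Hypothetical example: 5 images of 784 pixels each,
# stored as (features, samples) in the NumPy implementation
train_set_x = np.zeros((784, 5))

# Keras expects (samples, features), so we transpose
train_set_x = train_set_x.T
print(train_set_x.shape)  # (5, 784)
```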
2. Define the model:
Typically, models in Keras are defined as a sequence of layers. So, we first need to create a model, ‘nn_model’, as follows:
nn_model = Sequential()
This model has no layers as of now; we can add as many layers to it as we want. Let’s add the first hidden layer. As we add a layer, we also need to specify the number of neurons in the layer and the activation function they will use. You will observe that we use the ‘Dense’ layer, which specifies that the layer is fully connected, i.e., every neuron in one layer is connected to every neuron in the next layer. The line of code that adds the layer to the model is as follows:
nn_model.add(Dense(35, input_dim=784, activation='relu'))
Here, as an example, ‘35’ denotes the number of neurons in the hidden layer and ‘784’ denotes the input size.
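You can verify the number of trainable parameters this layer introduces with a quick back-of-the-envelope calculation: each of the 35 neurons has 784 incoming weights plus 1 bias.

```python
# Parameter count of Dense(35, input_dim=784):
# each of the 35 neurons has 784 weights and 1 bias term
n_inputs, n_neurons = 784, 35
n_params = n_inputs * n_neurons + n_neurons
print(n_params)  # 27475
```

This matches the number reported for the layer by model.summary().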
You can find more types of layers here.
3. Compile the model:
After we have defined the model, we need to compile it. During this step, Keras uses the backend libraries to efficiently represent the model described above for training and prediction. We need to specify the loss function, the metrics and the optimiser in this step. For a multiclass classification problem, the loss function is the cross-entropy loss, which is 'categorical_crossentropy' in Keras. The metric and the optimiser we use are 'accuracy' and 'adam' (consider Adam an optimiser similar to gradient descent, which helps you find the optimum values of your model’s parameters). The line of code that compiles the model is as follows:
nn_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
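To build some intuition for what 'categorical_crossentropy' computes, here is a minimal NumPy sketch for a single example with a hypothetical one-hot label and predicted probabilities:

```python
import numpy as np

# Categorical cross-entropy for a single example:
# loss = -sum(y_true * log(y_pred)), with y_true one-hot encoded
y_true = np.array([0, 0, 1, 0])          # true class is index 2
y_pred = np.array([0.1, 0.1, 0.7, 0.1])  # predicted class probabilities
loss = -np.sum(y_true * np.log(y_pred))
print(round(loss, 4))  # 0.3567, i.e., -log(0.7)
```

The loss shrinks towards 0 as the probability assigned to the true class approaches 1.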
4. Fit the model:
The next step is to fit the model on the available data set. You need to pass the training data present in the variables ‘train_set_x’ and ‘train_set_y’ that you created earlier. You also need to pass the number of epochs you want the training to run for. Essentially, 1 epoch is 1 pass through the entire data set in mini-batches (we will discuss this in the subsequent segments). So, you also need to specify the mini-batch size (the number of data points to be sent through the neural network in one go):
nn_model.fit(train_set_x, train_set_y, epochs=10, batch_size=32)
Note that the training does not begin until we call the ‘nn_model.fit()’ function.
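The epochs and batch size together determine how many weight updates happen: one update per mini-batch. As a rough sketch, assuming a hypothetical training-set size of 60,000 examples:

```python
import math

# With mini-batches, one epoch consists of several training steps,
# one gradient update per batch (the last batch may be smaller)
n_samples = 60000   # hypothetical training-set size
batch_size = 32
steps_per_epoch = math.ceil(n_samples / batch_size)
print(steps_per_epoch)  # 1875
```

So with epochs=10, the model's weights would be updated 18,750 times in total.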
5. Evaluate the model:
In this step, we can see the accuracy score that the model finally achieved using the following commands:
scores_train = nn_model.evaluate(train_set_x, train_set_y)
print("\n%s: %.2f%%" % (nn_model.metrics_names[1], scores_train[1]*100))
To get the score on the test data, we can write the following code:
scores_test = nn_model.evaluate(test_set_x, test_set_y)
print("\n%s: %.2f%%" % (nn_model.metrics_names[1], scores_test[1]*100))
Note that we only changed the data set from train to test.
6. Make predictions:
Predictions can be obtained using the ‘.predict()’ function in the following manner:
predictions = nn_model.predict(test_set_x)
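For a classification model, ‘.predict()’ returns one probability per class for each example; the predicted class label is the index of the largest probability. A small NumPy sketch with hypothetical predictions:

```python
import numpy as np

# Hypothetical .predict() output for 2 test examples over 3 classes
predictions = np.array([[0.1, 0.8, 0.1],
                        [0.6, 0.3, 0.1]])

# The predicted class label is the index of the highest probability
labels = np.argmax(predictions, axis=1)
print(labels)  # [1 0]
```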
These are the steps involved in building a model in Keras. You must have observed that we did not need to write any code for feedforward or backpropagation, as we did when we implemented the neural network using TensorFlow; Keras eliminates all those efforts. Keras is used almost everywhere because it is both simple and flexible. You may want to refer to the complete documentation of Keras here.
Let’s proceed to the housing price prediction example for a demonstration on how to build and train a model. We will first perform data preprocessing of the housing data set before proceeding to model building using Keras.
Now that we have performed feature transformation on the data set, let’s watch the next video to learn how to create a model and train it using Keras.
In the videos above, you saw that the training model we built earlier in TensorFlow required just a few lines of code in Keras. Interestingly, in the Keras model, we were passing 32 rows of input data as a batch in each training step, unlike what you have seen before. We do this instead of passing the whole data set through each training step because loading the whole data set while training the model occupies a lot of memory and slows down training. Therefore, we split the data into ‘mini-batches’ (a common size is 32) and pass the batches one by one through the training step. This is both memory- and speed-efficient.
If the batch size is set to 1, it becomes stochastic gradient descent (which is shown in the TensorFlow code). If the batch size is set to the size of the data set, it is called batch gradient descent (where we pass the whole data set through training in one go). If the batch size is a fraction of the whole data set, such as 32 or 64, it is mini-batch gradient descent.
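How a data set is carved into mini-batches can be sketched in plain NumPy (Keras's fit() does this internally when you pass batch_size); the data here is a hypothetical stand-in:

```python
import numpy as np

# Hypothetical data set: 100 examples with 1 feature each
X = np.arange(100).reshape(100, 1)
batch_size = 32

# Slice the data set into consecutive mini-batches;
# the final batch holds whatever examples remain
batches = [X[i:i + batch_size] for i in range(0, len(X), batch_size)]
print([len(b) for b in batches])  # [32, 32, 32, 4]
```

With batch_size=1 this list would contain 100 single-example batches (stochastic gradient descent); with batch_size=100 it would contain a single batch (batch gradient descent).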
The whole process can be summarised as follows:
- Define a simple sequential model to set the hidden layer(s) and the output layer. The following code allows us to define the simple sequential model:
model = keras.Sequential([
    keras.layers.Dense(2, activation="sigmoid", input_shape=(X.shape[-1],)),
    keras.layers.Dense(1, activation="linear")
])
- Display the properties and dimensions of each layer of the neural network.
model.summary()
- Define the type of optimiser to update the weights and biases.
model.compile(optimizer=keras.optimizers.SGD(), loss=”mean_squared_error”)
- Fit all the components defined above into one line of code to train the neural network.
model.fit(X, Y.values, epochs=10, batch_size=32)
- Obtain predictions on different input data.
model.predict(X)[:,0]
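The forward pass of the small network above (a Dense layer with 2 sigmoid neurons followed by a Dense layer with 1 linear neuron) can be sketched in plain NumPy; the weights below are random stand-ins for the parameters Keras would learn during fit():

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))        # hypothetical data: 4 examples, 3 features

# Random stand-ins for the weights and biases Keras would learn
W1, b1 = rng.normal(size=(3, 2)), np.zeros(2)   # hidden layer: Dense(2, sigmoid)
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)   # output layer: Dense(1, linear)

hidden = sigmoid(X @ W1 + b1)      # hidden-layer activations
output = hidden @ W2 + b2          # linear output, as in the regression model
print(output.shape)  # (4, 1)
```

This is what model.predict(X) computes internally, only with the trained weights instead of random ones.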
Now, let’s see how we can use Keras for a problem on unstructured data, i.e., MNIST. We will use Keras to train an image classifier on it.
NOTE: The code can be run on Google Colab, Jupyter Notebook or other compatible platforms. However, Google Colab is preferable as it requires minimal initial setup.
Google Colab is a free cloud service and provides free GPU access. You can learn how to get started with Google Colab here.
The data sets can be downloaded below. (You can upload the zip file directly to Google Colab; the code to extract the data sets is already written in the notebook.)
Now that we have explored the MNIST data set, let’s build a model, train it and finally get predictions on the input data set using the trained model.
A visualisation of the model architecture used for training the neural network for the MNIST data set is given below.
Now, you know how to use Keras to build and train neural networks. You may want to change the values of different hyperparameters in the model and analyse how it affects model performance.
In the next segment, we will discuss different attributes involved in building neural networks in Keras.