In the previous segment, you started exploring the mathematical capabilities of TensorFlow. Now, in this segment, you will look at an important capability of TensorFlow that is central to ML tasks: calculating gradients.
TensorFlow can find the gradients of equations automatically. Since almost all ML algorithms make use of gradients in the learning process, this ability plays a crucial role in building custom training algorithms. But before you learn about automatic gradient calculation, you need to understand another concept: computational graphs.
A computational graph is a representation of the mathematical operations in a TensorFlow program. The forthcoming video will introduce you to the concept of computational graphs.
Tensors and computational graphs are the two fundamental concepts in TensorFlow. You know that tensors are the data structures used to store data in TensorFlow. Computational graphs, in contrast, describe the ‘flow’ of a tensor through the different mathematical operations it will undergo. In a computational graph, the edges represent the data (in the form of tensors), whereas the nodes represent the mathematical operations that need to be performed on the tensors.
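To make this concrete, here is a minimal sketch of how a computational graph can be inspected in TensorFlow 2.x. The function name affine and the input values are illustrative only; wrapping a Python function in tf.function makes TensorFlow trace it into a graph whose operations can then be listed.

```python
import tensorflow as tf

# A small function that TensorFlow can trace into a computational graph.
# (The name 'affine' and the inputs are purely illustrative.)
@tf.function
def affine(x, w, b):
    return w * x + b  # two operation nodes: a multiplication followed by an addition

# Tracing the function with concrete inputs builds the underlying graph.
concrete = affine.get_concrete_function(
    tf.constant(1.0), tf.constant(2.0), tf.constant(3.0)
)

# Each node of the graph is an operation; the edges between them carry tensors.
for op in concrete.graph.get_operations():
    print(op.name, op.type)
```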
Computational graphs have a few benefits, which include the following:
- Visualisation: Computational graphs help with visualising algorithms, which in turn makes it easier to develop and maintain complex algorithms. This is especially helpful in the case of neural networks since neural network models are quite complex.
- Gradient calculations: An important ability of TensorFlow is to calculate gradients. Computational graphs are used to trace the dependencies of the variables on one another. As you saw in the video, a path is first traced from the dependent variable back to the independent variable, and then the intermediate gradients along this path are calculated and combined using the chain rule of differentiation to obtain the required gradient (see the sketch after this list).
- Distributed architecture: Computational graphs also help with distributing the training process on a cluster of machines. In one of the upcoming segments, you will learn how this distributed architecture works exactly.
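Here is a tiny hand-worked sketch of the chain-rule idea mentioned above, with illustrative values and a two-step graph path. The GradientTape() functionality introduced in the next video automates exactly this bookkeeping.

```python
# Illustrative graph path: x -> z -> y, where z = 3 * x and y = z ** 2.
# The chain rule multiplies the intermediate gradients along the path.

x = 1.5
z = 3.0 * x            # intermediate node of the graph
dz_dx = 3.0            # gradient along the first edge, d(3x)/dx
dy_dz = 2.0 * z        # gradient along the second edge, d(z**2)/dz = 9.0
dy_dx = dy_dz * dz_dx  # chain rule: 9.0 * 3.0 = 27.0
print(dy_dx)           # 27.0
```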
Now, in the next video, you will learn about the GradientTape() functionality, which helps with gradient calculation in TensorFlow.
The gradient of any function can be calculated by following these steps:
- Initialise the independent variables. The independent variables, i.e., the ones with respect to which you will differentiate, need to be tf.Variable-type tensors for the gradient tape to track them by default.
- Create a GradientTape() context and record, inside it, the equations that relate the different variables.
- To find the derivative of an equation recorded in the gradient tape context, call .gradient() outside the context and pass in the dependent variable (the one to differentiate) followed by the independent variable (the one with respect to which the differentiation will occur). These three steps are illustrated in the sketch below.
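The following is a minimal sketch of these three steps, assuming TensorFlow 2.x imported as tf; the equation y = x^2 + 3x and the point x = 2.0 are chosen purely for illustration.

```python
import tensorflow as tf

# Step 1: initialise the independent variable as a tf.Variable.
x = tf.Variable(2.0)

# Step 2: record the equation inside the GradientTape() context.
with tf.GradientTape() as tape:
    y = x ** 2 + 3 * x

# Step 3: call .gradient() outside the context, passing the dependent
# variable first and then the independent variable.
dy_dx = tape.gradient(y, x)

# Analytically dy/dx = 2x + 3, so at x = 2.0 the gradient is 7.0.
print(dy_dx)  # tf.Tensor(7.0, shape=(), dtype=float32)
```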
GradientTape() internally calculates the value of the gradient at the given value of the independent variable. Gradient tape has a few other capabilities as well. You will learn about them in the next segment. For now, attempt these questions based on your learning from this segment.
Moving ahead, in the next segment, you will learn how to compute partial derivatives using GradientTape().