We will use the same online retail case study and data set that we used for the K-Means algorithm. This time, we will create the customer segments with the hierarchical clustering algorithm.
We start from the point where data preparation is complete: the RFM data set has already been treated for missing values and outliers, and has been standardised.
Hierarchical clustering involves two basic steps:
- Creating the dendrogram.
- Cutting the dendrogram at an appropriate level.
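The two steps above can be sketched with SciPy as follows. This is a minimal illustration, not the case study's actual code: the array `rfm_scaled` is a synthetic stand-in for the standardised RFM data, and the choice of complete linkage and three clusters is an assumption made only for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, cut_tree

# Synthetic stand-in for the standardised RFM data (assumption):
# 60 "customers", 3 features, drawn around three well-separated centres.
rng = np.random.default_rng(0)
rfm_scaled = np.vstack([
    rng.normal(loc=-3.0, scale=0.2, size=(20, 3)),
    rng.normal(loc=0.0, scale=0.2, size=(20, 3)),
    rng.normal(loc=3.0, scale=0.2, size=(20, 3)),
])

# Step 1: create the dendrogram. linkage() records every merge;
# dendrogram() turns the merges into a tree layout.
merges = linkage(rfm_scaled, method="complete", metric="euclidean")
# no_plot=True returns only the layout data; drop it to draw the
# tree with matplotlib instead.
tree = dendrogram(merges, no_plot=True)

# Step 2: cut the dendrogram at the level that leaves 3 clusters.
cluster_ids = cut_tree(merges, n_clusters=3).reshape(-1)
print(np.bincount(cluster_ids))
```

In practice the cut level is chosen by inspecting the dendrogram; `cut_tree` also accepts a `height` argument if you prefer to cut at a distance rather than at a cluster count.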
Now let’s apply the single linkage method to cluster this dataset.
As you can see, single linkage doesn’t produce clusters that are distinct enough for us to analyse. Hence, we switch to the complete linkage method and analyse the clusters once again.
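One way to compare the two linkage methods is to cut both trees into the same number of clusters and look at the cluster sizes. The sketch below does this on synthetic data (an assumption, standing in for the standardised RFM data); the number of clusters, 3, is also chosen only for illustration. On a single noisy cloud like this one, single linkage often puts nearly every point in one cluster and peels off a few isolated points, whereas complete linkage tends to give more even groups, but the exact sizes depend on the data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cut_tree

# Synthetic stand-in for the standardised RFM data (assumption).
rng = np.random.default_rng(42)
rfm_scaled = rng.normal(size=(200, 3))


def cluster_sizes(data, method, n_clusters=3):
    """Return the number of points in each cluster for one linkage method."""
    merges = linkage(data, method=method, metric="euclidean")
    labels = cut_tree(merges, n_clusters=n_clusters).reshape(-1)
    return np.bincount(labels, minlength=n_clusters)


single_sizes = cluster_sizes(rfm_scaled, "single")
complete_sizes = cluster_sizes(rfm_scaled, "complete")

print("single linkage cluster sizes:  ", single_sizes)
print("complete linkage cluster sizes:", complete_sizes)
```

Single linkage measures the distance between two clusters by their closest pair of points, while complete linkage uses the farthest pair, which is why the two methods can produce very different trees on the same data.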
Once we have the cluster ID for each customer, we append the cluster IDs to the RFM data set and analyse the characteristics of each cluster to derive business insights about the different customer segments, in the same way as we did for the K-Means algorithm.
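Appending the cluster IDs and profiling each cluster might look like the sketch below. Everything here is illustrative: the data frame is randomly generated, and the column names `CustomerID`, `Recency`, `Frequency`, `Monetary` and `ClusterID`, as well as the choice of three clusters, are assumptions rather than the names and values used in the case study.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, cut_tree

# Randomly generated stand-in for the RFM data set (assumption).
rng = np.random.default_rng(1)
rfm = pd.DataFrame({
    "CustomerID": np.arange(1, 31),
    "Recency": rng.integers(1, 365, size=30),
    "Frequency": rng.integers(1, 50, size=30),
    "Monetary": rng.uniform(10, 5000, size=30),
})
features = ["Recency", "Frequency", "Monetary"]

# Standardise the features (zero mean, unit variance) before clustering.
rfm_scaled = (rfm[features] - rfm[features].mean()) / rfm[features].std()

# Complete linkage, then cut the tree into 3 clusters.
merges = linkage(rfm_scaled.to_numpy(), method="complete")
rfm["ClusterID"] = cut_tree(merges, n_clusters=3).reshape(-1)

# Profile each cluster on the original (unscaled) RFM values.
profile = rfm.groupby("ClusterID")[features].mean()
profile["Customers"] = rfm.groupby("ClusterID").size()
print(profile)
```

Profiling on the unscaled values keeps the averages in business units (days, number of orders, amount spent), which makes the segments easier to interpret.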
Now look at the following dendrogram and answer the questions that follow.