The result of the cluster analysis is shown by a dendrogram, which starts with each data point as a separate cluster and indicates the level of dissimilarity at which any two clusters were joined.

As you saw, the y-axis of the dendrogram is some measure of the dissimilarity or distance at which clusters join.

In the dendrogram shown above, samples 4 and 5 are the most similar and join to form the first cluster, followed by samples 1 and 10. The last two clusters to fuse together to form the final single cluster are 3-6 and 4-5-2-7-1-10-9-8.

Determining the number of groups in a cluster analysis is often the primary goal. Typically, one looks for natural groupings defined by long stems. Here, by observation, you can identify that there are 3 major groupings: 3-6, 4-5-2-7 and 1-10-9-8.

You also saw that hierarchical clustering can proceed in two ways: agglomerative and divisive. If you start with n distinct clusters and iteratively merge them until only one cluster remains, it is called agglomerative clustering. Conversely, if you start with one big cluster and keep partitioning it until you reach n clusters, each containing one element, it is called divisive clustering.
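The agglomerative procedure can be sketched with SciPy's `linkage` function. This is a minimal illustration on a small made-up dataset (not the ten samples from the dendrogram above): starting from n singleton clusters, `linkage` performs n − 1 merges, recording one merge per row of its output.

```python
# Minimal sketch of agglomerative clustering with SciPy.
# The four points below are illustrative assumptions, not data from the lesson.
import numpy as np
from scipy.cluster.hierarchy import linkage

points = np.array([[1.0, 1.0], [1.5, 1.0], [5.0, 5.0], [5.5, 5.2]])

# 'single' linkage merges, at each step, the two clusters with the
# smallest minimum pairwise distance between their points.
Z = linkage(points, method='single')

# Z has shape (n - 1, 4): each row is [cluster_i, cluster_j, distance, size].
print(Z.shape)
# The first merge joins points 0 and 1, which are 0.5 apart.
print(round(Z[0, 2], 2))
```

Passing `Z` to `scipy.cluster.hierarchy.dendrogram` would draw a tree like the one discussed above, with merge distances on the y-axis.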

## Additional Reference

You can read more about divisive clustering here and here.

**Comprehension – Hierarchical Clustering Algorithm**

Given below are five data points having two attributes x and y:

| Observation | X | Y |
|---|---|---|
| 1 | 3 | 2 |
| 2 | 3 | 5 |
| 3 | 5 | 3 |
| 4 | 6 | 4 |
| 5 | 6 | 7 |

The distance matrix of the points, indicating the Euclidean distance between points, is as follows:

| Label | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0.00 | 3.00 | 2.24 | 3.61 | 5.83 |
| 2 | 3.00 | 0.00 | 2.83 | 3.16 | 3.61 |
| 3 | 2.24 | 2.83 | 0.00 | 1.41 | 4.12 |
| 4 | 3.61 | 3.16 | 1.41 | 0.00 | 3.00 |
| 5 | 5.83 | 3.61 | 4.12 | 3.00 | 0.00 |

Take the distance between two clusters to be the minimum distance between the points in the two clusters (i.e., single linkage). Based on this information, answer the following questions.