Friday, 15 January 2021

Hierarchical clustering algorithms

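Hierarchical clustering algorithms group similar objects into clusters and arrange those clusters in a hierarchy. There are two types of hierarchical clustering algorithms: agglomerative (bottom-up), which starts with every object in its own cluster and repeatedly merges the closest pair, and divisive (top-down), which starts with a single cluster and recursively splits it.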


Some pros and cons of Hierarchical Clustering

Pros

Cons

How it works
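
The bottom-up (agglomerative) procedure can be sketched in a few lines: start with every point in its own cluster, then repeatedly merge the two closest clusters until the desired number remains. The code below is a deliberately naive illustration, not how scikit-learn or SciPy implement it. The function name `naive_agglomerative`, the choice of single linkage, and the toy points are all assumptions made for this example.

```python
import numpy as np

def naive_agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain (single linkage)."""
    points = np.asarray(points, dtype=float)
    # Start with each point in its own cluster (clusters hold point indices).
    clusters = [[i] for i in range(len(points))]
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    while len(clusters) > n_clusters:
        best = (None, None, np.inf)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = dist[np.ix_(clusters[a], clusters[b])].min()
                if d < best[2]:
                    best = (a, b, d)
        a, b, _ = best
        # Merge cluster b into cluster a and drop b.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical toy data: two close pairs and one isolated point.
toy_points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [10.0, 0.0]]
result = naive_agglomerative(toy_points, n_clusters=3)
print(result)  # [[0, 1], [2, 3], [4]]
```

The nested loops make this far too slow for real data; it is only meant to make the merge step concrete.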


Dendrograms
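
A dendrogram records the order in which clusters were merged and the distance at which each merge happened. Cutting the tree at a chosen height yields a flat clustering. As a minimal sketch, assuming SciPy is installed and using made-up toy points, the example below builds the merge tree with `scipy.cluster.hierarchy.linkage`, computes the dendrogram layout without drawing it (`no_plot=True`), and cuts the tree into two clusters with `fcluster`.

```python
import numpy as np
import scipy.cluster.hierarchy as sch

# Toy data (an assumption for illustration): two tight groups far apart.
toy_points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                       [10.0, 10.0], [10.0, 11.0]])

# Each row of Z describes one merge: the two cluster ids joined,
# the distance between them, and the size of the new cluster.
Z = sch.linkage(toy_points, method='ward')

# Compute the dendrogram layout without plotting.
tree = sch.dendrogram(Z, no_plot=True)
print(tree['ivl'])  # leaf labels in left-to-right order

# Cut the tree so that at most two flat clusters remain.
flat_labels = sch.fcluster(Z, t=2, criterion='maxclust')
print(flat_labels)
```

With n points, `Z` always has n - 1 rows, one per merge.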


Linkage Criteria
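
The linkage criterion decides how the distance between two clusters is measured. As a minimal sketch, assuming the standard definitions and two made-up clusters `A` and `B`, the snippet below computes the single, complete, and average linkage distances with NumPy, plus the increase in within-cluster sum of squares that Ward linkage tries to keep small when merging.

```python
import numpy as np

# Two toy clusters on a line (assumed values for illustration).
A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[4.0, 0.0], [6.0, 0.0]])

# All pairwise Euclidean distances between members of A and members of B.
pairwise = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

single = pairwise.min()     # closest pair of points: 3.0
complete = pairwise.max()   # farthest pair of points: 6.0
average = pairwise.mean()   # mean over all pairs: 4.5

# Ward: increase in within-cluster sum of squares caused by the merge,
# which equals |A||B| / (|A| + |B|) * ||mean(A) - mean(B)||^2.
centroid_gap = np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))
ward_increase = len(A) * len(B) / (len(A) + len(B)) * centroid_gap ** 2  # 20.25

print(single, complete, average, ward_increase)
```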


Single Linkage


Complete Linkage


Average Linkage


Ward Linkage


Distance Metric
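
The distance metric defines how far apart two individual points are. As a minimal sketch, assuming two made-up points `p` and `q`: Euclidean distance is the straight-line distance (the square root of the summed squared coordinate differences), and Manhattan distance is the sum of the absolute coordinate differences.

```python
import numpy as np

# Hypothetical points chosen so the arithmetic is easy to follow.
p = np.array([1.0, 2.0])
q = np.array([4.0, 6.0])

euclidean = np.sqrt(np.sum((p - q) ** 2))   # sqrt(3^2 + 4^2) = 5.0
manhattan = np.sum(np.abs(p - q))           # |3| + |4| = 7.0
print(euclidean, manhattan)
```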

Euclidean Distance


Manhattan Distance


Example in Python

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from sklearn.cluster import AgglomerativeClustering
import scipy.cluster.hierarchy as sch
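# Load the dataset and keep the columns at positions 3 and 4 as the two features.
dataset = pd.read_csv('./data.csv')

X = dataset.iloc[:, [3, 4]].values
# Draw the dendrogram (Ward linkage) in its own figure to help pick the number of clusters.
dendrogram = sch.dendrogram(sch.linkage(X, method='ward'))
plt.show()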
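# Ward linkage only supports Euclidean distance, which is the default, so the
# `affinity` argument (removed in recent scikit-learn releases) is omitted.
model = AgglomerativeClustering(n_clusters=5, linkage='ward')
model.fit(X)
labels = model.labels_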
# Plot each of the five clusters in its own color.
colors = ['red', 'blue', 'green', 'purple', 'orange']
for k, color in enumerate(colors):
    plt.scatter(X[labels == k, 0], X[labels == k, 1], s=50, marker='o', color=color)
plt.show()
