Author: Tayyaba Syed
-
Affinity Propagation
Affinity Propagation is a clustering algorithm that identifies “exemplars” in a dataset and assigns each data point to one of these exemplars. It does not require a pre-specified number of clusters, which makes it a useful tool for exploratory data analysis. Affinity Propagation was introduced by Frey and Dueck…
-
BIRCH Clustering
BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) is a hierarchical clustering algorithm designed to handle large datasets efficiently. The algorithm builds a tree-like structure of clusters by recursively partitioning the data into subclusters until a stopping criterion is met. BIRCH uses two main data structures to represent the clusters: Clustering Feature (CF) and…
-
HDBSCAN Clustering
Working of HDBSCAN Clustering
HDBSCAN builds a hierarchy of clusters using a mutual-reachability graph, which is a graph where each data point is a node and the edges between them are weighted by a measure of similarity or distance. The graph is built by connecting two points with an edge if their mutual reachability distance…
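As a rough illustration of the edge weights, the sketch below computes mutual reachability distances with NumPy, using the commonly cited definition: the maximum of the two points' core distances and their pairwise distance. The function name `mutual_reachability`, the parameter `k`, and the convention that the core distance is the distance to the k-th nearest neighbour (not counting the point itself) are assumptions made for this example; library implementations may count neighbours differently.

```python
import numpy as np

def mutual_reachability(X, k=2):
    """Pairwise mutual reachability distances (illustrative helper)."""
    # Pairwise Euclidean distances between all points
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Core distance: distance to the k-th nearest neighbour.
    # Column 0 of each sorted row is the point itself (distance 0).
    core = np.sort(d, axis=1)[:, k]
    # max(core(a), core(b), d(a, b)) for every pair (a, b)
    mr = np.maximum(d, np.maximum(core[:, None], core[None, :]))
    np.fill_diagonal(mr, 0.0)
    return mr

# Four points on a line; the last one is isolated
X = np.array([[0.0], [1.0], [2.0], [10.0]])
mr = mutual_reachability(X, k=2)
print(mr)
```

Points in sparse regions have large core distances, so every edge that touches them is stretched, which is what pushes isolated points away from dense clusters in the resulting graph.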
-
OPTICS Clustering
OPTICS is similar to DBSCAN (Density-Based Spatial Clustering of Applications with Noise), another popular density-based clustering algorithm. However, OPTICS has several advantages over DBSCAN, including the ability to identify clusters of varying densities, the ability to handle noise, and the ability to produce a hierarchical clustering structure.
Implementation of OPTICS in Python
To implement OPTICS clustering…
-
DBSCAN Clustering
The DBSCAN Clustering algorithm works as follows −
Implementation in Python
We can implement the DBSCAN algorithm in Python using the scikit-learn library. Here are the steps to do so −
Load the dataset
The first step is to load the dataset. We will use the make_moons function from the scikit-learn library to generate a toy dataset…
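A minimal sketch of this workflow with scikit-learn is shown here. The sample size, noise level, `eps`, and `min_samples` values are assumptions chosen for illustration and are not taken from the article.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Generate a toy two-moons dataset (parameter values are assumed)
X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)

# Fit DBSCAN; eps and min_samples usually need tuning per dataset
db = DBSCAN(eps=0.2, min_samples=5).fit(X)
labels = db.labels_  # label -1 marks points treated as noise

# Count clusters, ignoring the noise label if present
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int((labels == -1).sum())
print("clusters:", n_clusters, "noise points:", n_noise)
```

Because DBSCAN works from a fixed neighbourhood radius, the result is sensitive to `eps`: a value that is too small marks most points as noise, and one that is too large merges the two moons.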
-
Density-Based Clustering
Density-based clustering is based on the idea that clusters are regions of high density separated by regions of low density.
Popular Density-based Clustering Algorithms
Here are the most common density-based clustering algorithms −
DBSCAN Clustering
The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is one of the most common density-based clustering algorithms. The DBSCAN algorithm…
-
Hierarchical Clustering
Hierarchical clustering is another unsupervised learning algorithm used to group unlabeled data points that have similar characteristics. Hierarchical clustering algorithms fall into the following two categories −
Steps to Perform Agglomerative Hierarchical Clustering
We are going to explain the most widely used form of hierarchical clustering, i.e. agglomerative clustering. The steps to perform the same…
-
Mean-Shift Clustering
The Mean-Shift clustering algorithm is a non-parametric clustering algorithm that works by iteratively shifting each data point towards the densest nearby area of the data. The densest area of the data is determined by the kernel function, which is a function that assigns weights to the data points based on their distance from…
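The shifting step can be sketched in a few lines of NumPy. This example assumes a Gaussian kernel and a hand-picked bandwidth; the function name `shift_to_mode` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def shift_to_mode(x, X, bandwidth=1.0, n_iter=50, tol=1e-5):
    """Move one point uphill to a density mode (illustrative helper)."""
    for _ in range(n_iter):
        # Gaussian kernel weights: nearby points count more
        w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * bandwidth ** 2))
        # The new position is the weighted mean of all data points
        x_new = (w[:, None] * X).sum(axis=0) / w.sum()
        shift = np.linalg.norm(x_new - x)
        x = x_new
        if shift < tol:  # stop once the point barely moves
            break
    return x

# Two synthetic blobs centred near (0, 0) and (5, 5)
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(5.0, 0.5, size=(50, 2))])

mode_a = shift_to_mode(X[0], X)   # starts inside the first blob
mode_b = shift_to_mode(X[50], X)  # starts inside the second blob
print(mode_a, mode_b)
```

Points that end up at (approximately) the same mode are treated as members of the same cluster, so the number of clusters follows from the bandwidth instead of being set in advance.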
-
K-Medoids Clustering
K-Medoids Clustering – Algorithm
The K-medoids clustering algorithm can be summarized as follows −
Implementation in Python
To implement K-medoids clustering in Python, we can use the scikit-learn-extra library, a companion package to scikit-learn. It provides the KMedoids class (sklearn_extra.cluster.KMedoids), which can be used to perform K-medoids clustering on a dataset. First, we need to import the required libraries − Next,…
-
K-Means Clustering
The K-Means algorithm can be summarized in the following steps −
The K-Means algorithm is straightforward and efficient, and it can handle large datasets. However, it has some limitations, such as its sensitivity to the initial centroids, its tendency to converge to local optima, and its assumption of equal variance for all clusters.…
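The usual K-Means loop (often called Lloyd's algorithm) can be sketched in NumPy as follows. This is a minimal illustration, not the article's implementation: the function name `kmeans`, the random initialisation from data points, and the synthetic data are all assumptions.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-Means with random initial centroids (illustrative helper)."""
    rng = np.random.default_rng(seed)
    # Initialise: pick k distinct data points as starting centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):  # converged
            break
        centroids = new_centroids
    return labels, centroids

# Two well-separated synthetic blobs
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(5.0, 0.5, size=(50, 2))])
labels, centroids = kmeans(X, k=2)
print(centroids)
```

Changing `seed` changes the starting centroids, which is an easy way to observe the sensitivity to initialisation mentioned above on less cleanly separated data.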