Category: Clustering Algorithms In ML
-
Agglomerative Clustering
Agglomerative clustering is a hierarchical clustering algorithm that starts with each data point as its own cluster and iteratively merges the closest clusters until a stopping criterion is reached. It is a bottom-up approach that produces a dendrogram, which is a tree-like diagram that shows the hierarchical relationship between the clusters. The algorithm can be…
-
Distribution-Based Clustering
Distribution-based clustering algorithms, also known as probabilistic clustering algorithms, are a class of machine learning algorithms that assume that the data points are generated from a mixture of probability distributions. These algorithms aim to identify the underlying probability distributions that generate the data, and use this information to cluster the data into groups with similar…
-
Affinity Propagation
Affinity Propagation is a clustering algorithm that identifies “exemplars” in a dataset and assigns each data point to one of these exemplars. It is a type of clustering algorithm that does not require a pre-specified number of clusters, making it a useful tool for exploratory data analysis. Affinity Propagation was introduced by Frey and Dueck…
-
BIRCH Clustering
BIRCH (Balanced Iterative Reducing and Clustering hierarchies) is a hierarchical clustering algorithm that is designed to handle large datasets efficiently. The algorithm builds a treelike structure of clusters by recursively partitioning the data into subclusters until a stopping criterion is met. BIRCH uses two main data structures to represent the clusters: Clustering Feature (CF) and…
-
HDBSCAN Clustering
Working of HDBSCAN Clustering HDBSCAN builds a hierarchy of clusters using a mutual-reachability graph, which is a graph where each data point is a node and the edges between them are weighted by a measure of similarity or distance. The graph is built by connecting two points with an edge if their mutual reachability distance…
-
OPTICS Clustering
OPTICS is like DBSCAN (Density-Based Spatial Clustering of Applications with Noise), another popular density-based clustering algorithm. However, OPTICS has several advantages over DBSCAN, including the ability to identify clusters of varying densities, the ability to handle noise, and the ability to produce a hierarchical clustering structure. Implementation of OPTICS in Python To implement OPTICS clustering…
-
DBSCAN Clustering
The DBSCAN Clustering algorithm works as follows − Implementation in Python We can implement the DBSCAN algorithm in Python using the scikit-learn library. Here are the steps to do so − Load the dataset The first step is to load the dataset. We will use the make_moons function from the scikitlearn library to generate a toy dataset…
-
Density-Based Clustering
Density-based clustering is based on the idea that clusters are regions of high density separated by regions of low density. Popular Density-based Clustering Algorithms Here are the most common density-based clustering algorithms − DBSCAN Clustering The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is one of the most common density-based clustering algorithms. The DBSCAN algorithm…
-
Hierarchical Clustering
Hierarchical clustering is another unsupervised learning algorithm that is used to group together the unlabeled data points having similar characteristics. Hierarchical clustering algorithms falls into following two categories − Steps to Perform Agglomerative Hierarchical Clustering We are going to explain the most used and important Hierarchical clustering i.e. agglomerative. The steps to perform the same…
-
Mean-Shift Clustering
The Mean-Shift clustering algorithm is a non-parametric clustering algorithm that works by iteratively shifting the mean of a data point towards the densest area of the data. The densest area of the data is determined by the kernel function, which is a function that assigns weights to the data points based on their distance from…