klotz: clustering

  1. The elbow curve and silhouette plots are both useful techniques for finding the optimal K for K-means clustering (see the K-selection sketch after this list).
  2. Comparing Clustering Algorithms
    The following table compares the clustering algorithms in scikit-learn by their main parameters, scalability, and the metric they use (a short comparison sketch follows this list).

    | Sr.No | Algorithm | Parameters | Scalability | Metric Used |
    |-------|-----------|------------|-------------|-------------|
    | 1 | K-Means | Number of clusters | Very large n_samples | Distance between points |
    | 2 | Affinity Propagation | Damping | Not scalable with n_samples | Graph distance |
    | 3 | Mean-Shift | Bandwidth | Not scalable with n_samples | Distance between points |
    | 4 | Spectral Clustering | Number of clusters | Medium n_samples, small n_clusters | Graph distance |
    | 5 | Hierarchical Clustering | Distance threshold or number of clusters | Large n_samples, large n_clusters | Distance between points |
    | 6 | DBSCAN | Neighborhood size | Very large n_samples, medium n_clusters | Nearest-point distance |
    | 7 | OPTICS | Minimum cluster membership | Very large n_samples, large n_clusters | Distance between points |
    | 8 | BIRCH | Threshold, branching factor | Large n_samples, large n_clusters | Euclidean distance between points |
  3. Word embeddings are suitable for use with neural network language models (as will be discussed later); they can also be used to enhance conventional (MEMM, CRF) models. The best ways to incorporate embeddings into such feature-based models are still being explored. The simplest approach is to use the vector components directly as features (Turian et al., Word Representations: A Simple and General Method for Semi-Supervised Learning, ACL 2010; Nguyen and Grishman, ACL 2014). Less direct approaches include building clusters from the embeddings and then using the cluster identities as features, or selecting prototypical examples of each type and then using similarity to these prototypes (based on embedding similarity) as features; a cluster-feature sketch follows this list. Early results on NE tagging indicate a small advantage for the indirect methods (Guo et al., Revisiting embedding features for simple semi-supervised learning, EMNLP 2014). Models based on word embeddings produce the best performance on named entity recognition (Passos et al., Lexicon Infused Phrase Embeddings for Named Entity Resolution, CoNLL 2014) and are effective for chunking (Turian et al., ACL 2010).
  4. Unlock advanced customer segmentation techniques using LLMs and improve your clustering models.
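
A minimal K-selection sketch for the first bookmark, assuming scikit-learn: compute the elbow curve (inertia) and silhouette score across candidate values of K, then pick the K where inertia stops dropping sharply and the silhouette peaks. The data and parameter values here are illustrative.

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Toy data standing in for a real dataset.
X, _ = make_blobs(n_samples=500, centers=4, random_state=42)

for k in range(2, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    sil = silhouette_score(X, km.labels_)
    # Look for the "elbow" where inertia stops dropping sharply,
    # and the K that maximizes the silhouette score.
    print(f"k={k}  inertia={km.inertia_:.1f}  silhouette={sil:.3f}")
```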
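
A comparison sketch for the second bookmark, assuming scikit-learn: the estimators below correspond to rows of the table, and their constructor arguments mirror the "Parameters" column. The data and parameter values are illustrative, not tuned.

```python
from sklearn.datasets import make_blobs
from sklearn import cluster

X, _ = make_blobs(n_samples=500, centers=3, random_state=0)

algorithms = {
    "KMeans":        cluster.KMeans(n_clusters=3, n_init=10, random_state=0),
    "MeanShift":     cluster.MeanShift(bandwidth=2.0),
    "Spectral":      cluster.SpectralClustering(n_clusters=3, random_state=0),
    "Agglomerative": cluster.AgglomerativeClustering(n_clusters=3),
    "DBSCAN":        cluster.DBSCAN(eps=0.8, min_samples=5),
    "BIRCH":         cluster.Birch(threshold=0.5, branching_factor=50, n_clusters=3),
}

for name, algo in algorithms.items():
    labels = algo.fit_predict(X)
    # DBSCAN labels noise points as -1, so exclude that label from the count.
    n_found = len(set(labels)) - (1 if -1 in labels else 0)
    print(f"{name:14s} found {n_found} clusters")
```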
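
A cluster-feature sketch for the third bookmark's indirect approach: cluster pretrained word embeddings and use each word's cluster id as a discrete feature for a feature-based (e.g. CRF) tagger. The tiny embedding dictionary is a hypothetical stand-in for real pretrained vectors.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical pretrained embeddings: word -> dense vector.
embeddings = {
    "paris":  np.array([0.9, 0.1, 0.0]),
    "london": np.array([0.8, 0.2, 0.1]),
    "apple":  np.array([0.1, 0.9, 0.2]),
    "banana": np.array([0.0, 0.8, 0.3]),
}

words = list(embeddings)
vectors = np.stack([embeddings[w] for w in words])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors)

# Map each word to its cluster id; the id is then used as a categorical
# feature alongside standard features (capitalization, suffixes, gazetteers).
cluster_feature = {w: int(c) for w, c in zip(words, km.labels_)}
print(cluster_feature)
```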
