SemanticScuttle - klotz.me » Tags: clustering+nlp

Tags: clustering* + nlp*

0 bookmark(s) - Sort by: Date ↓ / Title /

OpenAI Embeddings and Clustering for Survey Analysis — A How-To Guide

A guide on how to use OpenAI embeddings and clustering techniques to analyze survey data and extract meaningful topics and actionable insights from the responses.

The process involves transforming textual survey responses into embeddings, grouping similar responses through clustering, and then identifying key themes or topics to aid in business improvement.

2024-10-26 Tags: embedding, clustering, survey analysis, data science, visualization, k-means, tsne by klotz

Mastering Customer Segmentation with LLM

Unlock advanced customer segmentation techniques using LLMs, and improve your clustering models with advanced techniques

2023-09-28 Tags: llm, segmentation, kmeans, k prototype, tsne, pca, sentence, embedding, clustering, exploratory data analysis by klotz

OpenAI Embeddings API Benchmarking

2022-01-31 Tags: openai, api, word embedding, embedding, nlp, use cases, similarity, clustering, topic modellinf by klotz

Topic Modeling with BERT. | Towards Data Science

2020-12-08 Tags: topic modeling, bert, classification, nlp, embedding, deep learning, tf-idf, clustering, marten grootendorst by klotz

Mapping the tech world with t-SNE - Towards Data Science

2020-01-16 Tags: t-sne, deep learning, nlp, text, clustering, dimensionality reduction, medium by klotz

Top 100 Films

ow can you learn about the underlying structure of documents in a way that is informative and intuitive? This basic motivating question led me on a journey to visualize and cluster documents in a two-dimensional space. What you see above is an output of an analytical pipeline that begin by gathering synopses on the top 100 films of all time and ended by analyzing the latent topics within each document. In between I ran significant manipulations on these synopses (tokenization, stemming), transformed them into a vector space model (tf-idf), and clustered them into groups (k-means). You can learn all about how I did this with my detailed guide to Document Clustering with Python. But first, what did I learn?

2016-06-02 Tags: lda, nlp, clustering, k-means, cosine similarity, imdb, movies, tf-idf by klotz

Document Clustering with Python

tokenizing and stemming each synopsis
transforming the corpus into vector space using tf-idf
calculating cosine distance between each document as a measure of similarity
clustering the documents using the k-means algorithm
using multidimensional scaling to reduce dimensionality within the corpus
plotting the clustering output using matplotlib and mpld3
conducting a hierarchical clustering on the corpus using Ward clustering
plotting a Ward dendrogram
topic modeling using Latent Dirichlet Allocation (LDA)

2018-08-16 Tags: lda, document, clustering, python, tf-idf, k-means, nlp, text by klotz

NLP Lecture 7 - Lexical Semantics and Word Embeddings

Word embeddings are suitable for use with neural network language models (as will be discussed later); they can also be used to enhance conventional (MEMM, CRF) models. The best ways to incorporate embeddings into such feature-based language models are still being explored. The simplest approach involves the direct use of the vector components as features (Turian et al 2010, Word Representations: A Simple and General Method for Semi-Supervised Learning, ACL 2010; Nguyen and Grishman, ACL 2014). Less direct approaches include building clusters from the embeddings and then using the clusters as features, or selecting prototypical examples of each type and then using similarity to these prototypes (based on embedding similarity) as features. Early results on NE tagging indicate a small advantage for the indirect methods (Guo et al., Revisiting embedding features for simple semi-supervised learning, EMNLP 2014). Models based on word embeddings are producing the best performance on named entity recognition (A. Passos et al, Lexicon Infused Phrase Embeddings for Named Entity Resolution, CoNLL 2014) and are effective for chunking (Turian et al ACL 2010).

2016-04-01 Tags: nlp, word embedding, word2vec, wordnet, clustering by klotz

First / Previous / Next / Last / Page 1 of 0

SemanticScuttle - klotz.me

Tags: clustering* + nlp*

Linked Tags

Related Tags