SemanticScuttle - klotz.me » klotz: nlp+text+classification

A GitHub Gist containing a Python script for text classification using the TxTail API

2024-07-13 Tags: gist, python, txtail, text classification, github, benchmark, llm, gpt, bert by klotz

openai-cookbook/Zero-shot_classification_with_embeddings.ipynb at main · openai/openai-cookbook · GitHub

Zero-Shot Classification
To perform zero-shot classification, we want to predict labels for our samples without any training. To do this, we can embed a short description of each label, such as "positive" and "negative", and then compare the cosine distance between the embedding of each sample and the embeddings of the label descriptions.
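A minimal sketch of that comparison step, assuming the embeddings have already been computed by some embedding model. The three-dimensional vectors and the helper names `cosine_similarity` and `predict_label` are hypothetical and chosen only for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def predict_label(sample_embedding, label_embeddings):
    """Return the label whose description embedding is closest to the sample."""
    return max(
        label_embeddings,
        key=lambda label: cosine_similarity(sample_embedding, label_embeddings[label]),
    )


# Hypothetical toy vectors standing in for embeddings of the label
# descriptions and of one sample; a real model would produce these.
label_embeddings = {
    "positive": [0.9, 0.1, 0.2],
    "negative": [0.1, 0.9, 0.2],
}
sample_embedding = [0.8, 0.3, 0.1]

predicted = predict_label(sample_embedding, label_embeddings)
print(predicted)
```

No training happens here: the only inputs are the label descriptions and the sample, which is what makes the approach zero-shot.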

Zero-shot classification with embeddings can lead to great results, especially when the labels are more descriptive than just simple words.

The label most similar to the sample input is the predicted label. We can also define a prediction score as the difference between the cosine distance to the positive label and the cosine distance to the negative label. This score can be used to plot a precision-recall curve, and choosing a different threshold on the score selects a different tradeoff between precision and recall.
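The score and threshold idea can be sketched as follows. This is an illustration under stated assumptions, not the notebook's code: the two-dimensional vectors, the ground-truth flags, and the helper names are hypothetical. The score is written here as a difference of cosine similarities, which equals the difference of cosine distances up to sign, since distance is 1 minus similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def prediction_score(sample, positive, negative):
    """Higher score means the sample sits closer to the positive label."""
    return cosine_similarity(sample, positive) - cosine_similarity(sample, negative)


def precision_recall_at(scores, truths, threshold):
    """Precision and recall when scores above `threshold` count as positive."""
    predicted = [s > threshold for s in scores]
    tp = sum(p and t for p, t in zip(predicted, truths))
    fp = sum(p and not t for p, t in zip(predicted, truths))
    fn = sum(not p and t for p, t in zip(predicted, truths))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy embeddings for the two label descriptions.
positive_label = [1.0, 0.0]
negative_label = [0.0, 1.0]

# Hypothetical samples with ground truth (True means actually positive).
samples = [[0.9, 0.1], [0.6, 0.4], [0.55, 0.45], [0.1, 0.9]]
truths = [True, True, False, False]

scores = [prediction_score(s, positive_label, negative_label) for s in samples]

# Sweeping the threshold traces out points on a precision-recall curve.
for threshold in (0.0, 0.5):
    precision, recall = precision_recall_at(scores, truths, threshold)
    print(threshold, precision, recall)
```

In this toy data, raising the threshold from 0.0 to 0.5 removes the one false positive but also drops one true positive, so precision rises while recall falls.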

2023-05-31 Tags: openai, classification, embedding, machine learning, zero-shot by klotz

Word embeddings | Text | TensorFlow

2022-11-10 Tags: embedding, doc2vec, tensorflow, word2vec, classification, neural network by klotz

Part E: Text Classification with an Embedding Layer in a Feed-Forward Network - Deep Learning Tutorials with Keras - Medium

2022-11-10 Tags: classification, doc2vec, embedding, neural network by klotz

Feature Extraction with BERT for Text Classification | by Marcello Politi | Jun, 2022 | Towards Data Science

2022-07-11 Tags: bert, classification, text, nlp, machine learning, deep learning, embedding by klotz

SBERT vs. Data2vec on Text Classification | by Jinhang Jiang | May, 2022 | Towards Data Science

Each time you run the model, the results may vary a little. Overall, after 5 tries, I can conclude that SBERT performs slightly better in terms of best F1 score, while Data2vec used far less memory. The average F1 scores for the two models are very close.

2022-05-19 Tags: sbert, data2vec, text, classification, multi-label, nlp, machine learning, towardsdatascience by klotz

From raw text to model prediction in under 30 lines of Python

A quick guide for fast exploration of NLP pipelines

2022-04-12 Tags: python, nlp, text, classification, machine learning, atom by klotz

Clustering sentence embeddings to identify intents in short text | by David Borrelli | Oct, 2021 | Towards Data Science

2021-10-20 Tags: embedding, intent, umap, hdbscan, classification, machine learning by klotz

Zero-Shot Intent Classification with Siamese Networks | by Duygu ALTINOK | Sep, 2021 | Towards Data Science

2021-09-25 Tags: bert, intent, classification, deep learning by klotz

Learning Text Classification Using the fastText Library

2021-09-20 Tags: text, classification, fasttext, python, machine learning, nlp by klotz