klotz: nlp* + text* + classification*


  1. A GitHub Gist containing a Python script for text classification using the TxTail API
  2. Zero-Shot Classification
    To perform zero-shot classification, we want to predict labels for our samples without any training. To do this, we can embed a short description of each label, such as positive and negative, and then compare the cosine distance between the sample embeddings and the label-description embeddings.

    Zero-shot classification with embeddings can lead to great results, especially when the labels are more descriptive than just simple words.

    The label whose embedding is most similar to the sample input is the predicted label. We can also define a prediction score as the difference between the cosine distance to the positive label and the cosine distance to the negative label. This score can be used to plot a precision-recall curve, from which a different tradeoff between precision and recall can be chosen by selecting a different threshold.
  3. Each time you run the model, the results may vary slightly. Overall, after 5 tries, I can conclude that SBERT performs slightly better in terms of best F1 score, while Data2vec used far less memory. The average F1 scores of the two models are very close.
  4. A quick guide for fast exploration of NLP pipelines
  5. 2021-09-25, by klotz
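
The zero-shot procedure described in item 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the label descriptions are invented for the example, and the names `prediction_score`, `classify`, and `precision_recall` are hypothetical. The score is written as a difference of cosine similarities, which equals the difference of cosine distances with the sign arranged so that higher means more positive.

```python
import math
from collections import Counter

# Assumption: made-up label descriptions, not taken from the bookmarked article.
LABEL_DESCRIPTIONS = {
    "positive": "a positive review great good love excellent",
    "negative": "a negative review bad terrible hate awful",
}


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def prediction_score(text):
    """Similarity to the positive label minus similarity to the negative label."""
    sample = embed(text)
    sim_pos = cosine_similarity(sample, embed(LABEL_DESCRIPTIONS["positive"]))
    sim_neg = cosine_similarity(sample, embed(LABEL_DESCRIPTIONS["negative"]))
    return sim_pos - sim_neg


def classify(text, threshold=0.0):
    """Predict 'positive' when the score exceeds the threshold."""
    return "positive" if prediction_score(text) > threshold else "negative"


def precision_recall(scores, labels, threshold):
    """Precision and recall of 'score > threshold' against boolean labels."""
    predicted = [s > threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Evaluating `precision_recall` over a range of thresholds gives the points of a precision-recall curve; raising the threshold generally trades recall for precision. With a real embedding model, only `embed` and `cosine_similarity` would need to change.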
