Tags: huggingface* + llm*

0 bookmark(s) - Sort by: Date ↓ / Title /

  1. HuggingFace has released FineWeb, a new large-scale dataset consisting of 15 trillion tokens and 44TB of disk space designed for pretraining large language models (LLMs). The dataset, which leverages data from CommonCrawl, undergoes rigorous deduplication and quality filtering processes, making it a valuable tool for researchers.
    2024-06-04 Tags: , , , , by klotz
  2. This article explains how to use the Sentence Transformers library to finetune and train embedding models for a variety of applications, such as retrieval augmented generation, semantic search, and semantic textual similarity. It covers the training components, dataset format, loss function, training arguments, evaluators, and trainer.
  3. This model was built using a new Smaug recipe for improving performance on real world multi-turn conversations applied to meta-llama/Meta-Llama-3-70B-Instruct.

    The model outperforms Llama-3-70B-Instruct substantially, and is on par with GPT-4-Turbo, on MT-Bench (see below).
    2024-05-21 Tags: , , , , , , by klotz
  4. python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I love you'))"
  5. Not Mixtral MoE but Merge-kit MoE

    EveryoneLLM series of models are a new Mixtral type model created using experts that were finetuned by the community, for the community. This is the first model to release in the series and it is a coding specific model. EveryoneLLM, which will be a more generalized model, will be released in the near future after more work is done to fine tune the process of merging Mistral models into a larger Mixtral models with greater success.

    The goal of the EveryoneLLM series of models is to be a replacement or an alternative to Mixtral-8x7b that is more suitable for general and specific use, as well as easier to fine tune. Since Mistralai is being secretive about the "secret sause" that makes Mixtral-Instruct such an effective fine tune of the Mixtral-base model, I've decided its time for the community to directly compete with Mistralai on our own.
  6. - create a custom base image for a Cloud Workstation environment using a Dockerfile
    . Uses:

    Quantized models from
  7. A deep dive into model quantization with GGUF and llama.cpp and model evaluation with LlamaIndex

Top of the page

First / Previous / Next / Last / Page 1 of 0 SemanticScuttle - klotz.me: tagged with "huggingface+llm"

About - Propulsed by SemanticScuttle