Tags: llm*


  1. In this tutorial, we will build a RAG system with a self-querying retriever in the LangChain framework. This will enable us to filter the retrieved movies using metadata, thus providing more meaningful movie recommendations.
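The tutorial's code uses LangChain's SelfQueryRetriever; as a minimal plain-Python sketch of the underlying idea (metadata filters narrowing candidates before similarity ranking — the data and field names here are made up, not the tutorial's):

```python
# Hypothetical movie records with metadata, as a self-querying
# retriever might filter them before semantic ranking.
movies = [
    {"title": "Movie A", "genre": "sci-fi", "year": 2014},
    {"title": "Movie B", "genre": "comedy", "year": 1999},
    {"title": "Movie C", "genre": "sci-fi", "year": 1982},
]

def retrieve(genre=None, min_year=None):
    """Apply metadata filters the way a structured (self-)query would,
    returning the surviving candidates for downstream ranking."""
    results = movies
    if genre is not None:
        results = [m for m in results if m["genre"] == genre]
    if min_year is not None:
        results = [m for m in results if m["year"] >= min_year]
    return [m["title"] for m in results]

print(retrieve(genre="sci-fi", min_year=2000))  # ['Movie A']
```

In the actual framework, an LLM translates the user's natural-language request into this kind of structured filter automatically.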
  2. This article discusses Retrieval-Augmented Generation (RAG) models, a new approach that addresses the limitations of traditional models in knowledge-intensive Natural Language Processing (NLP) tasks. RAG models combine parametric memory from pre-trained seq2seq models with non-parametric memory from a dense vector index of Wikipedia, enabling dynamic knowledge access and integration.
  3. This article explains how to use the Sentence Transformers library to finetune and train embedding models for a variety of applications, such as retrieval augmented generation, semantic search, and semantic textual similarity. It covers the training components, dataset format, loss function, training arguments, evaluators, and trainer.
  4. The paper introduces a technique called LoReFT (Low-rank Linear Subspace ReFT). Similar to LoRA (Low-Rank Adaptation), it uses low-rank approximations to intervene on hidden representations. It shows that linear subspaces contain rich semantics that can be manipulated to steer model behaviors.
  5. This article guides you through the process of building a simple agent in LangChain using Tools and Toolkits. It explains the basics of Agents, their components, and how to build a Mathematics Agent that can perform simple mathematical operations.
    2024-05-26, by klotz
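The article's Mathematics Agent is built with LangChain's Tools abstraction; as a minimal plain-Python sketch of that concept (names here are hypothetical, not the article's code):

```python
# A "tool" is a named function paired with a description the LLM reads
# when deciding which tool to invoke for a given step.
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

TOOLS = {
    "add": (add, "Add two numbers."),
    "multiply": (multiply, "Multiply two numbers."),
}

def run_tool(name: str, *args: float) -> float:
    """Dispatch a tool call the way an agent executor would,
    looking the function up by the name the model emitted."""
    func, _description = TOOLS[name]
    return func(*args)

print(run_tool("add", 2, 3))  # 5
```

In LangChain, the agent loop does the choosing: the model emits a tool name and arguments, the executor runs the function, and the result is fed back into the conversation.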
  6. Quadratic is a modern spreadsheet that combines the familiarity of a spreadsheet with the power of code, allowing you to work with data and code collaboratively in real-time. It supports popular programming languages like Python, SQL, and JavaScript, and offers features such as dynamic charts, APIs, multi-line formulas, and AI integration.
  7. The article discusses the use of large language models (LLMs) as reasoning engines for powering agent workflows, focusing specifically on ReAct agents. It explains how these agents combine reasoning and action capabilities and provides examples of how they function. Challenges faced while implementing such agents are also mentioned, along with ways to overcome them. Additionally, the integration of open-source models within LangChain is highlighted.
  8. This article explores the transformer architecture behind Llama 3, a large language model released by Meta, and discusses how to leverage its power for enterprise and grassroots-level use. It also delves into the technical details of Llama 3 and its prospects for the GenAI ecosystem.
    2024-05-26, by klotz
  9. Anthropic has introduced a new feature in their Console that allows users to generate production-ready prompt templates using AI. This feature employs prompt engineering techniques such as chain-of-thought reasoning, role setting, and clear variable delineation to create effective and precise prompts. It helps both new and experienced prompt engineers save time and often produces better results than hand-written prompts. The generated prompts are also editable for optimal performance.
  10. This paper introduces Cross-Layer Attention (CLA), an extension of Multi-Query Attention (MQA) and Grouped-Query Attention (GQA) for reducing the size of the key-value cache in transformer-based autoregressive large language models (LLMs). The authors demonstrate that CLA can reduce the cache size by another 2x while maintaining nearly the same accuracy as unmodified MQA, enabling inference with longer sequence lengths and larger batch sizes.
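The cache saving CLA targets can be sketched with simple arithmetic (this is an illustration of the bookkeeping, not the paper's code; the dimensions are made up): with a sharing factor of 2, adjacent layer pairs reuse one set of keys and values, so only every other layer writes to the KV cache.

```python
# Hypothetical model dimensions for illustration.
n_layers, seq_len, n_kv_heads, head_dim = 8, 128, 4, 64

# Baseline MQA/GQA: every layer stores its own K and V tensors.
baseline_cache_entries = n_layers * 2 * seq_len * n_kv_heads * head_dim

# CLA with sharing factor 2: layers 1, 3, 5, ... attend to the K/V
# already cached by layers 0, 2, 4, ... instead of storing their own.
sharing_factor = 2
cla_cache_entries = (n_layers // sharing_factor) * 2 * seq_len * n_kv_heads * head_dim

print(baseline_cache_entries // cla_cache_entries)  # 2
```

That factor-of-2 reduction is what frees memory for longer sequences or larger batches at inference time.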


SemanticScuttle - klotz.me: tagged with "llm"
