Tags: llm*


  1. In this tutorial, we will build a RAG system with a self-querying retriever in the LangChain framework. This will enable us to filter the retrieved movies using metadata, thus providing more meaningful movie recommendations.
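The tutorial's code uses LangChain's SelfQueryRetriever; as a minimal plain-Python sketch of the underlying idea (metadata filters narrowing candidates before similarity ranking — the data and field names here are made up, not the tutorial's):

```python
# Hypothetical movie records with metadata, as a self-querying
# retriever might filter them before semantic ranking.
movies = [
    {"title": "Movie A", "genre": "sci-fi", "year": 2014},
    {"title": "Movie B", "genre": "comedy", "year": 1999},
    {"title": "Movie C", "genre": "sci-fi", "year": 1982},
]

def retrieve(genre=None, min_year=None):
    """Apply metadata filters the way a structured (self-)query would,
    returning the surviving candidates for downstream ranking."""
    results = movies
    if genre is not None:
        results = [m for m in results if m["genre"] == genre]
    if min_year is not None:
        results = [m for m in results if m["year"] >= min_year]
    return [m["title"] for m in results]

print(retrieve(genre="sci-fi", min_year=2000))  # ['Movie A']
```

In the actual framework, an LLM translates the user's natural-language request into this kind of structured filter automatically.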
  2. This article discusses Retrieval-Augmented Generation (RAG) models, a new approach that addresses the limitations of traditional models in knowledge-intensive Natural Language Processing (NLP) tasks. RAG models combine parametric memory from pre-trained seq2seq models with non-parametric memory from a dense vector index of Wikipedia, enabling dynamic knowledge access and integration.
  3. This article explains how to use the Sentence Transformers library to finetune and train embedding models for a variety of applications, such as retrieval augmented generation, semantic search, and semantic textual similarity. It covers the training components, dataset format, loss function, training arguments, evaluators, and trainer.
  4. The paper introduces a technique called LoReFT (Low-rank Linear Subspace ReFT). Similar to LoRA (Low-Rank Adaptation), it uses low-rank approximations to intervene on hidden representations. It shows that linear subspaces contain rich semantics that can be manipulated to steer model behaviors.
  5. This article guides you through the process of building a simple agent in LangChain using Tools and Toolkits. It explains the basics of Agents, their components, and how to build a Mathematics Agent that can perform simple mathematical operations.
    2024-05-26, by klotz
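The article's Mathematics Agent is built with LangChain's Tools abstraction; as a minimal plain-Python sketch of that concept (names here are hypothetical, not the article's code):

```python
# A "tool" is a named function paired with a description the LLM reads
# when deciding which tool to invoke for a given step.
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b

def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

TOOLS = {
    "add": (add, "Add two numbers."),
    "multiply": (multiply, "Multiply two numbers."),
}

def run_tool(name: str, *args: float) -> float:
    """Dispatch a tool call the way an agent executor would,
    looking the function up by the name the model emitted."""
    func, _description = TOOLS[name]
    return func(*args)

print(run_tool("add", 2, 3))  # 5
```

In LangChain, the agent loop does the choosing: the model emits a tool name and arguments, the executor runs the function, and the result is fed back into the conversation.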
  6. Quadratic is a modern spreadsheet that combines the familiarity of a spreadsheet with the power of code, allowing you to work with data and code collaboratively in real-time. It supports popular programming languages like Python, SQL, and JavaScript, and offers features such as dynamic charts, APIs, multi-line formulas, and AI integration.
  7. The article discusses the use of large language models (LLMs) as reasoning engines for powering agent workflows, focusing specifically on ReAct agents. It explains how these agents combine reasoning and action capabilities and provides examples of how they function. Challenges faced while implementing such agents are also mentioned, along with ways to overcome them. Additionally, the integration of open-source models within LangChain is highlighted.
  8. This article explores the transformer architecture behind Llama 3, a large language model released by Meta, and discusses how to leverage its power for enterprise and grassroots-level use. It also delves into the technical details of Llama 3 and its prospects for the GenAI ecosystem.
    2024-05-26, by klotz
  9. Anthropic has introduced a new feature in their Console that allows users to generate production-ready prompt templates using AI. This feature employs prompt engineering techniques such as chain-of-thought reasoning, role setting, and clear variable delineation to create effective and precise prompts. It helps both new and experienced prompt engineers save time and often produces better results than hand-written prompts. The generated prompts are also editable for optimal performance.
  10. This paper introduces Cross-Layer Attention (CLA), an extension of Multi-Query Attention (MQA) and Grouped-Query Attention (GQA) for reducing the size of the key-value cache in transformer-based autoregressive large language models (LLMs). The authors demonstrate that CLA can reduce the cache size by another 2x while maintaining nearly the same accuracy as unmodified MQA, enabling inference with longer sequence lengths and larger batch sizes.
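The cache saving CLA targets can be sketched with simple arithmetic (this is an illustration of the bookkeeping, not the paper's code; the dimensions are made up): with a sharing factor of 2, adjacent layer pairs reuse one set of keys and values, so only every other layer writes to the KV cache.

```python
# Hypothetical model dimensions for illustration.
n_layers, seq_len, n_kv_heads, head_dim = 8, 128, 4, 64

# Baseline MQA/GQA: every layer stores its own K and V tensors.
baseline_cache_entries = n_layers * 2 * seq_len * n_kv_heads * head_dim

# CLA with sharing factor 2: layers 1, 3, 5, ... attend to the K/V
# already cached by layers 0, 2, 4, ... instead of storing their own.
sharing_factor = 2
cla_cache_entries = (n_layers // sharing_factor) * 2 * seq_len * n_kv_heads * head_dim

print(baseline_cache_entries // cla_cache_entries)  # 2
```

That factor-of-2 reduction is what frees memory for longer sequences or larger batches at inference time.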


SemanticScuttle - klotz.me: tagged with "llm"
