SemanticScuttle - klotz.me » klotz: deep learning+llm+transformers

Understanding Attention in LLMs

The attention mechanism in Large Language Models (LLMs) helps derive the meaning of a word from its context. This involves encoding words as multi-dimensional vectors, calculating query and key vectors, and using attention weights to adjust the embedding based on contextual relevance.
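
As a rough illustration of the steps this summary names, here is a minimal sketch in plain Python. It is not taken from the linked article. The toy embeddings, the identity projection matrices, and the function names (`softmax`, `matvec`, `attend`) are all hypothetical. Two simplifying assumptions are made: scores are scaled by the square root of the key dimension, as in standard scaled dot-product attention, and the embeddings themselves stand in for value vectors.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def attend(embeddings, w_query, w_key):
    """Return (weights, adjusted) for every position in the sequence.

    Assumption: the embeddings double as value vectors, so each adjusted
    embedding is a weighted average of the original embeddings.
    """
    queries = [matvec(w_query, e) for e in embeddings]
    keys = [matvec(w_key, e) for e in embeddings]
    scale = math.sqrt(len(keys[0]))  # assumed scaled dot-product form
    dim = len(embeddings[0])
    all_weights, adjusted = [], []
    for q in queries:
        # Score this query against every key, then normalize.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        all_weights.append(weights)
        adjusted.append([
            sum(w * e[d] for w, e in zip(weights, embeddings))
            for d in range(dim)
        ])
    return all_weights, adjusted

# Toy 2-dimensional embeddings for three made-up tokens.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical stand-in for learned projections
weights, adjusted = attend(embeddings, identity, identity)
```

Each row of `weights` says how much one token draws on every token in the sequence, and the matching row of `adjusted` is that token's embedding after being pulled toward the tokens it attends to most. In a trained model the projection matrices are learned, and a separate value projection is normally applied as well.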

2025-03-07 Tags: attention, llm, machine-learning, neural networks, nlp, transformers by klotz
Deep Dive into Self-Attention by Hand✍︎

Explore the intricacies of the attention mechanism that powers transformers.

2025-02-04 Tags: transformers, self-attention, neural networks, llm, machine learning by klotz
New Trends in LLM Architecture

Discusses trends in Large Language Model (LLM) architecture, including scaling to more GPUs, more weights, and more tokens; energy-efficient implementations; the role of LLM routers; and the need for better evaluation metrics, faster fine-tuning, and self-tuning.

2024-06-01 Tags: llm, machine learning, deep learning, transformers, self-tuning, evaluation by klotz
Mastering LLM Techniques: Training

Delving into transformer networks

2023-11-18 Tags: nvidia, llm, training, transformers, deep learning by klotz
