A detailed comparison of the architectures of recent large language models (LLMs), including DeepSeek-V3, OLMo 2, Gemma 3, Mistral Small 3.1, Llama 4, Qwen3, SmolLM3, and Kimi K2, focusing on key design choices and their impact on performance and efficiency.
Details the development and release of DeepCoder-14B-Preview, a 14B-parameter code reasoning model that reaches performance comparable to o3-mini through reinforcement learning, along with the dataset, code, and system optimizations used to build it.
The article explores the architectural changes that let DeepSeek's models perform well with fewer resources, focusing on Multi-Head Latent Attention (MLA). It traces the evolution of attention mechanisms from Bahdanau-style attention to the Transformer's Multi-Head Attention (MHA), and introduces Grouped-Query Attention (GQA) as a way to cut MHA's key/value-cache memory cost. The article also highlights DeepSeek's competitive performance despite its lower reported training costs.
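To make the MHA-versus-GQA distinction concrete, here is a minimal sketch of grouped-query attention in PyTorch. The function name, dimensions, and weight shapes are illustrative assumptions for this sketch, not code from DeepSeek or the article being summarized.

```python
# Minimal sketch of Grouped-Query Attention (GQA): several query heads share
# one key/value head, shrinking the KV cache relative to standard MHA.
# All names and sizes here are illustrative assumptions.

import torch
import torch.nn.functional as F


def grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads):
    """Single GQA layer: n_q_heads query heads share n_kv_heads K/V heads.

    n_kv_heads == n_q_heads reduces to ordinary MHA;
    n_kv_heads == 1 reduces to Multi-Query Attention (MQA).
    """
    batch, seq, d_model = x.shape
    head_dim = d_model // n_q_heads

    # Project inputs and split into heads: (batch, heads, seq, head_dim).
    q = (x @ wq).view(batch, seq, n_q_heads, head_dim).transpose(1, 2)
    k = (x @ wk).view(batch, seq, n_kv_heads, head_dim).transpose(1, 2)
    v = (x @ wv).view(batch, seq, n_kv_heads, head_dim).transpose(1, 2)

    # Each group of query heads reuses the same K/V head: repeat K/V along
    # the head dimension so the shapes match for attention.
    group_size = n_q_heads // n_kv_heads
    k = k.repeat_interleave(group_size, dim=1)
    v = v.repeat_interleave(group_size, dim=1)

    scores = (q @ k.transpose(-2, -1)) / head_dim**0.5
    out = F.softmax(scores, dim=-1) @ v
    return out.transpose(1, 2).reshape(batch, seq, d_model)


if __name__ == "__main__":
    torch.manual_seed(0)
    d_model, n_q_heads, n_kv_heads = 64, 8, 2   # 4 query heads per KV head
    head_dim = d_model // n_q_heads
    x = torch.randn(1, 16, d_model)
    wq = torch.randn(d_model, d_model)
    wk = torch.randn(d_model, n_kv_heads * head_dim)  # smaller K projection
    wv = torch.randn(d_model, n_kv_heads * head_dim)  # smaller V projection
    y = grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads)
    print(y.shape)  # torch.Size([1, 16, 64])
    # KV cache per token: 2 * n_kv_heads * head_dim floats instead of
    # 2 * n_q_heads * head_dim, a 4x reduction in this example.
```

In this sketch the KV cache shrinks in proportion to n_q_heads / n_kv_heads, which is the memory saving GQA trades against some modeling capacity; MLA takes a different route, compressing keys and values into a shared latent vector rather than sharing heads.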
DeepSeek-R1 is a groundbreaking AI model that uses reinforcement learning to teach large language models to reason, outperforming models like OpenAI's o1 at a fraction of the computational cost.