An evaluation of Google's new multi-modal Gemma 4 model family, testing its performance across various sizes ranging from compact E2B versions to larger mixture-of-experts (MoE) models. The article explores how these models handle vision, audio, reasoning, and code generation tasks on consumer-grade hardware using tools such as LM Studio.
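For context on the LM Studio workflow the article uses: once a model is loaded, LM Studio serves it through an OpenAI-compatible endpoint (default `http://localhost:1234/v1`), so any OpenAI client can query it. A minimal sketch; the model identifier below is a placeholder, not a confirmed name:

```python
# Minimal sketch: querying a locally loaded model through LM Studio's
# OpenAI-compatible server (default http://localhost:1234/v1).
# "gemma-4-e2b" is a hypothetical identifier; use whatever name
# LM Studio reports for the model you actually have loaded.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="gemma-4-e2b",  # hypothetical identifier
    messages=[{"role": "user", "content": "Summarize the attention mechanism in two sentences."}],
    temperature=0.2,
)
print(response.choices[0].message.content)
```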
A comprehensive curated collection of Large Language Model (LLM) architecture figures and technical fact sheets. This gallery provides a visual and data-driven overview of modern model designs, ranging from classic dense architectures like GPT-2 to advanced sparse Mixture-of-Experts (MoE) systems and hybrid attention models. Users can explore detailed specifications including parameter scales, context windows, attention mechanisms, and intelligence indices for various prominent models.
Key features include:
* Detailed architecture fact sheets for a wide array of models such as Llama, DeepSeek, Qwen, Gemma, and Mistral.
* An architecture diff tool to compare two model designs side by side (a minimal sketch of the idea follows this list).
* Comparative analysis across dense, MoE, MLA, and hybrid decoder families.
* Links to original source articles and technical reports for deeper research.
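To illustrate the kind of comparison the diff tool performs, here is a minimal sketch that diffs two fact sheets represented as dictionaries. The field names and values are illustrative, not taken from the gallery's actual data:

```python
# Minimal sketch of an architecture "diff": compare two fact sheets
# field by field, flagging mismatches. Fields and values are illustrative.
SPECS = {
    "gpt2": {"params": "1.5B", "attention": "dense MHA", "context": 1024, "moe": False},
    "deepseek-v3": {"params": "671B", "attention": "MLA", "context": 128000, "moe": True},
}

def diff(a: str, b: str) -> None:
    """Print side-by-side differences between two model fact sheets."""
    left, right = SPECS[a], SPECS[b]
    for field in sorted(set(left) | set(right)):
        lv, rv = left.get(field, "-"), right.get(field, "-")
        marker = "  " if lv == rv else "* "   # "*" marks a difference
        print(f"{marker}{field:<10} {a}: {lv!s:<12} {b}: {rv!s}")

diff("gpt2", "deepseek-v3")
```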
Timer-S1 is a scalable Mixture-of-Experts time series model with 8.3B parameters that uses serial scaling and novel TimeMoE blocks to improve long-term forecasting accuracy.
We introduce Timer-S1, a strong Mixture-of-Experts (MoE) time series foundation model with 8.3B total parameters, 0.75B activated parameters per token, and a context length of 11.5K. To overcome the scalability bottleneck in existing pre-trained time series foundation models, we perform Serial Scaling in three dimensions: model architecture, dataset, and training pipeline. Timer-S1 integrates sparse TimeMoE blocks and generic TimeSTP blocks for Serial-Token Prediction (STP), a training objective that adheres to the serial nature of forecasting. The proposed paradigm introduces serial computations that improve long-term predictions while avoiding the costly rolling-style inference and pronounced error accumulation of standard next-token prediction. In pursuit of a high-quality and unbiased training dataset, we curate TimeBench, a corpus of one trillion time points, and apply meticulous data augmentation to mitigate predictive bias. We further pioneer a post-training stage, including continued pre-training and long-context extension, to enhance short-term and long-context performance. Evaluated on the large-scale GIFT-Eval leaderboard, Timer-S1 achieves state-of-the-art forecasting performance, attaining the best MASE and CRPS scores among pre-trained models. Timer-S1 will be released to facilitate further research.
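The contrast between rolling next-token inference and direct multi-horizon prediction can be sketched in a few lines. This is a generic illustration of the error-accumulation argument, not Timer-S1's actual STP implementation; `model_step` is a stand-in forecaster:

```python
import numpy as np

# Generic illustration of why rolling-style next-token forecasting
# accumulates error, versus emitting the whole horizon in one pass.
# `model_step` is a placeholder forecaster, NOT Timer-S1's architecture.

def model_step(context: np.ndarray) -> float:
    """One-step-ahead prediction (placeholder: mean of last 4 points)."""
    return float(context[-4:].mean())

def rolling_forecast(context: np.ndarray, horizon: int) -> np.ndarray:
    """Next-token style: each prediction is fed back in, so errors compound."""
    ctx = list(context)
    out = []
    for _ in range(horizon):
        y = model_step(np.asarray(ctx))
        out.append(y)
        ctx.append(y)  # predicted (possibly wrong) value re-enters the context
    return np.asarray(out)

def direct_forecast(context: np.ndarray, horizon: int) -> np.ndarray:
    """Multi-step style: the full horizon is produced from real context only."""
    return np.full(horizon, context[-4:].mean())

series = np.sin(np.linspace(0, 8 * np.pi, 128))
print(rolling_forecast(series[:96], 32)[:5])
print(direct_forecast(series[:96], 32)[:5])
```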
Sarvam AI is releasing Sarvam 30B and Sarvam 105B as open-source models, trained from scratch on large-scale, high-quality datasets. These models demonstrate strong reasoning, programming, and agentic capabilities, with optimizations for efficient deployment across various hardware. Sarvam 30B powers Samvaad, while Sarvam 105B powers Indus. The release covers the model architecture, training process, benchmark results, and inference optimizations; the models are available on AI Kosh and Hugging Face, with demonstrated performance in real-world applications such as webpage generation, JEE problem solving, and conversational agents.
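If the weights follow the standard Hugging Face layout, loading them would look like the usual `transformers` pattern. The repository id below is a guess, not taken from the release; check the actual model page:

```python
# Minimal sketch of loading the released weights with Hugging Face
# transformers, assuming a standard causal-LM layout.
# "sarvamai/sarvam-30b" is a hypothetical repo id; use the real one.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sarvamai/sarvam-30b"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Solve: if 2x + 3 = 11, what is x?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```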
By mid-2025 China had become a global leader in open-source large language models (LLMs). According to Chinese state media, by July 2025 China accounted for 1,509 of the world's ~3,755 publicly released LLMs, far more than any other country. This explosion reflects heavy state and industry investment in domestic AI, open licensing (often Apache- or MIT-style), and a strategic pivot by Chinese tech giants and startups toward publicly shared models. The result is a "revival" of open-source AI, with dozens of Chinese LLMs now available for download or use via Hugging Face, GitHub, or cloud APIs. These range from general-purpose foundation models with tens of billions of parameters to specialized chatbots and domain experts, many built on Mixture-of-Experts (MoE) architectures.
An in-depth look at the architecture of OpenAI's GPT-OSS models, detailing tokenization, embeddings, transformer blocks, Mixture of Experts, attention mechanisms (GQA and RoPE), and quantization techniques.
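As a refresher on grouped-query attention (GQA), one of the mechanisms the article covers: several query heads share each key/value head, shrinking the KV cache. A minimal PyTorch sketch; the dimensions are illustrative, not GPT-OSS's actual configuration:

```python
import torch
import torch.nn.functional as F

# Minimal grouped-query attention (GQA) sketch: n_q query heads share
# n_kv key/value heads by repeating the KV heads across groups.
# Dimensions are illustrative, not GPT-OSS's actual sizes.
B, T, n_q, n_kv, d = 1, 8, 8, 2, 16   # batch, seq len, q heads, kv heads, head dim

q = torch.randn(B, n_q, T, d)
k = torch.randn(B, n_kv, T, d)
v = torch.randn(B, n_kv, T, d)

# Each KV head serves n_q // n_kv query heads.
k = k.repeat_interleave(n_q // n_kv, dim=1)   # (B, n_q, T, d)
v = v.repeat_interleave(n_q // n_kv, dim=1)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 8, 8, 16])
```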
A 120 billion parameter OpenAI model can now run on consumer hardware thanks to its Mixture of Experts (MoE) design, which activates only a fraction of the parameters per token; this cuts per-token compute and memory traffic enough that most of the model can be processed from CPU RAM, with key parts offloaded to a modest GPU.
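One common way to get that CPU/GPU split is partial layer offload. A minimal sketch with `llama-cpp-python`; the GGUF filename and layer count are placeholders to tune for your machine:

```python
# Minimal sketch: keep most weights in CPU RAM and offload only a few
# layers to a modest GPU via llama-cpp-python. The model path and
# n_gpu_layers value are placeholders; tune them to your hardware.
from llama_cpp import Llama

llm = Llama(
    model_path="gpt-oss-120b-Q4.gguf",  # placeholder GGUF file
    n_gpu_layers=12,   # offload only as many layers as fit in VRAM
    n_ctx=4096,
)
result = llm("Explain mixture-of-experts in one sentence.", max_tokens=64)
print(result["choices"][0]["text"])
```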
This article details 7 lessons the author learned while self-hosting Large Language Models (LLMs), covering memory bandwidth, quantization, electricity costs, hardware options beyond Nvidia, prompt engineering, Mixture of Experts models, and the value of starting with simpler tools like LM Studio.
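The memory-bandwidth lesson reduces to simple arithmetic: token generation is bandwidth-bound, so throughput is roughly bandwidth divided by the bytes read per token (the active weights). A back-of-envelope sketch with illustrative round numbers:

```python
# Back-of-envelope: decode speed ~ memory bandwidth / bytes read per
# generated token (the active weights). Numbers are illustrative.

def rough_tokens_per_sec(bandwidth_gb_s: float, active_params_b: float,
                         bytes_per_param: float) -> float:
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Dense 70B model at 4-bit (~0.5 bytes/param) on dual-channel DDR5 (~90 GB/s):
print(f"{rough_tokens_per_sec(90, 70, 0.5):.1f} tok/s")  # ~2.6
# MoE with only ~5B active params per token on the same machine:
print(f"{rough_tokens_per_sec(90, 5, 0.5):.1f} tok/s")   # ~36
```

This is also why MoE models punch above their weight on CPUs: only the active parameters per token have to cross the memory bus.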