A user demonstrates how to run a 120B model efficiently on hardware with only 8 GB of VRAM by offloading the MoE expert layers to the CPU and keeping only the attention layers on the GPU, achieving high performance despite the small VRAM footprint.
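As a rough illustration of that setup, the sketch below launches a llama.cpp server with the expert tensors pinned to the CPU. It assumes a recent llama.cpp build that supports the --override-tensor (-ot) option; the GGUF filename, context size, port, and tensor-name pattern are illustrative assumptions, not details taken from the post.

```python
# Sketch: start a llama.cpp server that keeps attention layers on the GPU
# while forcing the MoE expert FFN weights to stay in system RAM.
# Flag names assume a recent llama.cpp build; model path and pattern are hypothetical.
import subprocess

cmd = [
    "llama-server",
    "-m", "gpt-oss-120b.gguf",                    # hypothetical GGUF filename
    "--n-gpu-layers", "99",                       # request all layers on the GPU first...
    "--override-tensor", r"\.ffn_.*_exps\.=CPU",  # ...then push expert FFN tensors back to CPU RAM
    "--ctx-size", "8192",
    "--port", "8080",
]
subprocess.run(cmd, check=True)
```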
A 120 billion parameter OpenAI model can now run on consumer hardware thanks to its Mixture of Experts (MoE) architecture: only a small fraction of the expert parameters is active for each token, so the bulky expert weights can sit in system RAM and be computed on the CPU while a modest GPU handles the attention layers.
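A back-of-envelope estimate shows why this split fits in 8 GB of VRAM. All figures below (roughly 117B total parameters, a few billion non-expert parameters, ~4-bit quantized weights, a modest KV cache) are assumptions for illustration, not numbers from the article.

```python
# Rough VRAM estimate when only non-expert weights and the KV cache live on the GPU.
GiB = 1024 ** 3

total_params      = 117e9       # assumed: whole MoE model, mostly expert FFN weights
non_expert_params = 4e9         # assumed: attention, embeddings, norms, router
bytes_per_weight  = 0.5         # assumed: ~4-bit quantization
kv_cache_bytes    = 1.5 * GiB   # assumed: KV cache for a modest context window

expert_bytes = (total_params - non_expert_params) * bytes_per_weight
gpu_bytes    = non_expert_params * bytes_per_weight + kv_cache_bytes

print(f"expert weights kept in system RAM: {expert_bytes / GiB:5.1f} GiB")
print(f"weights + KV cache kept in VRAM:   {gpu_bytes / GiB:5.1f} GiB")
# With numbers in this ballpark, the GPU share stays well under 8 GiB,
# which is why only the expert layers need to spill over to the CPU.
```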