A user shares the settings they found best for running the gpt-oss-120b model on a system with dual RTX 3090 GPUs and 128GB of RAM, balancing performance against output quality.
A 120-billion-parameter OpenAI model can now run on consumer hardware thanks to its Mixture of Experts (MoE) architecture: only a small subset of parameters is active for each token, so the bulky expert weights can stay in system RAM and be computed on the CPU while the always-active layers are offloaded to modest GPUs, keeping VRAM requirements within reach.
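To make the split concrete, here is a minimal sketch using llama-cpp-python, which is an assumption on my part; the original settings may target a different runtime, and the model path, layer count, and thread count below are illustrative placeholders rather than the poster's values. The idea is the one described above: keep as many of the always-active layers as fit in the two 3090s' VRAM, and leave the rest (including the bulk of the expert weights) in system RAM for the CPU to compute.

```python
# Minimal sketch, assuming a GGUF build of gpt-oss-120b at a hypothetical local path.
# Coarse CPU/GPU split: layers not counted by n_gpu_layers stay in system RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="models/gpt-oss-120b.gguf",  # hypothetical filename
    n_gpu_layers=20,          # layers resident on the GPUs; raise until VRAM is nearly full
    tensor_split=[0.5, 0.5],  # spread the GPU-resident layers evenly across both 3090s
    n_ctx=8192,               # context length; larger contexts cost more memory
    n_threads=16,             # CPU threads that run the layers left in RAM
)

out = llm("Summarize mixture-of-experts in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

Note that this uses a per-layer split for simplicity; setups like the one described typically use llama.cpp's finer-grained tensor placement so that only the MoE expert tensors are kept on the CPU, which is what lets a 120B model fit alongside two 24GB GPUs.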