This paper analyzes the performance of 20 large language models (LLMs) using two inference libraries: vLLM and HuggingFace Pipelines. The study investigates how hyperparameters influence inference performance and reveals that throughput landscapes are irregular, highlighting the importance of hyperparameter optimization.