This article discusses Neural Magic's extensive evaluation of quantized large language models (LLMs), finding that quantized LLMs maintain accuracy competitive with their full-precision counterparts while improving inference efficiency.
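As a rough illustration of this kind of comparison, the minimal sketch below evaluates a full-precision and a quantized checkpoint on a single benchmark with EleutherAI's lm-evaluation-harness Python API; the repository names are placeholders (not taken from the article), and the metric key assumes the harness's v0.4+ result format.

```python
# Minimal sketch: compare a baseline and a quantized checkpoint on one benchmark
# using lm-evaluation-harness (pip install lm-eval). Model repos below are placeholders.
import lm_eval

CHECKPOINTS = {
    "baseline-fp16": "meta-llama/Meta-Llama-3.1-8B-Instruct",   # assumed full-precision baseline
    "quantized-w4a16": "some-org/Llama-3.1-8B-Instruct-W4A16",  # hypothetical quantized repo
}

for label, repo in CHECKPOINTS.items():
    results = lm_eval.simple_evaluate(
        model="hf",                                  # Hugging Face backend
        model_args=f"pretrained={repo},dtype=auto",
        tasks=["arc_challenge"],                     # one accuracy benchmark for brevity
        num_fewshot=0,
        batch_size=8,
    )
    # "acc,none" is the accuracy key in recent harness versions
    acc = results["results"]["arc_challenge"]["acc,none"]
    print(f"{label}: ARC-Challenge accuracy = {acc:.3f}")
```

A fuller study would sweep many tasks and quantization schemes (e.g., INT8, FP8, INT4 weight-only) and also measure serving throughput, but the accuracy-comparison loop looks essentially like this.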
Resource-efficient LLMs and Multimodal Models
A useful survey of resource-efficient LLMs and multimodal foundation models.
Provides a comprehensive analysis of ML efficiency research, with insights spanning model architectures, algorithms, and practical system designs and implementations.