This PR implements the StreamingLLM technique for model loaders, focusing on handling context length and optimizing chat generation speed.
Retrochat is a chat application that supports Llama.cpp, Kobold.cpp, and Ollama. It highlights new features, commands for configuration, chat management, and models, and provides a download link for the release.