This pull request adds StreamingLLM support for llamacpp and llamacpp_HF models. The change lets a chat with the model continue indefinitely without re-evaluating the prompt, which is intended to improve performance and reliability.
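As a rough illustration of the idea behind StreamingLLM-style caching, the sketch below keeps a few initial tokens plus a window of the most recent ones and drops everything in between. It is not the code from the pull request: `trim_cache`, `n_sink`, and `max_len` are hypothetical names, and a plain Python list of token ids stands in for the model's real key/value cache.

```python
def trim_cache(tokens, n_sink, max_len):
    """Keep the first n_sink tokens plus the most recent ones, up to max_len.

    Hypothetical helper: a list of token ids stands in for the KV cache.
    """
    if n_sink >= max_len:
        raise ValueError("n_sink must be smaller than max_len")
    if len(tokens) <= max_len:
        # Cache still fits, nothing to evict.
        return list(tokens)
    n_recent = max_len - n_sink
    # Evict the middle: keep the initial tokens and the newest n_recent tokens.
    return list(tokens[:n_sink]) + list(tokens[-n_recent:])


# Simulate a chat that keeps producing tokens past the cache limit.
cache = []
for token_id in range(12):
    cache.append(token_id)
    cache = trim_cache(cache, n_sink=2, max_len=6)

print(cache)  # [0, 1, 8, 9, 10, 11]
```

Because only the middle of the cache is discarded, the tokens that remain do not have to be evaluated again, which is what allows the conversation to keep going past the context limit in this simplified picture.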
This GitHub repository hosts discord-llm-chatbot, a Discord bot that lets you chat with Large Language Models (LLMs) directly in your Discord server. It supports hosted LLMs through the OpenAI, Mistral, and Anthropic APIs, as well as local model backends such as ollama, oobabooga, Jan, and LM Studio. The bot offers a reply-based chat system, a customizable system prompt, and seamless threading of conversations. It also supports image and text file attachments and streamed responses.
This article by Dmitrii Eliuseev discusses how to test small language models, using the 3.8B Phi-3 and 8B Llama-3 models, on a PC and a Raspberry Pi with LlamaCpp and ONNX.