This pull request adds StreamingLLM support for the llamacpp and llamacpp_HF loaders. With it, chatting can continue indefinitely without re-evaluating the prompt: once the context window fills, old tokens are evicted from the KV cache instead of the whole truncated prompt being processed again, which is where the performance gain comes from.
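For intuition, here is a minimal Python sketch of the StreamingLLM idea (keep a few initial "attention sink" tokens plus a rolling window of recent tokens, evict everything in between). This is an illustration of the technique, not the code in this pull request; the function and parameter names are hypothetical.

```python
# Minimal sketch of StreamingLLM-style cache eviction (hypothetical names,
# not this PR's implementation): keep the first `n_sink` attention-sink
# tokens plus the most recent `n_recent` tokens, and drop the middle.


def evict_middle(cache: list, n_sink: int, n_recent: int) -> list:
    """Return a trimmed cache: sink tokens + the most recent tokens.

    `cache` is a list of per-token KV entries in generation order.
    """
    if len(cache) <= n_sink + n_recent:
        return cache  # still fits; nothing to evict
    return cache[:n_sink] + cache[-n_recent:]


if __name__ == "__main__":
    # Simulate a growing context: token ids stand in for KV entries.
    cache = list(range(16))
    cache = evict_middle(cache, n_sink=4, n_recent=8)
    # Sinks 0-3 are kept, tokens 4-7 are evicted, 8-15 remain.
    print(cache)  # [0, 1, 2, 3, 8, 9, 10, 11, 12, 13, 14, 15]
```

Keeping the first few tokens matters because attention tends to concentrate on them; a plain sliding window that evicts them degrades generation quality, which is the observation behind StreamingLLM.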