This section details how to load and use multiple models with the llama.cpp server. It covers configuring the server to handle multiple models, the model path format, and considerations for memory usage.
"This is one of the best 13B models I've tested. (for programming, math, logic, etc) speechless-llama2-hermes-orca-platypus-wizardlm-13b"
Models referred to as "GPT 3.5"
The GPT-3.5 series is a family of models trained on a blend of text and code from before Q4 2021. The following models are in the GPT-3.5 series:
code-davinci-002 is a base model, which makes it well suited to pure code-completion tasks.
text-davinci-002 is an InstructGPT model based on code-davinci-002.
text-davinci-003 is an improvement on text-davinci-002.
gpt-3.5-turbo-0301 is an improvement on text-davinci-003, optimized for chat.
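
The relationships in the list above can be written down as a small lookup table. This is only an illustrative sketch: the `GPT35_SERIES` dictionary and the `lineage` helper are hypothetical names, not part of any official API, and "predecessor" here simply means the model each entry is described as being based on or improving on.

```python
# Illustrative only: encodes the GPT-3.5 series list as
# model name -> (predecessor, note). Names are hypothetical.
GPT35_SERIES = {
    "code-davinci-002": (None, "base model, suited to pure code completion"),
    "text-davinci-002": ("code-davinci-002", "InstructGPT model"),
    "text-davinci-003": ("text-davinci-002", "improvement on text-davinci-002"),
    "gpt-3.5-turbo-0301": ("text-davinci-003", "optimized for chat"),
}


def lineage(model):
    """Return the chain of models from the base model down to `model`."""
    chain = []
    while model is not None:
        chain.append(model)
        model = GPT35_SERIES[model][0]  # step to the predecessor
    return list(reversed(chain))


print(" -> ".join(lineage("gpt-3.5-turbo-0301")))
# prints: code-davinci-002 -> text-davinci-002 -> text-davinci-003 -> gpt-3.5-turbo-0301
```

Walking the table from any model back to its root makes it clear that every model in the series traces back to code-davinci-002.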