This article explains how to run inference on a YOLOv8 object detection model using Docker and create a REST API to orchestrate the process. It includes code implementation and a detailed README in the author's GitHub repository for running the API via REST with Docker.
Learn how to build an open LLM app using Hermes 2 Pro, a powerful LLM based on Meta's Llama 3 architecture. This tutorial explains how to deploy Hermes 2 Pro locally, create a function to track flight status using FlightAware API, and integrate it with the LLM.
A tutorial showing you how how to bring real-time data to LLMs through function calling, using OpenAI's latest LLM GTP-4o.
The api_base key can be used to point the OpenAI client library at a different API endpoint.
llama-cpp-python offers a web server which aims to act as a drop-in replacement for the OpenAI API. This allows you to use llama.cpp compatible models with any OpenAI compatible client (language libraries, services, etc).