High-performance deployment of the vLLM serving engine, optimized for serving large language models at scale.
A study investigating whether format restrictions like JSON or XML impact the performance of large language models (LLMs) in tasks like reasoning and domain knowledge comprehension.
Explore the capabilities of GPT-4, our latest large language model, offering improved understanding, generation, and problem-solving abilities. Discover its applications and learn how to integrate it into your projects.
OpenAI's official website featuring news, blog posts, and information about their work on artificial intelligence.
This page provides information about LLooM, a tool that uses raw LLM logits to weave threads in a probabilistic way. It includes instructions on how to use LLooM with various environments, such as vLLM, llama.cpp, and OpenAI. The README also explains the parameters and configurations for LLooM.
Mariya Mansurova explores using CrewAI's multi-agent framework to create a solution for writing documentation based on tables and answering related questions.
This article discusses how to overcome limitations of retrieval-augmented generation (RAG) models by creating an AI assistant using advanced SQL vector queries. The author uses tools such as MyScaleDB, OpenAI, LangChain, Hugging Face, and the HackerNews API to develop an application that improves the accuracy and efficiency of the data retrieval process.
Microsoft has deployed GPT-4, a large language model, in an isolated, air-gapped Azure Government Top Secret cloud for use by the Department of Defense. Once accredited, Pentagon officials will be able to use the technology in a secure environment. The tool is expected to help DOD officials deal with vast amounts of data and simplify information sorting. Microsoft is a major investor in OpenAI, the maker of GPT-4 and the popular ChatGPT.
The author tests OpenAI's new GPT-4o model on a standard set of coding tests and finds that it delivers good results, but with one surprising issue.
A tutorial showing you how to bring real-time data to LLMs through function calling, using OpenAI's latest LLM, GPT-4o.