The article introduces the LLMOps Database, a curated collection of over 300 real-world Generative AI implementations, focusing on practical challenges and solutions in deploying large language models in production environments. It highlights the importance of sharing technical insights and best practices to bridge the gap between theoretical discussions and practical implementation.
This article discusses the benefits of a disaggregated observability (o11y) stack for modern distributed architectures, addressing the inflexibility, high cost, and lack of autonomy of traditional monolithic solutions. It walks through the key layers of a disaggregated stack (agents, collection, storage, and visualization) and suggests systems like Apache Pinot and Grafana.
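The four layers above can be pictured as a simple pipeline. The toy sketch below mirrors that layering in plain Python; every class and function name here is hypothetical, standing in for real components such as OpenTelemetry agents, Apache Pinot storage, and Grafana dashboards:

```python
# Illustrative sketch only: a toy pipeline mirroring the four layers of a
# disaggregated o11y stack (agents -> collection -> storage -> visualization).
# All names are hypothetical stand-ins, not a real stack's API.
import statistics
import time
from collections import defaultdict

class Agent:
    """Layer 1: emits raw metric samples from a host/service."""
    def __init__(self, service: str):
        self.service = service

    def emit(self, name: str, value: float) -> dict:
        return {"service": self.service, "metric": name,
                "value": value, "ts": time.time()}

class Store:
    """Layer 3: append-only store, queryable by metric name."""
    def __init__(self):
        self.rows = defaultdict(list)

    def write(self, sample: dict) -> None:
        self.rows[sample["metric"]].append(sample["value"])

    def query(self, metric: str) -> list:
        return self.rows[metric]

class Collector:
    """Layer 2: receives samples from agents, forwards to storage."""
    def __init__(self, store: Store):
        self.store = store

    def ingest(self, sample: dict) -> None:
        self.store.write(sample)

def render_dashboard(store: Store, metric: str) -> str:
    """Layer 4: a stand-in for a visualization layer like Grafana."""
    values = store.query(metric)
    return f"{metric}: n={len(values)} p50={statistics.median(values):.1f}"

store = Store()
collector = Collector(store)
agent = Agent("checkout")
for latency in (120.0, 95.0, 240.0):
    collector.ingest(agent.emit("latency_ms", latency))
print(render_dashboard(store, "latency_ms"))
```

Because each layer only talks to the next through a narrow interface, any one of them can be swapped independently, which is the core argument for disaggregation.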
A list of 13 open-source tools for building and managing production-ready AI applications. They cover various aspects of AI development, including LLM tool integration, vector databases, RAG pipelines, model training and deployment, LLM routing, data pipelines, AI agent monitoring, LLM observability, and AI app development.
1. Composio - Seamless integration of tools with LLMs.
2. Weaviate - AI-native vector database for AI apps.
3. Haystack - Framework for building efficient RAG pipelines.
4. LitGPT - Pretrain, fine-tune, and deploy models at scale.
5. DSPy - Framework for programming LLMs.
6. Portkey's Gateway - Reliably route to 200+ LLMs with one API.
7. Airbyte - Reliable and extensible open-source data pipeline.
8. AgentOps - Agent observability and monitoring.
9. ArizeAI's Phoenix - LLM observability and evaluation.
10. vLLM - Easy, fast, and cheap LLM serving for everyone.
11. Vercel AI SDK - Easily build AI-powered products.
12. LangGraph - Build language agents as graphs.
13. Taipy - Build AI apps in Python.
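Several of the tools above (Haystack, Weaviate, vLLM) revolve around the retrieve-then-generate (RAG) pattern. A minimal, dependency-free sketch of that pattern follows, with a toy bag-of-words retriever standing in for a real vector database and a placeholder for the LLM call; all names here are illustrative, not any specific framework's API:

```python
# Minimal RAG sketch using only the standard library. The bag-of-words
# cosine retriever stands in for a real vector database (e.g. Weaviate),
# and generate() is a placeholder for an actual LLM call.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list, k: int = 1) -> list:
    """Rank documents by similarity to the query, return the top-k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def generate(query: str, context: list) -> str:
    """Placeholder for the LLM call: just assembles the prompt."""
    return f"Answer '{query}' using: {' | '.join(context)}"

docs = [
    "vLLM serves large language models efficiently",
    "Weaviate is a vector database for AI apps",
    "Taipy builds AI apps in Python",
]
context = retrieve("which tool is a vector database", docs)
print(generate("which tool is a vector database", context))
```

A production pipeline replaces each piece with a listed tool: dense embeddings and ANN search instead of word counts, and a served model (e.g. via vLLM) instead of the string-assembling placeholder.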
This article showcases 15 real-world examples of companies using Large Language Models (LLMs) across industries, including Netflix, Picnic, Uber, GitLab, LinkedIn, Swiggy, Careem, Slack, Foodpanda, Etsy, Discord, Pinterest, and Expedia.
A research team introduces Super Tiny Language Models (STLMs) to address the resource-intensive nature of large language models, providing high performance with significantly reduced parameter counts.
Best Practices for Running Containers and Kubernetes in Production