A comprehensive guide covering the most critical machine learning equations, including probability, linear algebra, optimization, and advanced concepts, with Python implementations.
An Apple study shows that large language models (LLMs) can improve performance through a checklist-based reinforcement learning scheme, akin to the simple productivity trick of checking one's own work.
This article provides a gentle introduction to Q-learning, covering its principles and the basic characteristics of the algorithm in a clear, illustrative style.
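The core of Q-learning is the tabular update rule Q(s,a) ← Q(s,a) + α·(r + γ·max Q(s',·) − Q(s,a)). As a minimal sketch of that idea (the corridor environment, state count, and hyperparameters here are illustrative assumptions, not taken from the article):

```python
import random

def step(state, action, n_states=5):
    """Deterministic corridor: reward 1.0 only on reaching the rightmost (goal) state."""
    nxt = min(max(state + action, 0), n_states - 1)
    reward = 1.0 if nxt == n_states - 1 else 0.0
    return nxt, reward, nxt == n_states - 1

def q_learn(episodes=500, n_states=5, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # Q[state][action]; actions: 0=left, 1=right
    for _ in range(episodes):
        s = rng.randrange(n_states - 1)        # random non-goal start state
        done = False
        while not done:
            a = rng.randrange(2)               # random behavior policy (Q-learning is off-policy)
            s2, r, done = step(s, -1 if a == 0 else +1, n_states)
            # Core update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learn()
# Extract the greedy policy for the non-terminal states; after training it moves
# right everywhere, which is optimal for this corridor.
greedy = [max(range(2), key=lambda a: Q[s][a]) for s in range(4)]
print(greedy)
```

Because Q-learning is off-policy, even a purely random behavior policy suffices here; the learned greedy policy is what gets evaluated.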
The article traces the evolution of model inference techniques from 2017 through 2025, highlighting the progression from simple serving frameworks like Flask and FastAPI to more advanced solutions like Triton Inference Server and vLLM. It details the increasing demands placed on inference infrastructure by larger and more complex models, and the resulting need for optimization in throughput, latency, and cost.
DeepMind introduces Ithaca, a deep neural network that can restore damaged ancient Greek inscriptions, identify their original location, and help establish their creation date, collaborating with historians to advance understanding of ancient history.
This blog post details the training of 'Chess Llama', a small Llama model designed to play chess. It covers the inspiration behind the project (Chess GPT), the dataset used (the Lichess Elite database), the training process using Hugging Face Transformers, and the model's performance (an Elo rating of 1350-1400). It also includes links to try the model and view the source code.
A detailed comparison of the architectures of recent large language models (LLMs), including DeepSeek-V3, OLMo 2, Gemma 3, Mistral Small 3.1, Llama 4, Qwen3, SmolLM3, and Kimi K2, focusing on key design choices and their impact on performance and efficiency.
This article discusses the history of AI, the split between neural networks and symbolic AI, and the recent vindication of neurosymbolic AI through the advancements of models like o3 and Grok 4. It argues that combining the strengths of both approaches is crucial for achieving true AI and highlights the resistance to neurosymbolic AI from some leaders in the deep learning field.
This tutorial introduces the essential topics of the PyTorch deep learning library in about one hour. It covers tensors, training neural networks, and training models on multiple GPUs.
This book covers foundational topics in computer vision from an image processing and machine learning perspective. It aims to build the reader's intuition through visualizations and is intended for undergraduate and graduate students as well as experienced practitioners.