A deep dive into LLM inference, covering tokenization, the transformer architecture, KV caching, and optimization techniques for efficient text generation.
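As a quick illustration of the KV-caching idea the article covers: during autoregressive decoding, each step projects only the newest token and appends its key/value rows to a cache, so keys and values for the prefix are never recomputed. Below is a minimal single-head sketch in NumPy; all names and shapes are illustrative, not the article's code.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class KVCache:
    """Grows K and V by one row per decode step, so past tokens
    are never re-projected (illustrative, single attention head)."""
    def __init__(self, d_model):
        self.K = np.empty((0, d_model))
        self.V = np.empty((0, d_model))

    def append(self, k, v):
        self.K = np.vstack([self.K, k])
        self.V = np.vstack([self.V, v])

def decode_step(x, Wq, Wk, Wv, cache):
    # Project only the newest token; the cache covers the prefix.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    cache.append(k, v)
    scores = q @ cache.K.T / np.sqrt(x.shape[-1])
    return softmax(scores) @ cache.V  # attention output for this step

rng = np.random.default_rng(0)
d = 16
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
cache = KVCache(d)
for _ in range(5):                  # five decode steps
    x = rng.normal(size=(1, d))     # embedding of the new token
    out = decode_step(x, Wq, Wk, Wv, cache)
print(cache.K.shape)  # (5, 16): one cached key per generated token
```

The cache trades memory for compute: attention cost per step grows linearly with sequence length instead of re-running the whole prefix quadratically.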
An exploration of simple circuit models that illustrate how superposition arises in transformers, introducing toy examples and analyzing their behavior.
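For flavor, here is a minimal sketch of the kind of toy model this line of work studies (an assumed setup, not the article's exact code): n sparse features are squeezed through an m-dimensional bottleneck with m < n and reconstructed as ReLU(WᵀWx + b). Under sufficient sparsity, the trained W packs more than m features into the hidden space, which is the superposition phenomenon.

```python
import torch

# Toy bottleneck model: n sparse features, hidden width m < n.
n_features, m_hidden, sparsity = 20, 5, 0.05
W = torch.nn.Parameter(torch.randn(m_hidden, n_features) * 0.1)
b = torch.nn.Parameter(torch.zeros(n_features))
opt = torch.optim.Adam([W, b], lr=1e-2)

for step in range(2000):
    # Each feature is active (uniform in [0, 1]) with low probability.
    active = (torch.rand(256, n_features) < sparsity).float()
    x = active * torch.rand(256, n_features)
    x_hat = torch.relu(x @ W.T @ W + b)  # reconstruct through bottleneck
    loss = ((x - x_hat) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Columns of W with large norm are "represented" features; off-diagonal
# structure in W^T W shows features sharing directions, i.e. superposition.
print((W.norm(dim=0) > 0.5).sum().item(), "features represented in", m_hidden, "dims")
```

With low enough sparsity, the count printed at the end typically exceeds the hidden dimension, which is the core observation the toy models are built to exhibit.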
The core mechanics of deep learning, and how to think the PyTorch way. This guide provides a whirlwind tour of PyTorch’s methodologies and design principles, covering tensors, automatic differentiation, and training custom neural networks.
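A minimal sketch of the two ideas at the guide's core, tensors with automatic differentiation and a hand-written training loop (illustrative only, not code from the guide):

```python
import torch
from torch import nn

# Autograd: PyTorch records operations on tensors that require
# gradients and differentiates them on .backward().
x = torch.tensor([2.0], requires_grad=True)
y = x ** 3 + 2 * x
y.backward()
print(x.grad)  # dy/dx = 3x^2 + 2 = 14 at x = 2

# A custom network is a Module; training is the usual
# forward / backward / step loop.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

inputs, targets = torch.randn(64, 4), torch.randn(64, 1)
for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()   # autograd populates .grad on every parameter
    opt.step()        # optimizer updates parameters in place
```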
A unified memory stack that functions as a memristor as well as a ferroelectric capacitor is reported, enabling both energy-efficient inference and learning at the edge.
DeepMind introduces Ithaca, a deep neural network that can restore damaged ancient Greek inscriptions, identify their original location, and help establish their creation date; it was developed in collaboration with historians to advance the understanding of ancient history.
This article discusses the history of AI, the split between neural networks and symbolic AI, and the recent vindication of neurosymbolic AI through advances such as o3 and Grok 4. It argues that combining the strengths of both approaches is crucial for achieving true AI and highlights the resistance to neurosymbolic AI from some leaders in the deep learning field.
This tutorial introduces the essential topics of the PyTorch deep learning library in about one hour. It covers tensors, training neural networks, and training models on multiple GPUs.
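On the multi-GPU point, here is a hedged sketch using torch.nn.DataParallel, the simplest batch-splitting wrapper (the tutorial may well use DistributedDataParallel, the recommended and faster approach; this is only an illustration):

```python
import torch
from torch import nn

# DataParallel replicates the model on each visible GPU and splits
# every batch across them; gradients are gathered back automatically.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)   # one replica per GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

batch = torch.randn(128, 32, device=device)
logits = model(batch)   # each GPU processes a slice of the batch
print(logits.shape)     # torch.Size([128, 2])
```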
This book covers foundational topics in computer vision from an image processing and machine learning perspective. It aims to build the reader’s intuition through visualizations and is intended for undergraduate and graduate students, as well as experienced practitioners.
Newsweek interview with Yann LeCun, Meta's chief AI scientist, detailing his skepticism of current LLMs and his focus on Joint Embedding Predictive Architecture (JEPA) as the future of AI, emphasizing world modeling and planning capabilities.
AlexNet, a groundbreaking neural network developed in 2012 by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, has been released in source code form by the Computer History Museum in collaboration with Google. This model significantly advanced the field of AI by demonstrating a massive leap in image recognition capabilities.