An Apple study shows that large language models (LLMs) can improve their performance through a checklist-based reinforcement learning scheme, akin to the simple productivity trick of checking one's work.
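As a rough illustration of the idea (a sketch, not the study's actual method), a checklist-based reward can be computed by averaging per-item scores; the `judge` callable below is a hypothetical stand-in for whatever model or rubric scores each item:

```python
from typing import Callable, List

def checklist_reward(
    response: str,
    checklist: List[str],
    judge: Callable[[str, str], float],  # hypothetical: scores one item in [0, 1]
) -> float:
    """Average per-item checklist scores into a scalar reward for RL fine-tuning.

    `judge(response, item)` is assumed to return 1.0 if the response
    satisfies the checklist item, 0.0 if not (or a soft score in between).
    """
    if not checklist:
        return 0.0
    return sum(judge(response, item) for item in checklist) / len(checklist)
```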
This blog post details the training of 'Chess Llama', a small Llama model designed to play chess. It covers the inspiration behind the project (Chess GPT), the dataset used (the Lichess Elite database), the training process using Hugging Face Transformers, and the model's performance (an Elo rating of 1350-1400). It also includes links to try the model and view the source code.
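For orientation, a small Llama can be instantiated for causal language modeling over move sequences as sketched below; the hyperparameters and vocabulary size are illustrative assumptions, not the blog's actual configuration:

```python
# pip install transformers torch
from transformers import LlamaConfig, LlamaForCausalLM

# Assumed hyperparameters: a tiny model whose vocabulary is chess moves
# (e.g., UCI strings like "e2e4") plus a few special tokens.
config = LlamaConfig(
    vocab_size=2048,              # assumed: enough for a move-level vocabulary
    hidden_size=256,
    num_hidden_layers=8,
    num_attention_heads=8,
    intermediate_size=1024,
    max_position_embeddings=512,  # assumed: max moves per game
)
model = LlamaForCausalLM(config)
print(f"{model.num_parameters() / 1e6:.1f}M parameters")

# Training then follows the standard causal-LM recipe: tokenize games
# from the Lichess Elite database into move sequences and fit with
# transformers.Trainer on a next-move prediction objective.
```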
Andrej Karpathy discusses the transformative changes in software development driven by large language models (LLMs) and artificial intelligence, comparing the current era to the early days of computing. The article presents Software 3.0 as the latest evolution in software development paradigms, in which LLMs are programmable systems and natural-language prompts are the programs.
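A minimal sketch of the prompt-as-program idea (not taken from the article), assuming an OpenAI-compatible endpoint and an illustrative model name:

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# In Software 3.0 terms, the natural-language prompt *is* the program:
# an English specification replaces a hand-written classifier.
prompt = (
    "Classify the sentiment of this review as POSITIVE or NEGATIVE, "
    "and answer with one word only.\n\nReview: The battery died in a day."
)
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": prompt}],
)
print(resp.choices[0].message.content)
```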
The article explores the concept of consciousness, with Alan J. McComas arguing for a reductionist approach. It discusses how consciousness is a function of the brain, with neural activity often preceding awareness. The piece also examines the role of large language models in chatbots, highlighting potential manipulative techniques and the need for vigilance in interactions. Additionally, it addresses the integration of AI in research and evolving journal standards.
With creativity and a Jetson Orin Nano Super, hobbyists can build affordable robots that reason about and interact with the world. The article discusses building a robot from accessible hardware such as Arduino and Raspberry Pi, then upgrading to the more capable Jetson Orin Nano Super to run a large language model (LLM) onboard.
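As a concrete sketch of what "an LLM onboard" can look like, the snippet below runs a quantized model with llama-cpp-python and bridges to an Arduino over serial; the model path, serial port, and command protocol are assumptions, not the article's actual stack:

```python
# pip install llama-cpp-python pyserial
import serial
from llama_cpp import Llama

# Assumed model file and serial port; the article's exact setup may differ.
llm = Llama(model_path="models/tinyllama-q4.gguf", n_ctx=512)
arduino = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)

while True:
    reading = arduino.readline().decode().strip()  # e.g. "distance_cm=22"
    if not reading:
        continue
    out = llm(
        f"Sensor reading: {reading}. Reply with exactly one command: "
        "FORWARD, BACK, LEFT, RIGHT, or STOP.",
        max_tokens=4,
    )
    command = out["choices"][0]["text"].strip().upper()
    arduino.write((command + "\n").encode())  # Arduino firmware parses this
```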
Researchers tested large language models (LLMs) and humans on a comprehensive battery of theory of mind tasks, revealing differences in their performance on tasks such as understanding false beliefs, recognizing irony, and identifying faux pas.
In this paper, the authors discuss the challenges faced in developing the knowledge stack for the Companion cognitive architecture and share the tools, representations, and practices they have developed to overcome these challenges. They also outline potential next steps to allow Companion agents to manage their own knowledge more effectively.
This piece explores the dynamic relationship between language and cognition, and the role of large language models (LLMs) in expanding our understanding of the functional significance of language.
The paper proposes TnT-LLM, a two-phase framework that automates end-to-end label generation and assignment for text mining with large language models. In the first phase, an LLM iteratively produces and refines a label taxonomy using zero-shot, multi-stage reasoning; in the second, LLMs serve as data labelers, yielding training samples for lightweight supervised classifiers. Applied to the analysis of user intent and conversational domain for Bing Copilot, the framework achieves accurate, relevant label taxonomies and a favorable balance between accuracy and efficiency for classification at scale.
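The two phases can be sketched roughly as follows; the `llm` callable is a hypothetical stand-in for any chat-completion call, and the TF-IDF classifier is a simplification of the paper's lightweight classifiers:

```python
# pip install scikit-learn
from typing import Callable, List
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def build_taxonomy(texts: List[str], llm: Callable[[str], str],
                   rounds: int = 3, batch: int = 20) -> str:
    """Phase 1: iteratively propose and refine a label taxonomy."""
    taxonomy = ""
    for i in range(rounds):
        sample = "\n".join(texts[i * batch:(i + 1) * batch])
        taxonomy = llm(
            f"Current taxonomy:\n{taxonomy}\n\n"
            f"Refine it so it covers these texts:\n{sample}\n"
            "Return the updated label list, one label per line."
        )
    return taxonomy

def distill_classifier(texts: List[str], taxonomy: str,
                       llm: Callable[[str], str]):
    """Phase 2: LLM pseudo-labels train a lightweight classifier."""
    labels = [
        llm(f"Labels:\n{taxonomy}\n\nText: {t}\nAnswer with one label.")
        for t in texts
    ]
    clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(texts, labels)  # cheap to run at scale, unlike the LLM
    return clf
```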