SemanticScuttle - klotz.me » Tags: features

Tags: features*

Interpretable Causal Diffusion Language Models

Steerling-8B is an interpretable causal diffusion language model that combines masked diffusion language modeling with concept decomposition, enabling generation, attribution, steering, and extraction of hidden representations. It offers features like block-causal attention and decomposition of hidden states into known and unknown concepts.

2026-02-24 Tags: attribution, concepts, models, decomposition, features, diffusion, interpretability, explanations, explainability, llms, generative-ai by klotz

How do neural networks learn? A mathematical formula explains how they detect relevant patterns

Researchers from the University of California San Diego have developed a mathematical formula that explains how neural networks learn and detect relevant patterns in data, providing insight into the mechanisms behind neural network learning and enabling improvements in machine learning efficiency.

2025-01-07 Tags: neural networks, machine learning, features, xai, explainability, llm by klotz

Mapping the latent space of Llama 3.3 70B

Sparse autoencoders (SAEs) have been trained on Llama 3.3 70B, and the resulting interpreted model has been released via an API, enabling research and product development through feature space exploration and steering.

2024-12-25 Tags: llm, llama 3.3, sparse autoencoders, sae, latent space, features, xai, api, interpretability by klotz

The Geometry of Concepts: Sparse Autoencoder Feature Structure

This paper explores the structure of the feature point cloud discovered by sparse autoencoders in large language models. It investigates three scales: atomic, brain, and galaxy. The atomic scale involves crystal structures with parallelograms or trapezoids, improved by projecting out distractor dimensions. The brain scale focuses on modular structures, similar to neural lobes. The galaxy scale examines the overall shape and clustering of the point cloud.

2024-11-06 Tags: autoencoder, features, llm, scale by klotz

Anthropic decoded the vectors Claude uses to represent abstract concepts

Last week, Anthropic announced a significant breakthrough in our understanding of how large language models work. The research focused on Claude 3 Sonnet, the mid-sized version of Anthropic’s latest frontier model. Anthropic showed that it could transform Claude's otherwise inscrutable numeric representation of words into a combination of ‘features’, many of which can be understood by human beings. The vectors Claude uses to represent words can be understood as the sum of ‘features’—vectors that represent a variety of abstract concepts from immunology to coding errors to the Golden Gate Bridge. This research could prove useful for Anthropic and the broader industry, potentially leading to new tools to detect model misbehavior or prevent it altogether.

2024-06-06 Tags: anthropic, claude, large language model, vectors, features, abstract concepts, ontology by klotz
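
The "sum of features" idea in the entry above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the three-dimensional space, the feature names, and the `compose` helper are hypothetical, and real feature dictionaries learned by sparse autoencoders are far larger and not orthogonal.

```python
# Hypothetical toy dictionary: each feature is a direction in a tiny
# 3-dimensional activation space (real models use thousands of dimensions).
features = {
    "golden_gate_bridge": [1.0, 0.0, 0.0],
    "coding_error": [0.0, 1.0, 0.0],
    "immunology": [0.0, 0.0, 1.0],
}


def compose(strengths, features):
    """Build a vector as a weighted sum of feature directions."""
    dim = len(next(iter(features.values())))
    vector = [0.0] * dim
    for name, strength in strengths.items():
        for i, component in enumerate(features[name]):
            vector[i] += strength * component
    return vector


# A vector in which the bridge feature is strongly active and the
# coding-error feature is weakly active; immunology is inactive.
vector = compose({"golden_gate_bridge": 2.0, "coding_error": 0.5}, features)
print(vector)
```

Reading the vector back as "mostly bridge, a little coding error" is the kind of interpretation the summary describes, although recovering such features from a real model is the hard part and is not shown here.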

Mapping the Mind of a Large Language Model May 21, 2024

"...a feature that activates when Claude reads a scam email (this presumably supports the model’s ability to recognize such emails and warn you not to respond to them). Normally, if one asks Claude to generate a scam email, it will refuse to do so. But when we ask the same question with the feature artificially activated sufficiently strongly, this overcomes Claude's harmlessness training and it responds by drafting a scam email."

2024-05-21 Tags: claude, anthropic, llm, ontology, features, semantic web, spam, email by klotz
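
As a rough sketch of what "artificially activating" a feature could mean, one common simplification is to add a scaled copy of the feature's direction to an activation vector. This is an assumption for illustration only, not a description of Anthropic's actual procedure; the vectors, the `steer` helper, and the strength value are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def steer(activation, direction, strength):
    """Push an activation along a feature direction by a fixed amount."""
    unit = normalize(direction)
    return [a + strength * u for a, u in zip(activation, unit)]


# Hypothetical activation and feature direction in a 3-dimensional toy space.
activation = [0.2, -0.1, 0.4]
feature_direction = [0.0, 3.0, 4.0]

steered = steer(activation, feature_direction, 5.0)

# How strongly the feature is expressed before and after steering.
before = dot(activation, normalize(feature_direction))
after = dot(steered, normalize(feature_direction))
print(before, after)
```

In this toy setup the feature's expression rises by exactly the chosen strength, while the components orthogonal to the feature direction are left unchanged.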

Harmonics of Learning: A Mathematical Theory for the Rise of Fourier Features in Learning Systems Like Neural Networks

Fourier features arise in learning systems like neural networks because of the downstream invariance of the learner, which becomes insensitive to certain transformations, e.g., planar translation or rotation.

2024-05-17 Tags: ucsb, fourier, features, machine learning, cnn by klotz

With OpenAI’s Release of GPT-4o, Is ChatGPT Plus Still Worth It?

OpenAI's new GPT-4o model is now available for free, but ChatGPT Plus subscribers still get access to more prompts and newer features. This article compares what's available to both free and paid users.

2024-05-15 Tags: openai, chatgpt, gpt-4o, chatgpt plus, features by klotz

Cyclical Encoding: An Alternative to One-Hot Encoding for Time Series Features

This article discusses cyclical encoding as an alternative to one-hot encoding for time series features in machine learning. Cyclical encoding provides the same information to the model with significantly fewer features.

2024-05-04 Tags: machine learning, time series, features, cyclical encoding, one-hot encoding by klotz
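
A minimal sketch of the cyclical encoding mentioned in the entry above, assuming the usual sine/cosine formulation and an hour-of-day feature with period 24. The function names are hypothetical and are not taken from the linked article.

```python
import math


def cyclical_encode(value, period):
    """Map a periodic value to a (sin, cos) point on the unit circle."""
    angle = 2.0 * math.pi * (value % period) / period
    return (math.sin(angle), math.cos(angle))


def one_hot_encode(value, period):
    """Encode the same value as a one-hot vector of length `period`."""
    vector = [0] * period
    vector[value % period] = 1
    return vector


# Two features per hour instead of 24.
hour_23 = cyclical_encode(23, 24)
hour_0 = cyclical_encode(0, 24)
hour_12 = cyclical_encode(12, 24)

# Neighbouring hours across midnight stay close; opposite hours are far apart.
near = math.dist(hour_23, hour_0)
far = math.dist(hour_0, hour_12)
print(len(hour_0), len(one_hot_encode(0, 24)), near < far)
```

Both sine and cosine are needed: with only one of them, two different hours would share the same value. One-hot vectors, by contrast, place every pair of distinct hours at the same distance, so the wrap-around from 23:00 to 00:00 is lost.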

Guide to Choosing a New Graphics Card | TechSpot

2023-06-07 Tags: gpu, features, price by klotz
