SemanticScuttle - klotz.me » Tags: explainability

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

"scaling sparse autoencoders has been a major priority of the Anthropic interpretability team, and we're pleased to report extracting high-quality features from Claude 3 Sonnet, Anthropic's medium-sized production model.

We find a diversity of highly abstract features. They both respond to and behaviorally cause abstract behaviors. Examples of features we find include features for famous people, features for countries and cities, and features tracking type signatures in code. Many features are multilingual (responding to the same concept across languages) and multimodal (responding to the same concept in both text and images), as well as encompassing both abstract and concrete instantiations of the same idea (such as code with security vulnerabilities, and abstract discussion of security vulnerabilities)."

2024-05-24 Tags: explainability, llm, ontology, anthropic, claude 3 by klotz
Shap

2023-01-28 Tags: shapley functions, shap, explainability, machine learning by klotz
SHAP formula explained the way I wish someone explained it to me

2020-01-05 Tags: granger causality, explainability, machine learning, shapley functions, shap by klotz
SHAP: How to Interpret Machine Learning Models With Python | by Dario Radečić | Nov, 2020 | Towards Data Science

2020-11-10 Tags: shap, explainability, machine learning, tutorial, python by klotz
Shapash makes Machine Learning models understandable by everyone | Towards AI

2021-03-17 Tags: shapash, explainability, shap, lime, machine learning, svm by klotz
Shapley value - Wikipedia

2020-01-05 Tags: shapley functions, shap, machine learning, explainability by klotz
Understanding Causal Trees

2023-02-04 Tags: causality, explainability, machine learning, root cause analysis by klotz
Visualizing Decision Trees with Pybaobabdt | by Parul Pandey | Dec, 2021 | Towards Data Science

2022-01-06 Tags: pybaobabdt, python, decision tree, explainability, visualization, machine learning by klotz
What If We Could Easily Explain Overly Complex Models?

Generating counterfactual explanations got a lot easier with CFNOW, but what are counterfactual explanations, and how can I use them?

2023-09-30 Tags: machine learning, explainability, feature selection, lime, shap, counterfactual, cfnow, xai by klotz
What is the Purpose of Statistical Modelling? · Harvard Data Science Review

2019-07-12 Tags: david_hand, statistics, philosophy_of_science, data, models, inference, machine learning, ai, mental models, explainability, epistemology, data structures, algorithms, knowledge representation by klotz
