An article detailing how to build a flexible, explainable, and algorithm-agnostic ML pipeline with MLflow, focusing on preprocessing, model training, and SHAP-based explanations.
ASCVIT V1 aims to make data analysis easier by automating statistical calculations, visualizations, and interpretations.
Includes descriptive statistics, hypothesis tests, regression, time series analysis, clustering, and LLM-powered data interpretation.
- Accepts CSV or Excel files. Provides a data overview including summary statistics, variable types, and data points.
- Histograms, boxplots, pairplots, correlation matrices.
- t-tests, ANOVA, chi-square test.
- Linear, logistic, and multivariate regression.
- Time series analysis.
- k-means, hierarchical clustering, DBSCAN.
Integrates with an LLM (large language model) via Ollama for automated interpretation of statistical results.
This article demonstrates how to use Pandas plotting capabilities for common data visualization tasks, suggesting that Pandas can be sufficient for routine EDA without relying on libraries like Matplotlib.
Exploratory data analysis (EDA) is a powerful technique to understand the structure of word embeddings, the basis of large language models. In this article, we'll apply EDA to GloVe word embeddings and find some interesting insights.
This 20 minute video starts with a quick overview of Maxwell's equations for digital design engineers, and then spends the rest of the time showing an Agilent software product that simulates them using two different modelling techniques, mostly for GHz-le