SemanticScuttle - klotz.me » klotz: data science

klotz: data science*

"Talk to your data. Instantly analyze, visualize, and transform."

Analyzia is a data analysis tool that allows users to talk to their data, analyze, visualize, and transform CSV files using AI-powered insights without coding. It features natural language queries, Google Gemini integration, professional visualizations, and interactive dashboards, with a conversational interface that remembers previous questions. The tool requires Python 3.11+, a Google API key, and uses Streamlit, LangChain, and various data visualization libraries

2025-11-09 Tags: data analysis, visualization, llm, python, streamlit, langchain, google gemini, csv, data science, machine learning by klotz

The Pearson Correlation Coefficient, Explained Simply

A simple explanation of the Pearson correlation coefficient with examples

2025-11-03 Tags: statistics, data science, machine learning, python, pearson correlation, regression by klotz

Building a Rules Engine from First Principles

This article details how to build a lightweight and efficient rules engine by recasting propositional logic as sparse algebra. It guides readers through the process from theoretical foundations to practical implementation, introducing concepts like state vectors and algebraic operations for logical inference.

2025-11-02 Tags: rules engine, logic, sparse algebra, state algebra, logical inference, data science, algorithm, propositional logic, truth table by klotz

Building a Monitoring System That Actually Works

A step-by-step guide to catching real anomalies without drowning in false alerts.

2025-10-28 Tags: monitoring, kpis, anomalies, statistics, time series, data science, python by klotz

Polars vs pandas: What's the Difference?

This tutorial compares Polars and pandas, covering syntax, performance, LazyFrames, conversions, and plotting to help you choose the right library for your data analysis needs.

2025-10-16 Tags: polars, pandas, data analysis, dataframes, performance, lazyframes, python, data science by klotz

Prompt Engineering for Time-Series Analysis with Large Language Models

This article explores how prompt engineering can be used to improve time-series analysis with Large Language Models (LLMs), covering core strategies, preprocessing, anomaly detection, and feature engineering. It provides practical prompts and examples for various tasks.

2025-10-16 Tags: llm, prompt engineering, time series, forecasting, anomaly detection, feature engineering, data science, machine learning, production engineering, observability by klotz

I Was Wrong: Start Simple, Then Move to More Complex

The author discusses a shift in approach to clustering mixed data, advocating for starting with the simpler Gower distance metric before resorting to more complex embedding techniques like UMAP. They introduce 'Gower Express', an optimized and accelerated implementation of Gower.

2025-09-05 Tags: clustering, data science, machine learning, gower distance, umap, gower express, mixed data, python, scikit-learn, data analysis, shrunk by klotz

Hands On Time Series Modeling of Rare Events, with Python

This article details a hands-on approach to modeling rare events in time series data using Python. It covers data exploration, defining extreme events, fitting distributions (GEV, Weibull, Gumbel), and evaluating model performance using metrics like log-likelihood, AIC, and BIC. The example uses weather data and provides code snippets for implementation.

2025-09-05 Tags: data science, time series, rare events, python, gev, weibull, gumbel, extreme value theory, data visualization, statistics by klotz

A Visual Guide to Tuning Random Forest Hyperparameters

This article explores the impact of hyperparameters on random forests, both in terms of performance and visual representation. It compares the performance of a default random forest with tuned decision trees and examines the effects of various hyperparameters like `n_estimators`, `max_depth`, and `ccp_alpha` using visualizations of individual trees, predictions, and errors.

2025-09-05 Tags: data science, machine learning, random forests, hyperparameter tuning, python, data visualization, scikit-learn, decision trees, james gibbins by klotz

Using Google’s LangExtract and Gemma for Structured Data Extraction

Extracting structured information effectively and accurately from long unstructured text with LangExtract and LLMs. This article explores Google’s LangExtract framework and its open-source LLM, Gemma 3, demonstrating how to parse an insurance policy to surface details like exclusions.

2025-08-27 Tags: data science, large language models, llm, machine learning, structured data, langextract, gemma, data extraction by klotz

First / Previous / Next / Last / Page 1 of 0

SemanticScuttle - klotz.me

klotz: data science*

Linked Tags

Related Tags