Write Pandas Like a Pro With Method Chaining Pipelines
Master method chaining, assign(), and pipe() to write cleaner, testable, production-ready Pandas code
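As a rough illustration of what the headline refers to, here is a minimal sketch of a chained pipeline that combines `pipe()` and `assign()`. The column names, the sample data, and the helper functions `drop_missing_prices` and `add_revenue` are hypothetical and are not taken from the article itself.

```python
import pandas as pd


def drop_missing_prices(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning step: remove rows that have no price."""
    return df.dropna(subset=["price"])


def add_revenue(df: pd.DataFrame, tax_rate: float = 0.0) -> pd.DataFrame:
    """Hypothetical feature step: assign() returns a new frame with an extra column."""
    return df.assign(revenue=lambda d: d["price"] * d["quantity"] * (1 + tax_rate))


# Made-up sample data, used only for illustration.
raw = pd.DataFrame(
    {
        "product": ["a", "b", "c", "a"],
        "price": [10.0, None, 5.0, 20.0],
        "quantity": [1, 2, 3, 4],
    }
)

# Each step takes a DataFrame and returns a new one, so the whole
# transformation reads top to bottom and leaves `raw` unmodified.
result = (
    raw
    .pipe(drop_missing_prices)
    .pipe(add_revenue, tax_rate=0.1)
    .groupby("product", as_index=False)
    .agg(total_revenue=("revenue", "sum"))
    .sort_values("total_revenue", ascending=False)
    .reset_index(drop=True)
)
```

Because each helper is a plain function from DataFrame to DataFrame, it can be unit-tested on its own before being placed in the chain.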
This article explores five Python scripts designed to streamline and automate the process of feature selection in machine learning projects. Feature selection is crucial for improving model performance, reducing complexity, and identifying the most impactful variables.
The scripts cover techniques like filtering constant features, eliminating redundant features through correlation analysis, identifying significant features using statistical tests, ranking features with model-based importance scores, and optimizing feature subsets with recursive elimination. Each script is practical and minimal, and produces a detailed report to aid in understanding the selection process.
These tools are valuable for data scientists looking to systematically evaluate feature importance and build more efficient and accurate models.
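To make the first two techniques in that summary concrete, the sketch below shows one possible way to filter constant features and drop highly correlated ones with pandas. It is not the article's code: the function names, the 0.95 threshold, and the sample columns are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def drop_constant_features(X: pd.DataFrame) -> pd.DataFrame:
    """Keep only columns that take more than one distinct value."""
    keep = [col for col in X.columns if X[col].nunique(dropna=False) > 1]
    return X[keep]


def drop_correlated_features(X: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Drop the later column of any pair whose absolute Pearson correlation exceeds threshold."""
    corr = X.corr().abs()
    # Look only at the strict upper triangle so each pair is considered once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop)


# Made-up feature matrix, used only for illustration.
X = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0],
        "a_scaled": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with "a"
        "constant": [7.0, 7.0, 7.0, 7.0],  # carries no information
        "b": [4.0, 1.0, 3.0, 2.0],
    }
)

selected = X.pipe(drop_constant_features).pipe(drop_correlated_features, threshold=0.95)
```

In this toy example only `a` and `b` survive. Which column of a correlated pair gets dropped depends on column order here, which is a simplification; a real script might instead keep the feature more strongly related to the target.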
This course takes you from Python fundamentals to AI Agent development, covering core Python, NumPy, Pandas, SQL, Flask, FastAPI, LLMs, and open-source models via HuggingFace.
A gentle introduction to Causal Machine Learning, covering the core concepts, differences from traditional ML, and practical applications with Python.
This article covers five Python scripts designed to automate impactful feature engineering tasks, including encoding categorical features, transforming numerical features, generating interactions, extracting datetime features, and selecting features automatically.
This article details how to build a 100% local MCP (Model Context Protocol) client using LlamaIndex, Ollama, and LightningAI. It provides a code walkthrough and explanation of the process, including setting up an SQLite MCP server and a locally served LLM.
This article is a year-end recap from Towards Data Science (TDS) highlighting the most popular articles published in 2025. The year was heavily focused on AI Agents and their development, with significant interest in related frameworks like MCP and contextual engineering. Beyond agents, Python remained a crucial skill for data professionals, and there was a strong emphasis on career development within the field. The recap also touches on the evolution of RAG (Retrieval-Augmented Generation) into more sophisticated context-aware systems and the importance of optimizing LLM (Large Language Model) costs. TDS also celebrated its growth as an independent publication and its Author Payment
"Talk to your data. Instantly analyze, visualize, and transform."
Analyzia is a data analysis tool that lets users talk to their data: they can analyze, visualize, and transform CSV files using AI-powered insights, without writing code. It features natural language queries, Google Gemini integration, professional visualizations, and interactive dashboards, with a conversational interface that remembers previous questions. The tool requires Python 3.11+ and a Google API key, and it uses Streamlit, LangChain, and various data visualization libraries.
A simple explanation of the Pearson correlation coefficient with examples
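For reference, the Pearson coefficient is the covariance of two variables divided by the product of their standard deviations. The sketch below computes it from that definition using only the standard library; the function name `pearson_r` and the sample values are illustrative and do not come from the linked article.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations; the 1/n factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


perfect = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # exact linear relationship
weak = pearson_r([1, 2, 3, 4], [4, 1, 3, 2])     # weak negative relationship
```

The first pair lies on a straight line, so the coefficient is 1; the second gives -0.4, a weak negative association.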
A step-by-step guide to catching real anomalies without drowning in false alerts.