Tags: data science

    • TabPFN is a novel foundation model designed for small- to medium-sized tabular datasets, with up to 10,000 samples and 500 features.
    • It uses a transformer-based architecture and in-context learning (ICL) to outperform traditional gradient-boosted decision trees on these datasets.
  1. The article discusses methods for data scientists to answer 'what if' questions regarding the impact of actions or events without having conducted prior experiments. It focuses on creating counterfactual predictions using machine learning techniques and compares a proposed method with Google's Causal Impact. The approach involves using historical data and control groups to estimate the effect of modifications, addressing challenges such as seasonality, confounders, and temporal drift.

  2. The article explores 11 essential tips for leveraging the full potential of the Pandas library to boost productivity and streamline workflows in handling and analyzing complex datasets. It uses a real-world dataset from Kaggle's Airbnb listings to illustrate techniques such as chunked processing and parallel execution.

  3. Despite its power, partial correlation remains underrated in data science. This tool addresses the main limitation of simple correlation by accounting for the influence of other variables.

  4. This article provides an overview of feature selection in machine learning, detailing methods that maximize model accuracy and minimize computational costs, and introduces a novel method called History-based Feature Selection (HBFS).

  5. Mastering specific Pandas functions, particularly lesser-known methods for data transformation and analysis, can enhance the data manipulation skills of data scientists using Python.

  6. An article discussing ten predictions for the future of data science and artificial intelligence in 2025, covering topics such as AI agents, open-source models, safety, and governance.

  7. Tips on how to get started, write your first article, and get noticed on Medium with a focus on building a portfolio, community, networking, and earning money.

    2024-12-28 by klotz
  8. Discover the best Python libraries of 2024, categorized into general use and AI/ML/data tools, featuring innovative and practical solutions for developers and data scientists.

  9. Turn your Pandas data frame into a knowledge graph using LLMs. Learn how to build your own LLM graph-builder, implement LLMGraphTransformer by LangChain, and perform QA on your knowledge graph.
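
The partial correlation idea summarized in entry 3 above can be illustrated with a small sketch. This is not code from the linked article; the function names `pearson` and `partial_corr` and the toy data are assumptions made for illustration. It uses the standard first-order formula, which removes the influence of a single control variable `z` from the correlation between `x` and `y`.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (hypothetical): z drives both x and y, and the leftover
# noise in x and y is constructed to be unrelated.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 2, 5, 6, 5, 6, 9]        # z plus noise
y = [3, 3, 7, 7, 9, 13, 13, 17]     # 2*z plus different noise

print("simple correlation: ", round(pearson(x, y), 2))   # roughly 0.90
print("partial correlation:", round(partial_corr(x, y, z), 2))
```

In this toy data the simple correlation between `x` and `y` is high only because both depend on `z`. Once `z` is controlled for, the partial correlation is zero up to floating-point rounding, which is the limitation of simple correlation that the entry describes.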
