klotz: feature engineering*


  1. PySpark for time-series data, discussing data ingestion, extraction, and visualization with practical implementation code.
  2. This article provides a comprehensive guide to performing exploratory data analysis on time series data, with a focus on feature engineering.
  3. 2021-04-12, by klotz
  4. Cool question - and yes, you're right that you can use the summary command to inspect feature_importances for some of the models (e.g. RandomForestClassifier). Other models may not support the same type of summary, however.

    You should also check out the FieldSelector algorithm, which is really useful for this problem. Under the hood, it uses ANOVA F-tests to estimate the linear dependency between each feature and the target. Although it's univariate (it does not capture any interactions between variables), it can still provide a good baseline for choosing a handful of features from hundreds.
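
The two ideas in the answer above (univariate F-test selection and model-based importances) can be sketched outside Splunk with scikit-learn. This is only an illustration under stated assumptions: the data is synthetic, `SelectKBest` with `f_classif` stands in for the kind of ANOVA F-test scoring the answer describes, and nothing here is claimed to be FieldSelector's actual implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in for "hundreds of features": 200 columns, 5 of them informative.
# (Assumption for illustration only; real data would come from your own fields.)
X, y = make_classification(
    n_samples=500,
    n_features=200,
    n_informative=5,
    n_redundant=0,
    random_state=0,
)

# Univariate ANOVA F-test: every feature is scored against the target on its own,
# so interactions between features are ignored.
selector = SelectKBest(score_func=f_classif, k=10).fit(X, y)
selected = np.flatnonzero(selector.get_support())  # column indices of the 10 kept features

# Model-based alternative: impurity-based importances from a random forest.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
top_by_forest = np.argsort(forest.feature_importances_)[::-1][:10]

print("Kept by F-test:", selected)
print("Top by forest importance:", top_by_forest)
```

Comparing the two index lists is a quick sanity check: features that rank highly under both the univariate test and the forest are usually reasonable baseline candidates, while features picked only by the forest may matter mainly through interactions.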


