klotz: forecasting*

  1. Timer-S1 is a scalable Mixture-of-Experts time series model with 8.3B parameters that uses serial scaling and novel TimeMoE blocks to improve long-term forecasting accuracy.
    We introduce Timer-S1, a strong Mixture-of-Experts (MoE) time series foundation model with 8.3B total parameters, 0.75B activated parameters for each token, and a context length of 11.5K. To overcome the scalability bottleneck in existing pre-trained time series foundation models, we perform Serial Scaling in three dimensions: model architecture, dataset, and training pipeline. Timer-S1 integrates sparse TimeMoE blocks and generic TimeSTP blocks for Serial-Token Prediction (STP), a generic training objective that adheres to the serial nature of forecasting. The proposed paradigm introduces serial computations to improve long-term predictions while avoiding costly rolling-style inference and pronounced error accumulation in the standard next-token prediction. Pursuing a high-quality and unbiased training dataset, we curate TimeBench, a corpus with one trillion time points, and apply meticulous data augmentation to mitigate predictive bias. We further pioneer a post-training stage, including continued pre-training and long-context extension, to enhance short-term and long-context performance. Evaluated on the large-scale GIFT-Eval leaderboard, Timer-S1 achieves state-of-the-art forecasting performance, attaining the best MASE and CRPS scores as a pre-trained model. Timer-S1 will be released to facilitate further research.
  2. This tutorial explores how to use LLM embeddings as features in time series forecasting models. It covers generating embeddings from time series descriptions, preparing data, and evaluating the performance of models with and without LLM embeddings.
  3. This paper provides a theoretical analysis of Transformers' limitations for time series forecasting through the lens of In-Context Learning (ICL) theory, demonstrating that even powerful Transformers often fail to outperform simpler models like linear models. The study focuses on Linear Self-Attention (LSA) models and shows that they cannot achieve lower expected MSE than classical linear models for in-context forecasting, and that predictions collapse to the mean exponentially under Chain-of-Thought inference.
  4. This article explores how prompt engineering can be used to improve time-series analysis with Large Language Models (LLMs), covering core strategies, preprocessing, anomaly detection, and feature engineering. It provides practical prompts and examples for various tasks.
  5. IBM’s new foundation model, TSPulse, goes beyond standard forecasting tasks: it can detect anomalies, fill in missing values, classify data, and search for recurring patterns. It is also small enough to run on a laptop.
  6. This paper introduces Toto, a time series forecasting foundation model with 151 million parameters, and BOOM, a large-scale benchmark for observability time series data. Toto uses a decoder-only architecture and is trained on a large corpus of observability, open, and synthetic data. Both Toto and BOOM are open-sourced under the Apache 2.0 License.
  7. This article rounds up notable time-series forecasting papers published between 2023 and 2024. It highlights five influential papers, including a case study from the online fashion industry, a review of forecast reconciliation, and new deep learning models such as TSMixer and CARD. The article emphasizes advances in forecasting models, challenges in retail forecasting, and improvements in hierarchical forecasting methods.
  8. The article discusses methods for data scientists to answer 'what if' questions regarding the impact of actions or events without having conducted prior experiments. It focuses on creating counterfactual predictions using machine learning techniques and compares a proposed method with Google's Causal Impact. The approach involves using historical data and control groups to estimate the effect of modifications, addressing challenges such as seasonality, confounders, and temporal drift.
  9. TimesFM is a pretrained time-series foundation model developed by Google Research for time-series forecasting. It focuses on point forecasts for univariate time series, with context lengths of up to 512 time points, any horizon length, and an optional frequency indicator.
  10. This article discusses Time-MoE, an open-source time-series foundation model that uses a Mixture-of-Experts (MoE) architecture to improve forecasting accuracy while reducing computational costs. Key contributions include the Time-300B dataset, scaling laws for time series, and the Time-MoE architecture.
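
Entry 1 reports results in terms of MASE (mean absolute scaled error). As a rough illustration of what that metric measures, here is a minimal sketch assuming the common definition: the forecast's mean absolute error divided by the in-sample mean absolute error of a seasonal naive forecast. The function name `mase` and the `season` parameter are illustrative choices, not taken from any of the bookmarked papers or from the GIFT-Eval code.

```python
def mase(train, actual, forecast, season=1):
    """Mean absolute scaled error (illustrative sketch, not a library API).

    train    -- historical values used to compute the naive-forecast scale
    actual   -- observed values over the forecast horizon
    forecast -- predicted values over the same horizon
    season   -- lag of the naive forecast (1 = previous value)
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(actual) == 0:
        raise ValueError("actual and forecast must not be empty")
    if len(train) <= season:
        raise ValueError("train must be longer than the seasonal lag")

    # In-sample error of the seasonal naive forecast: y[t] predicted by y[t - season].
    naive_errors = [abs(train[i] - train[i - season]) for i in range(season, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")

    # Out-of-sample mean absolute error of the forecast being evaluated.
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


if __name__ == "__main__":
    history = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(mase(history, actual=[6.0, 7.0], forecast=[6.5, 6.5]))
```

Under this definition, a value below 1 means the forecast beats the naive baseline on average, and a value above 1 means it does worse.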
