SemanticScuttle - klotz.me » klotz: statistical models

Why Do Transformers Fail to Forecast Time Series In-Context?

This paper analyzes Transformers' limitations in time series forecasting through the lens of In-Context Learning (ICL) theory, showing that even powerful Transformers often fail to outperform simpler baselines such as linear models. The study focuses on Linear Self-Attention (LSA) models and shows that, for in-context forecasting, they cannot achieve a lower expected MSE than classical linear models, and that under Chain-of-Thought inference their predictions collapse to the mean exponentially.

2025-10-17 Tags: time series, forecasting, transformers, in-context learning, linear, self-attention, machine learning, statistical models, llm, production engineering by klotz
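
The collapse-to-the-mean behaviour mentioned in the summary has a familiar analogue in classical linear forecasting, which the short sketch below tries to illustrate. It is not the paper's LSA construction or proof. It assumes that Chain-of-Thought inference here means feeding each prediction back in as the next input, and it uses a zero-mean AR(1) process as a stand-in. The function names (`simulate_ar1`, `fit_ar1`, `rollout`) and the parameters (`phi=0.8`, `n=2000`, `horizon=30`) are hypothetical choices made for illustration only.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate a zero-mean AR(1) series: x[t+1] = phi * x[t] + noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def fit_ar1(x):
    """Least-squares estimate of phi from consecutive pairs (no intercept)."""
    num = sum(a * b for a, b in zip(x[:-1], x[1:]))
    den = sum(a * a for a in x[:-1])
    return num / den


def rollout(phi_hat, last_value, horizon):
    """Iterated forecasting: each prediction becomes the next input."""
    preds = []
    current = last_value
    for _ in range(horizon):
        current = phi_hat * current
        preds.append(current)
    return preds


series = simulate_ar1(phi=0.8, n=2000)
phi_hat = fit_ar1(series)
forecasts = rollout(phi_hat, last_value=series[-1], horizon=30)

# Print a few horizons to see the magnitude shrink toward the process mean (0).
for k in (1, 5, 10, 20, 30):
    print(k, forecasts[k - 1])
```

In this toy setting the forecast at horizon k equals `phi_hat ** k` times the last observation, so its magnitude shrinks geometrically toward the process mean of zero whenever the fitted coefficient lies strictly between 0 and 1.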
