Time-Series Forecasting Enters a New Era With Few-Shot Learning
Executive Summary
Google Research has announced a breakthrough in time-series forecasting: TimesFM-ICF, a foundation model that supports few-shot learning. This marks a significant step toward reducing the need for task-specific training and democratizing access to high-performance forecasting. The model performs on par with traditional fine-tuned models while requiring less data engineering, making it highly attractive for applications across retail, logistics, energy, finance, and beyond.
Few-Shot Learning Moves from Text to Time
Few-shot learning has been primarily associated with large language models (LLMs), which can adapt to new tasks from a handful of in-context examples without any additional training. Now, that same paradigm is being applied to a different but equally pervasive challenge — forecasting time-series data.
In a paper presented at ICML 2025, Google introduced TimesFM-ICF (In-Context Fine-Tuning), a major evolution of its original Time-series Foundation Model, TimesFM. Where TimesFM offered robust zero-shot forecasting, TimesFM-ICF takes a critical step further by enabling few-shot adaptation at inference time, bridging the gap between flexibility and precision.
This is not just a technical refinement. It’s a re-engineering of how forecasting can be done — more agile, more scalable, and more inclusive.
Why Time-Series Forecasting Matters (and Why It's Hard)
From electricity consumption to stock volatility, time-series forecasting is vital across sectors. Traditionally, forecasting models have been handcrafted for each new task or dataset — costly, time-consuming, and inefficient for fast-moving businesses.
Foundation Models like TimesFM changed that by enabling zero-shot learning — pre-trained models that could be reused across domains. But zero-shot has limits; adding in just a little targeted information (i.e., few-shot learning) can bring substantial improvements. The problem is doing that without falling back into the cumbersome process of supervised fine-tuning.
Google's answer? A clever mix of continued pre-training, architectural finesse, and token engineering — all tailored to time-series data.
What Makes TimesFM-ICF Different
At the core of TimesFM-ICF’s innovation is its ability to learn from multiple "in-context" examples during inference. Here’s how it works (a minimal code sketch follows the list):
- Special Separator Tokens: These custom tokens help the model differentiate between the input it needs to predict and related historical examples. Think of them as punctuation marks telling the model where one idea ends and another begins.
- Continued Pre-training: Rather than fine-tuning the model separately for each downstream task, Google applied continued pre-training on a new dataset filled with in-context examples and separator tokens. This teaches the model to draw from these examples the way an LLM parses a prompt.
- Efficient Decoder Architecture: TimesFM-ICF builds on the decoder-only transformer architecture — familiar from LLMs — but adapts it for forecasting, with patches of 32 time points per input token and output predictions of up to 128 time points per step.
Performance: As Good As Fine-Tuning, Without the Fuss
The model was evaluated on 23 datasets unseen during training, each containing multiple independent time series. It achieved an average 6.8% accuracy improvement over the base TimesFM and, notably, matched the performance of fully fine-tuned models (TimesFM-FT).
That’s a big deal. It means organizations can now achieve top-tier forecasting performance without having to embark on costly training regimes.
| Model | Accuracy Gain vs. Base Model |
|---|---|
| TimesFM-Base | — |
| TimesFM-FT | +6.8% |
| TimesFM-ICF | +6.8% |
In addition to accuracy, TimesFM-ICF made better use of a longer context: it becomes more precise as more examples are fed to it, a property consistent with theoretical expectations but not guaranteed in practice. That positions it favorably against similarly scaled models without in-context capabilities. The sweep sketched below is one simple way to probe this behavior.
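The following is a minimal evaluation sketch for such a sweep. Note that `forecast` here is a stand-in stub (a naive last-value forecaster), not the real model, so only the protocol is meaningful, not the numbers it prints.

```python
import numpy as np

def mase(y_true: np.ndarray, y_pred: np.ndarray, insample: np.ndarray) -> float:
    """Mean absolute scaled error against a naive one-step baseline."""
    scale = np.mean(np.abs(np.diff(insample)))
    return float(np.mean(np.abs(y_true - y_pred)) / scale)

def forecast(history: np.ndarray, context: list[np.ndarray], horizon: int) -> np.ndarray:
    """Stub standing in for a few-shot model call; it ignores the
    context and simply repeats the last observed value."""
    return np.full(horizon, history[-1])

rng = np.random.default_rng(0)
history = rng.normal(size=512).cumsum()
actuals = history[-1] + rng.normal(size=128).cumsum()
examples = [rng.normal(size=512).cumsum() for _ in range(8)]

# Sweep the number of in-context examples k; with a genuine few-shot
# forecaster, the error should shrink (or at least not grow) with k.
for k in (0, 1, 2, 4, 8):
    preds = forecast(history, examples[:k], horizon=128)
    print(f"k={k}: MASE={mase(actuals, preds, history):.3f}")
```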
Strategic Implications for the AI Ecosystem
For Businesses
TimesFM-ICF eliminates a major bottleneck: the need for custom-tailored forecasting solutions. Retailers can now predict seasonal product demand from just a few relevant historical series. Energy utilities can update forecasts based on recent locality-specific usage patterns — all without expensive model retraining.
Essentially, the ease of text prompting in LLMs is coming to numerical forecasting.
For the AI Research Community
This work is a notable example of the cross-pollination between NLP and forecasting. It borrows structural ideas from LLMs and adapts them to the structure of time-series data. It also opens up new research questions about how different data modalities — language, images, numbers — can share similar learning protocols.
For Smaller Players and Startups
One of the most exciting elements is that TimesFM-ICF lowers the barrier to sophisticated forecasting. SMEs (small and medium-sized enterprises) without large ML teams can leverage adaptable, high-performance models out of the box — if Google or others release APIs or open models built on this framework.
This is a democratization moment for industrial-grade forecasting.
What's Next? Broader Applications, Smarter Contexts
The potential impact of TimesFM-ICF goes beyond the lab bench:
- Automated Prompt Engineering: Future research may focus on systems that automatically select the most relevant time series as in-context examples, much like retrieval-augmented LLMs (a toy retrieval sketch follows this list).
- Cross-Domain Models: Can this same approach work across hybrid datasets — combining weather, sales, traffic, and social signals — to create more comprehensive forecasters?
- Embedded into Systems: Forecasting-as-a-Service platforms and business intelligence dashboards could natively integrate models like TimesFM-ICF, enabling real-time updates that adapt automatically to user-selected contexts.
- Continued Generalization: Expect further research on how pre-training datasets can be diversified, and how separator tokens might evolve to encode metadata or source hierarchy.
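As a flavor of what automated example selection could look like, here is a toy retrieval sketch: candidate series are z-normalized so similarity reflects shape rather than scale, and the closest ones are returned as in-context examples. The function names and the distance metric are assumptions chosen for illustration, not anything TimesFM-ICF ships with.

```python
import numpy as np

def znorm(x: np.ndarray) -> np.ndarray:
    """Z-normalize so retrieval compares shape, not scale or offset."""
    return (x - x.mean()) / (x.std() + 1e-8)

def retrieve_context(target: np.ndarray, pool: list[np.ndarray], k: int = 4) -> list[np.ndarray]:
    """Return the k pool series whose recent tail is closest (Euclidean
    distance on z-normalized values) to the target history."""
    n = len(target)
    t = znorm(target)
    dists = [np.linalg.norm(t - znorm(c[-n:])) for c in pool]
    return [pool[i] for i in np.argsort(dists)[:k]]

rng = np.random.default_rng(1)
target = rng.normal(size=256).cumsum()
pool = [rng.normal(size=512).cumsum() for _ in range(50)]
context = retrieve_context(target, pool, k=4)  # feed these to the forecaster
```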
Final Thoughts
TimesFM-ICF is more than a better model — it signals a shift in how forecasting can be operationalized. By introducing few-shot capabilities into time-series forecasting, Google brings a powerful balance of generality and specificity. This approach is likely to ripple through AI product development, enterprise analytics, and even how non-technical users interact with predictive systems.
The future of forecasting may not need fine-tuning at all — just a handful of good examples.