Since its introduction in 2017 (Vaswani et al., 2017), the Transformer model has excelled in a wide range of tasks involving natural language processing and computer vision.
We investigate the Transformer model to address an important sequence learning problem in finance: time series forecasting. The underlying idea is to use the attention mechanism and the seq2seq architecture in the Transformer model to capture long-range dependencies and interactions across assets and perform multi-step time series forecasting in finance. The first part of this article systematically reviews the Transformer model while highlighting its strengths and limitations. In particular, we focus on the attention mechanism and the seq2seq architecture, which are at the core of the Transformer model. Inspired by the concept of weak learners in ensemble learning, we identify the diversification benefit of generating a collection of low-complexity models with simple structures and fewer features.
You can now read the full whitepaper at the link below