Implementing Transformer Models

Practical Series, Heinrich-Heine-Universität Düsseldorf, Department of Computer Science, 2024

  • Detailed study and implementation of the Transformer model, building an intuitive understanding of its architecture and the attention mechanism.
  • Hands-on work with the computational aspects of the Transformer model, including the scaling of dot products in attention and parameter sharing, such as shared embedding vectors.
  • Training a Transformer model tailored for machine translation, including the preparation and pre-processing of a translation dataset.
  • Exploration of GPU utilisation, parallel training strategies and effective resource allocation for machine learning training.
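
As an illustration of the dot-product scaling named in the topics above, the following is a minimal sketch of scaled dot-product attention in plain Python. It is not taken from the course material: the function names `softmax` and `scaled_dot_product_attention`, and the list-of-lists representation of queries, keys and values, are assumptions chosen for readability. A real implementation would normally use a tensor library and batched matrix products.

```python
import math


def softmax(scores):
    """Turn a list of scores into weights that sum to 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Hypothetical helper: attention over lists of vectors.

    queries: n_q vectors of length d_k
    keys:    n_k vectors of length d_k
    values:  n_k vectors of length d_v
    Returns n_q output vectors of length d_v.
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    # Dividing by sqrt(d_k) keeps the dot products from growing with the
    # key dimension, which would otherwise saturate the softmax.
    scale = 1.0 / math.sqrt(d_k)

    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        )
    return outputs
```

When all keys are identical, every key receives the same score, so the output for each query is simply the mean of the value vectors. This makes a convenient sanity check for the sketch.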