Implementing Transformer Models

Practical Series, Heinrich-Heine-Universität Düsseldorf, Department of Computer Science, 2024

Detailed study and implementation of the Transformer model, building an intuitive understanding of its unique architecture and the attention mechanism.

Practical application of computational aspects of the Transformer model, including the scaling of dot products and parameter-sharing mechanisms, such as shared embedding vectors.
Training a Transformer model tailored for machine translation, including the preparation and pre-processing of a translation dataset.
Exploration of GPU utilisation, parallel training strategies and effective resource allocation for machine learning training.
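
As an illustration of the dot-product scaling mentioned above, the following is a minimal sketch of scaled dot-product attention in plain Python. It is not taken from the course material: the function names `softmax` and `scaled_dot_product_attention` and the list-of-lists representation are assumptions chosen for readability, and a real implementation would normally use a tensor library on a GPU.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Illustrative sketch (hypothetical helper, not the course's code).

    queries: list of vectors of size d_k
    keys:    list of vectors of size d_k
    values:  list of vectors, one per key
    Returns (outputs, weights).
    """
    d_k = len(keys[0])
    # Dividing by sqrt(d_k) keeps the dot products from growing with the
    # vector dimension, which would otherwise saturate the softmax.
    scale = math.sqrt(d_k)

    weights = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights.append(softmax(scores))

    d_v = len(values[0])
    # Each output is a weighted average of the value vectors.
    outputs = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(d_v)]
        for row in weights
    ]
    return outputs, weights


queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
outputs, weights = scaled_dot_product_attention(queries, keys, values)
```

Each row of `weights` sums to one, and each query attends most strongly to the key it matches, so the first output lies closer to the first value vector than to the second.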


Carel van Niekerk
