Implementing Transformer Models
Practical Series, Heinrich-Heine-Universität Düsseldorf, Department of Computer Science, 2024
- Detailed study and implementation of the Transformer model, building an intuitive understanding of its unique architecture and the attention mechanism.
- Hands-on work with the computational details of the Transformer model, including the scaling of dot products in attention and parameter sharing, such as shared embedding vectors.
- Training a Transformer model tailored for machine translation, including the preparation and pre-processing of a translation dataset.
- Exploration of GPU utilisation, parallel training strategies and effective resource allocation for machine learning training.
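
As an illustration of the dot-product scaling mentioned in the list above, here is a minimal sketch of scaled dot-product attention. It is not the course's own code: the function names, the toy inputs, and the choice of plain Python lists (to avoid any dependency) are assumptions made for this example. Scores are divided by the square root of the key dimension before the softmax.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Attention over lists of vectors (hypothetical helper for illustration).

    queries: n_q vectors of length d_k
    keys:    n_k vectors of length d_k
    values:  n_k vectors of length d_v
    Returns (outputs, weights): n_q vectors of length d_v, and the
    n_q x n_k attention weights.
    """
    d_k = len(keys[0])
    scale = 1.0 / math.sqrt(d_k)  # the scaling of the dot products
    d_v = len(values[0])
    outputs, all_weights = [], []
    for q in queries:
        # Scaled dot product of this query with every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy inputs, chosen only for illustration.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
outputs, weights = scaled_dot_product_attention(queries, keys, values)
```

Each row of `weights` sums to 1, so each output is a convex combination of the value vectors. Without the scaling factor, dot products grow with the vector dimension and tend to push the softmax towards very peaked distributions.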