Transformers, explained: Understand the model behind GPT, BERT, and T5