Day 34: Encoder-only transformers (BERT) & decoder-only transformers (ChatGPT)

Check out our live web application for this program: https://newdaynewlearning.netlify.app/

[!NOTE] There is a game waiting for you today; the best/first answer can win an exciting gift🎁

More about me:

I am just a colleague of yours, learning and exploring how math, business, and technology can help us make better decisions in the field of data science.

Topic: Encoder-only transformers (BERT) & decoder-only transformers (ChatGPT)

Encoder-only transformers (BERT)


  • Word Embedding: converts the words into vectors (numbers)
  • Positional Encoding: keeps track of word order
  • Self-Attention: establishes relationships among all the words in the input
  • Combining all three, we get context-aware embeddings that help cluster similar sentences and documents (see the sketch after this list)
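To make these three ingredients concrete, here is a minimal sketch in NumPy of how word embeddings, positional encodings, and (unmasked) self-attention combine into context-aware embeddings. The toy vocabulary, tiny dimensions, and random weights are assumptions for illustration only, nothing like BERT's real sizes or trained parameters:

```python
import numpy as np

d_model, vocab_size, seq_len = 8, 10, 4    # toy sizes, far smaller than BERT's

# 1) Word embedding: each token id maps to a vector of numbers
embedding_table = np.random.randn(vocab_size, d_model)
token_ids = np.array([3, 1, 7, 2])          # a pretend 4-word sentence
x = embedding_table[token_ids]              # shape (seq_len, d_model)

# 2) Positional encoding: sinusoidal pattern that encodes word order
pos = np.arange(seq_len)[:, None]
i = np.arange(d_model)[None, :]
angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
pos_enc = np.where(i % 2 == 0, np.sin(angles), np.cos(angles))
x = x + pos_enc                             # order information is now baked in

# 3) Self-attention: every word looks at every other word (no mask)
W_q, W_k, W_v = (np.random.randn(d_model, d_model) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v
scores = Q @ K.T / np.sqrt(d_model)         # (seq_len, seq_len) similarity scores
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
context_aware = weights @ V                 # each row mixes info from all words

print(context_aware.shape)                  # (4, 8): one context-aware vector per word
```

Averaging or pooling these per-word vectors gives a single sentence embedding, which is what makes clustering similar sentences and documents possible.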

Decoder-only transformers (ChatGPT)


  • Word Embedding: converts the words into vectors (numbers)
  • Positional Encoding: keeps track of word order
  • Masked Self-Attention: establishes relationships among words, but each word can only attend to the prompt and to the words that came before it, never to future words, which is what allows the model to generate text one word at a time (see the sketch after this list)
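Compared with the encoder sketch above, the only structural change is the mask: positions later in the sequence are hidden from each word before the softmax. A minimal sketch, using the same toy sizes and random weights as before (purely illustrative assumptions):

```python
import numpy as np

d_model, seq_len = 8, 4
x = np.random.randn(seq_len, d_model)       # embeddings + positional encodings, as before

W_q, W_k, W_v = (np.random.randn(d_model, d_model) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v
scores = Q @ K.T / np.sqrt(d_model)

# Causal (masked) self-attention: hide all "future" words from each position
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
scores = np.where(mask, -1e9, scores)       # future positions get ~zero weight after softmax

weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
output = weights @ V                        # word i only sees words 0..i

print(np.round(weights, 2))                 # upper triangle of the attention matrix is ~0
```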

Article Source :

TL;DR : In the world of transformer models, the “encoder” and “decoder” represent two distinct architectural approaches, primarily used in sequence-to-sequence tasks like machine translation. Encoders process an input sequence and compress it into a context vector, while decoders use this context vector to generate an output sequence. The key difference lies in how they handle input and output sequences and the nature of their attention mechanisms.
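As a practical illustration of the difference, the sketch below uses an encoder-only checkpoint to embed a sentence and a decoder-only checkpoint to continue a prompt. This assumes the Hugging Face transformers and torch packages are installed; bert-base-uncased and gpt2 are just common public checkpoints standing in for BERT and ChatGPT-style models:

```python
from transformers import AutoTokenizer, AutoModel, pipeline
import torch

# Encoder-only (BERT): turn a whole sentence into context-aware embeddings
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
inputs = tok("Transformers make great sentence embeddings.", return_tensors="pt")
with torch.no_grad():
    hidden = bert(**inputs).last_hidden_state   # (1, num_tokens, 768)
sentence_vector = hidden.mean(dim=1)            # crude pooling into one sentence vector

# Decoder-only (GPT-2 here, standing in for ChatGPT-style models): generate text
generator = pipeline("text-generation", model="gpt2")
print(generator("Transformers are", max_new_tokens=20)[0]["generated_text"])
```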
