What is “Beam Search Decoding” in a Neural Machine Translation Model?
There are plenty of deep explanations elsewhere, so here I'd like to share some example questions in an interview setting.
What is the core idea of “beam search decoding” in a neural machine translation model?
Here are some example answers for readers’ reference:
The core idea of beam search decoding is that at each step of the decoder, we keep track of the k most probable partial translations (called hypotheses). Here k is the beam size, which is typically 5 to 10 in practice.
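To make the idea concrete, below is a minimal, framework-free Python sketch of beam search decoding. It assumes a hypothetical `step_log_probs` function standing in for one decoder step (real systems would call a trained seq2seq model here), and the `bos_id`/`eos_id` token ids and the toy model in the usage example are purely illustrative, not part of the lecture.

```python
import math
from typing import Callable, List, Tuple

def beam_search(
    step_log_probs: Callable[[List[int]], List[float]],  # hypothetical: one decoder step
    bos_id: int,
    eos_id: int,
    beam_size: int = 5,   # k, typically 5 to 10 in practice
    max_len: int = 50,
) -> List[int]:
    # Each hypothesis is (token ids so far, cumulative log-probability).
    beams: List[Tuple[List[int], float]] = [([bos_id], 0.0)]
    finished: List[Tuple[List[int], float]] = []

    for _ in range(max_len):
        candidates: List[Tuple[List[int], float]] = []
        for tokens, score in beams:
            log_probs = step_log_probs(tokens)
            # Expand each hypothesis with every possible next token.
            for tok, lp in enumerate(log_probs):
                candidates.append((tokens + [tok], score + lp))
        # Keep only the k most probable partial translations.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos_id:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break

    # Fall back to unfinished hypotheses if nothing reached <eos>.
    pool = finished or beams
    # Length-normalise scores so longer translations are not unfairly penalised.
    best = max(pool, key=lambda c: c[1] / len(c[0]))
    return best[0]


if __name__ == "__main__":
    # Toy "model" over a 4-token vocabulary: prefers token 2, then <eos>=3.
    def toy_step(tokens: List[int]) -> List[float]:
        if len(tokens) >= 4:
            return [math.log(p) for p in (0.05, 0.05, 0.1, 0.8)]
        return [math.log(p) for p in (0.1, 0.2, 0.6, 0.1)]

    print(beam_search(toy_step, bos_id=0, eos_id=3, beam_size=3, max_len=10))
```

One design note: the final length normalisation (dividing the cumulative log-probability by the hypothesis length) is a common practical choice, since raw log-probability sums systematically favour shorter translations.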
Watch the explanation by Abigail See from Stanford:
Happy practicing!
Thanks for reading my newsletter. You can follow me on LinkedIn!
Note: There are different angles from which to answer an interview question. The author of this newsletter does not try to find a reference that answers each question exhaustively. Rather, the author would like to share some quick insights and help readers think, practice, and do further research as necessary.
Source of video/answers: Stanford CS224N: NLP with Deep Learning | Winter 2019 | Lecture 8 — Translation, Seq2Seq, Attention by Abigail See; Natural Language Processing with Attention Models by Deeplearning.ai
Source of images: Foundations of NLP Explained Visually: Beam Search, How It Works by Ketan Doshi (Medium)