What is “Beam Search Decoding” in a Neural Machine Translation Model?

Angelina Yang
2 min read · Sep 18, 2022

There are plenty of in-depth explanations elsewhere, so here I'd like to share some example questions in an interview setting.

What is the core idea of “beam search decoding” in a neural machine translation model?

Source: Foundations of NLP Explained Visually: Beam Search, How It Works

Here are some example answers for readers’ reference:

The core idea of beam search decoding is that on each step of the decoder, we keep track of the k most probable partial translations (called hypotheses), scored by their cumulative log-probability. Here k is the beam size, typically 5 to 10 in practice.
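To make the idea concrete, here is a minimal sketch of beam search in Python. The `step_log_probs` function and the toy bigram table are hypothetical stand-ins for a real decoder's next-token distribution; a production implementation would also normalize scores by length.

```python
import math

def beam_search(step_log_probs, beam_size, max_len, bos, eos):
    """Keep the beam_size most probable partial translations
    (hypotheses) at each step, scored by summed log-probability."""
    beams = [([bos], 0.0)]   # each hypothesis: (tokens, score)
    completed = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for tokens, score in beams:
            for tok, logp in step_log_probs(tokens):
                candidates.append((tokens + [tok], score + logp))
        # Prune back to the k best; hypotheses that emitted the
        # end-of-sequence token are set aside as completed.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates:
            if tokens[-1] == eos:
                completed.append((tokens, score))
            elif len(beams) < beam_size:
                beams.append((tokens, score))
        if not beams:
            break
    completed.extend(beams)  # include any unfinished hypotheses
    return max(completed, key=lambda c: c[1])

# Hypothetical toy "model": next-token log-probs given the last token.
table = {
    "<s>": [("a", math.log(0.6)), ("b", math.log(0.4))],
    "a":   [("</s>", math.log(0.3)), ("c", math.log(0.7))],
    "b":   [("</s>", math.log(0.9)), ("c", math.log(0.1))],
    "c":   [("</s>", math.log(1.0))],
}

def step(tokens):
    return table[tokens[-1]]

best, score = beam_search(step, beam_size=2, max_len=5,
                          bos="<s>", eos="</s>")
```

With beam size 2, the search keeps both "a…" and "b…" prefixes alive and ends up with the highest-scoring full sequence, rather than committing greedily to the single best token at each step.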

Watch the explanation by Dr. Abigail See from Stanford:


Happy practicing!

Thanks for reading my newsletter. You can follow me on Linkedin!

Note: There are different angles to answer an interview question. The author of this newsletter does not try to find a reference that answers a question exhaustively. Rather, the author would like to share some quick insights and help the readers to think, practice and do further research as necessary.

Source of video/answers: Stanford CS224N: NLP with Deep Learning | Winter 2019 | Lecture 8 — Translation, Seq2Seq, Attention, by Dr. Abigail See; Natural Language Processing with Attention Models, by Deeplearning.ai

Source of images: Medium. Foundations of NLP Explained Visually: Beam Search, How It Works by Ketan Doshi
