What is Global Attention in Deep Learning?
Global and local attention are two of the early attention mechanisms that inspired today's famous Transformer models. There are plenty of in-depth explanations elsewhere, so here I'd like to share tips on what you can say in an interview setting.
What is global attention?
Context: In their 2015 paper "Effective Approaches to Attention-based Neural Machine Translation," Stanford NLP researchers Minh-Thang Luong et al. propose an attention mechanism called "global attention" for encoder-decoder machine translation models.
Here are some example answers for readers’ reference:
Global Attention is one of the simplest attention mechanisms.
The idea of a global attentional model is to consider all the hidden states of the encoder when deriving the context vector c_t. In this model type, a variable-length alignment vector a_t, whose size equals the number of time steps on the source side, is derived by comparing the current target hidden state h_t with each source hidden state h̄_s: a_t(s) = softmax(score(h_t, h̄_s)). The goal is then to derive a context vector c_t = Σ_s a_t(s) h̄_s that captures relevant source-side information to help predict the current target word y_t.
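To make this concrete, here is a minimal NumPy sketch of a single global-attention step using the simplest of the scoring functions from the paper, the dot product score(h_t, h̄_s) = h_tᵀ h̄_s. The function and variable names (global_attention, h_t, h_s) are illustrative, not from any library:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def global_attention(h_t, h_s):
    """Luong-style global attention with the dot score.

    h_t: current target (decoder) hidden state, shape (d,)
    h_s: all source (encoder) hidden states, shape (S, d)
    Returns the context vector c_t (d,) and alignment vector a_t (S,).
    """
    scores = h_s @ h_t     # dot score against every source state, shape (S,)
    a_t = softmax(scores)  # alignment vector; length = number of source steps
    c_t = a_t @ h_s        # context vector: weighted average of source states
    return c_t, a_t

# Toy usage: 5 source time steps, hidden size 4.
rng = np.random.default_rng(0)
h_s = rng.normal(size=(5, 4))
h_t = rng.normal(size=(4,))
c_t, a_t = global_attention(h_t, h_s)
print(a_t.round(3), a_t.sum())  # alignment weights sum to 1
```

The dot product is only one of the scoring functions Luong et al. consider (they also propose "general" and "concat" variants), and in the full model this computation is repeated at every decoder time step.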