How Would You Describe a Transformer Model during an Interview?
I was once asked this question in an interview myself. I remember thinking: oh… where should I begin?
Transformer models are now used everywhere in NLP, computer vision, and all kinds of downstream applications. It’s not surprising that an interviewer for an ML engineer or ML data scientist role would ask a question like this. There are plenty of in-depth explanations elsewhere, so here I’d like to share tips on what you can say in an interview setting.
How would you describe a transformer model?
Here are some example answers for readers’ reference:
Super high-level answer:
A transformer is a deep learning model that adopts the mechanism of self-attention, differentially weighting the significance of each part of the input data. It is used primarily in the fields of natural language processing (NLP) and computer vision (CV).
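If the interviewer asks what "differentially weighting" means in practice, a tiny example can help. Below is a minimal sketch of single-head self-attention in plain Python. To keep it short I'm assuming the queries, keys, and values are simply the input vectors themselves; a real transformer would first pass the inputs through learned projection matrices. The function names and the toy `tokens` input are my own, for illustration only.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Toy single-head self-attention.

    Assumption: queries, keys and values are the raw input vectors
    (no learned projections), which is a simplification.
    """
    d = len(tokens[0])
    outputs, all_weights = [], []
    for q in tokens:
        # Scaled dot-product score of this token against every token.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in tokens
        ]
        # Softmax turns the scores into weights that sum to 1.
        weights = softmax(scores)
        # The output is the weighted average of all token vectors.
        out = [sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Three hypothetical 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Each row of `weights` says how much one token attends to every token in the input, and each output vector is a blend of all the inputs according to those weights. That is the "weighting the significance of each part of the input" from the definition above.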
The question is not easy to answer succinctly in a few sentences. You can always ask the interviewers how much detail they would like you to go into. For a more detailed review:
Watch the explanation by Professor Christopher Potts from Stanford (Tip: watch a bit longer for a more complete description!):