Feature-based Transfer Learning vs. Fine-tuning?
There are plenty of in-depth explanations elsewhere, so here I'd like to share some example questions as they might come up in an interview setting.
What's the difference between feature-based transfer learning and fine-tuning?
Here are some example answers for readers’ reference:
Two methods that you can use for transfer learning are the following:
In feature-based transfer learning, you train representations (e.g., word embeddings) on one task with one model, then reuse those learned features (the word vectors) as fixed inputs to a separate model for a different task. The pre-trained representations themselves are not updated.
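The feature-based idea can be sketched in a few lines. This is a toy illustration, not any particular library's API: the hypothetical `PRETRAINED_VECTORS` table stands in for embeddings trained elsewhere (e.g., word2vec or GloVe output), and a downstream classifier would consume the averaged features without ever touching the embeddings.

```python
# Hypothetical pre-trained 3-d word vectors, produced by some other model.
PRETRAINED_VECTORS = {
    "great":  [0.9, 0.1, 0.3],
    "movie":  [0.2, 0.8, 0.5],
    "boring": [-0.7, 0.0, 0.1],
}

def featurize(sentence):
    """Average the pre-trained vectors of known words into one feature vector."""
    vecs = [PRETRAINED_VECTORS[w] for w in sentence.split() if w in PRETRAINED_VECTORS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

features = featurize("great movie")
# A downstream model (logistic regression, etc.) trains on `features`;
# the embedding table itself stays frozen throughout.
```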
When fine-tuning, you take the same pre-trained model and continue training it on a different task. Sometimes you keep the pre-trained weights fixed and train only a new layer added on top; other times you slowly unfreeze the layers one at a time. Note that the pre-training stage itself can use unlabelled data, for example by masking words and training the model to predict which word was masked.
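The "freeze the base, train a new head, then unfreeze" pattern can be sketched without any framework. This is a minimal toy model with hand-written gradients, purely illustrative; in practice you would toggle `requires_grad` in PyTorch or a layer's `trainable` flag in Keras, but the mechanics are the same: frozen parameters receive no updates.

```python
class TinyModel:
    """One 'pre-trained' base weight plus a newly added head weight."""

    def __init__(self):
        self.w_base = 2.0        # pretend this came from pre-training
        self.w_head = 1.0        # new task-specific layer, trained from scratch
        self.base_frozen = True  # start with the base frozen

    def forward(self, x):
        return self.w_head * (self.w_base * x)

    def train_step(self, x, y, lr=0.01):
        """One SGD step on the squared-error loss 0.5 * (pred - y)**2."""
        err = self.forward(x) - y
        grad_head = err * self.w_base * x
        grad_base = err * self.w_head * x
        self.w_head -= lr * grad_head
        if not self.base_frozen:       # frozen weights get no update
            self.w_base -= lr * grad_base

model = TinyModel()
model.train_step(x=1.0, y=3.0)   # only the head moves; w_base stays at 2.0
model.base_frozen = False        # "unfreeze" for further tuning
model.train_step(x=1.0, y=3.0)   # now both weights update
```

Gradual unfreezing is just this flag flipped layer by layer over the course of training, typically starting from the layers closest to the output.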