Griffin: New LLM Architecture Conquers Long Contexts

Angelina Yang
5 min read · May 10, 2024

Recurrent neural networks (RNNs) have fast inference and scale efficiently on long sequences, but they are difficult to train and hard to scale up to large model sizes.

When it comes to efficient and powerful language models, modeling long contexts and sequences effectively while cutting cost remains a significant challenge. Google DeepMind’s Hawk and Griffin models take strides in this direction, showing a remarkable ability to leverage extended context windows (as sketched below) and presenting a compelling alternative to traditional Transformer-based approaches.
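To make the contrast with attention concrete, here is a minimal, hypothetical sketch (plain NumPy, not DeepMind’s code) of the gated linear recurrence idea that recurrent models like Hawk and Griffin build on: the model carries a fixed-size hidden state forward step by step instead of attending over the entire context. The function name, gate shapes, and toy values below are illustrative assumptions, not the actual architecture.

```python
import numpy as np

def linear_recurrence(x, a, b):
    """Sketch of a gated linear recurrence over a sequence.

    h_t = a_t * h_{t-1} + b_t * x_t

    x: (seq_len, dim) input sequence
    a: (seq_len, dim) per-step decay gates in (0, 1)
    b: (seq_len, dim) per-step input gates
    Returns the hidden states, shape (seq_len, dim).
    """
    h = np.zeros(x.shape[1])
    out = []
    for x_t, a_t, b_t in zip(x, a, b):
        # Constant cost per step: the state size is fixed regardless of
        # context length, which is why recurrent blocks stay cheap at
        # inference on long sequences (unlike attention over a growing cache).
        h = a_t * h + b_t * x_t
        out.append(h)
    return np.stack(out)

# Toy usage: 6-step sequence, 4-dimensional state (illustrative values only).
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))
a = 1.0 / (1.0 + np.exp(-rng.normal(size=(6, 4))))  # sigmoid -> gates in (0, 1)
b = 1.0 - a
print(linear_recurrence(x, a, b).shape)  # (6, 4)
```

The key design point this illustrates is that per-token compute and memory stay constant as the context grows, whereas a Transformer’s attention cost grows with the length of the cached context.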
