Griffin: New LLM Architecture Conquers Long Contexts

Angelina Yang
5 min read · May 10, 2024

Recurrent neural networks (RNNs) have fast inference and scale efficiently on long sequences, but they are difficult to train and hard to scale up to large model sizes.

When it comes to efficient and powerful language models, modeling long contexts effectively while keeping costs down remains a significant challenge. Google DeepMind’s Hawk and Griffin models are taking strides in this direction, showing a remarkable ability to leverage extended context windows and presenting a compelling alternative to traditional Transformer-based approaches.

Introducing Griffin and Hawk

DeepMind proposes Hawk, an RNN built around a new recurrent layer called the Real-Gated Linear Recurrent Unit (RG-LRU), designed to improve performance on downstream tasks. Hawk surpasses the reported performance of Mamba, a recent state-space model, on these tasks.

Meanwhile, Griffin is a hybrid model that combines gated linear recurrences with local attention to improve performance on downstream tasks. It achieves performance comparable to Llama-2 despite being trained on significantly fewer tokens. Griffin addresses the challenges of training and scaling recurrent neural networks (RNNs) and demonstrates competitive results with reduced computational requirements.
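
To make the local-attention half of that mix concrete, here is a minimal NumPy sketch of causal attention restricted to a sliding window, so each token attends only to a fixed number of recent tokens. The window size, shapes, and function name are illustrative choices for this post, not the configuration used in the paper.

```python
# A minimal NumPy sketch of "local attention": causal self-attention
# restricted to a fixed sliding window. Illustrative only.
import numpy as np

def local_causal_attention(q, k, v, window=4):
    """q, k, v: arrays of shape (seq_len, d). Each position attends only to
    itself and the previous `window - 1` positions."""
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)                      # (seq_len, seq_len)

    # Banded causal mask: position t sees positions t-window+1 .. t.
    idx = np.arange(seq_len)
    causal = idx[None, :] <= idx[:, None]
    in_window = idx[:, None] - idx[None, :] < window
    mask = causal & in_window

    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                  # (seq_len, d)

# Toy usage: 8 tokens, 16 dims, each token sees at most 4 tokens.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
out = local_causal_attention(x, x, x, window=4)
print(out.shape)  # (8, 16)
```

Because the window is fixed, compute and memory grow linearly with sequence length rather than quadratically, which is what makes it cheap enough to interleave these attention blocks with recurrent ones.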

Gated Linear Recurrences

Gated linear recurrences are a variation of RNNs that incorporate gating mechanisms to control the flow of information through…
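
The snippet below is a minimal NumPy sketch of this idea, loosely inspired by the RG-LRU layer described in the paper: the state update stays linear in the hidden state, while input-dependent gates decide how strongly the previous state decays. The gate parameterisation, the exponential decay, and all names here are my own illustration, not the authors’ exact formulation.

```python
# A minimal sketch of a gated linear recurrence, loosely in the spirit of
# the RG-LRU. Parameterisation and constants are illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_linear_recurrence(x, W_a, W_i, c=8.0):
    """x: (seq_len, d) inputs. Returns hidden states h of the same shape.
    The recurrence is linear in h, but its decay is gated by the input."""
    seq_len, d = x.shape
    h = np.zeros(d)
    outputs = []
    for t in range(seq_len):
        r_t = sigmoid(x[t] @ W_a)          # recurrence gate in (0, 1)
        i_t = sigmoid(x[t] @ W_i)          # input gate in (0, 1)
        a_t = np.exp(-c * r_t)             # per-channel decay in (0, 1)
        # Old state decays by a_t; the sqrt(1 - a_t**2) factor keeps the
        # state's scale roughly stable as the decay varies.
        h = a_t * h + np.sqrt(1.0 - a_t**2) * (i_t * x[t])
        outputs.append(h)
    return np.stack(outputs)

# Toy usage: 8 steps of a 16-dimensional recurrence with random gate weights.
rng = np.random.default_rng(0)
d = 16
x = rng.normal(size=(8, d))
h = gated_linear_recurrence(x, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
print(h.shape)  # (8, 16)
```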
