-
Raven (Part-2)
Architecture and Results
-
Raven (Part-1)
Memory as a set of Slots
-
On the Legacy of Linear Transformers in Positional Embeddings 📍
Duality of Forget Gates and Position Embeddings in Sequence Modeling
-
LION 🦁 Part IV - Results
Comprehensive results on Vision, MLM and more LION variants
-
LION 🦁 Part III - Chunkwise Parallel Form of LION
Explaining LION-Chunk for Balancing Memory-Speed Tradeoffs During Inference