| Linear Model | No Conv. | SWDE acc\(\uparrow\) | FDA acc\(\uparrow\) | SQuAD acc\(\uparrow\) | NIAH-1 1K | NIAH-1 2K | NIAH-1 4K | NIAH-1 8K | NIAH-1 16K | NIAH-1 32K | NIAH-2 1K | NIAH-2 2K | NIAH-2 4K | NIAH-2 8K | NIAH-2 16K | NIAH-2 32K | NIAH-3 1K | NIAH-3 2K | NIAH-3 4K | NIAH-3 8K | NIAH-3 16K | NIAH-3 32K |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GDN | ✘ | 54.6 | 67.2 | 34.5 | 100 | 100 | 100 | 100 | 93.2 | 70.5 | 100 | 100 | 100 | 8.0 | 0.0 | 0.0 | 93.2 | 70.2 | 50.0 | 0.0 | 0.0 | 0.0 |
| Mamba-2 | ✘ | 56.3 | 68.8 | 36.0 | 100 | 100 | 16.4 | 0.0 | 0.0 | 0.0 | 100 | 100 | 85.8 | 0.0 | 0.0 | 0.0 | 76.9 | 80.6 | 60.8 | 0.0 | 0.0 | 0.0 |
| SWA-RoPE | ✔ | 51.0 | 68.1 | 34.1 | 100 | 100 | 100 | 100 | 98.2 | 60.4 | 100 | 100 | 100 | 98.2 | 3.1 | 0.0 | 93.4 | 78.2 | 12.8 | 60.0 | 4.4 | 0.0 |
| Raven | ✔ | 51.4 | 64.2 | 31.4 | 100 | 100 | 100 | 100 | 98.4 | 78.6 | 100 | 100 | 100 | 100 | 95.4 | 65.4 | 90.0 | 67.0 | 73.8 | 60.0 | 10.2 | 14.4 |
| GDN | ✘ | 64.7 | 77.8 | 48.1 | 100 | 100 | 100 | 63.4 | 0.2 | 0.0 | 100 | 100 | 99.0 | 2.2 | 0.0 | 0.0 | 96.8 | 93.8 | 76.2 | 0.0 | 0.0 | 0.0 |
| Mamba-2 | ✘ | 68.5 | 72.9 | 42.3 | 100 | 100 | 0.0 | 0.0 | 0.0 | 0.0 | 100 | 100 | 5.6 | 0.0 | 0.0 | 0.0 | 88.8 | 87.4 | 5.8 | 0.0 | 0.0 | 0.0 |
| SWA-RoPE | ✔ | 63.4 | 68.5 | 17.2 | 100 | 100 | 100 | 100 | 97.6 | 69.0 | 100 | 100 | 98.4 | 3.0 | 0.0 | 0.0 | 67.8 | 36.0 | 17.6 | 6.2 | 0.0 | 0.0 |
| Raven | ✔ | 64.5 | 81.3 | 37.0 | 100 | 100 | 100 | 99.8 | 99.4 | 55.4 | 100 | 100 | 100 | 99.8 | 84.6 | 80.8 | 97.2 | 93.6 | 85.4 | 59.2 | 36.8 | 0.0 |