barpitf/RAT
Dataset: HuggingFaceFW/fineweb-edu
Language: English
Tags: RAT, efficient architecture, recurrence, attention, pretraining
arXiv: 2507.04416
License: mit
Branch: main
Total size: 76.9 GB
Contributors: 2
History: 5 commits
Latest commit: "Update readme" (0418435, verified) by barpitf, 9 days ago
File                              Size       Storage  Last commit     Updated
.gitattributes                    1.52 kB             initial commit  10 days ago
README.md                         618 Bytes           Update readme   9 days ago
attention.pth                     15.3 GB    xet      [ckpt]          10 days ago
attention_localattention_l2.pth   15.3 GB    xet      [ckpt]          10 days ago
ratl16.pth                        15.6 GB    xet      [ckpt]          10 days ago
ratl16_localattention_l2.pth      15.4 GB    xet      [ckpt]          10 days ago
rnn.pth                           15.3 GB    xet      [ckpt]          10 days ago