Update README.md
#18 opened about 6 hours ago by suhara
Recommended way of fine-tuning?
2
#17 opened about 6 hours ago by devon-kindo
Unexpected... "Performance"?
4
#15 opened about 12 hours ago by ponzles
Doesn't do KV caching on transformers
1
#14 opened about 16 hours ago by adaface-neurips
Does not work with DGX Spark
1
#13 opened about 22 hours ago by sotaaa
Actual context length
2
#12 opened 1 day ago by yuchsiao
I really hope this model works
1
#8 opened 2 days ago by BVEsun
Simple minesweeper game is failing.
1
#7 opened 3 days ago by robert1968
Good model but it is very flawed in recalling input
6
#5 opened 3 days ago by cmp-nct
Problem working with long text
5
#4 opened 3 days ago by Kosh69
Tool calling with reasoning parsing broken
7
#3 opened 3 days ago by nephepritou