Use ybelkada/Mixtral-8x7B-Instruct-v0.1-AWQ with VLLM instead (1 comment)
#10 opened 8 months ago by blobpenguin

Inference taking too much time (3 comments)
#9 opened 9 months ago by tariksetia

Update README.md
#8 opened 9 months ago by skoita

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0 (2 comments)
#7 opened 10 months ago by aaganaie

TGI - response is an empty string (2 comments)
#6 opened 10 months ago by p-christ

OC is not a multiple of cta_N = 64 (2 comments)
#5 opened 10 months ago by lazyDataScientist

Not supporting with TGI (1 comment)
#4 opened 11 months ago by abhishek3jangid

always getting 0 in output (15 comments)
#3 opened 11 months ago by xubuild

OOM under vLLM even with 80GB GPU (5 comments)
#2 opened 11 months ago by mike-ravkine

Not supported for TGI > 1.3 ? (20 comments)
#1 opened 11 months ago by paulcx