I'm getting 0.4 tokens/s on a 4090
#16, opened by androtester
Is this expected? Simple messages take 350-400s for a reply on a 4090.
I get 5-6 t/s on a 3090, so that's abnormal. I'm going to need more info: your specs, what code you're running, all that.
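To put the two figures side by side, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted above (0.4 tokens/s, 350-400 s per reply, 5-6 t/s on a 3090); the function names are made up for illustration, and it assumes the rate is constant over the whole reply, which may not hold in practice.

```python
def implied_tokens(rate_tps: float, seconds: float) -> float:
    """Tokens generated at a constant rate over the given wall-clock time."""
    return rate_tps * seconds


def expected_seconds(n_tokens: float, rate_tps: float) -> float:
    """Wall-clock time needed to generate n_tokens at a constant rate."""
    if rate_tps <= 0:
        raise ValueError("rate_tps must be positive")
    return n_tokens / rate_tps


# Reply length implied by the 4090 report: 0.4 tokens/s for 350-400 s.
short_reply = implied_tokens(0.4, 350)
long_reply = implied_tokens(0.4, 400)

# Time the same replies would take at the 3090 rates quoted above (5-6 t/s).
fastest = expected_seconds(short_reply, 6.0)
slowest = expected_seconds(long_reply, 5.0)

print(f"implied reply length: {short_reply:.0f}-{long_reply:.0f} tokens")
print(f"same reply at 5-6 t/s: {fastest:.0f}-{slowest:.0f} s")
```

So the "simple messages" work out to roughly 140-160 tokens, which at 5-6 t/s would take around half a minute rather than six minutes or more.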
I'm using Oobabooga with a 4090, a 5800X3D, 32 GB RAM, and a 2 TB NVMe drive.
All I did was use the Oobabooga Windows installer, which supposedly handled the dependencies for me.