Time taken is too long
#3, opened by anujchopra
This model has the same architecture and number of parameters as OpenChat 3.5 0106, but it takes much longer to run (and performs more computation). Can anyone help me understand why?
When I compare the time taken to generate n tokens with this model and with OpenChat, the difference is a factor of 10: OpenChat is 10 times faster.
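For anyone who wants to reproduce the comparison, here is a minimal sketch of how the timing could be measured. It is not the exact script behind the numbers above. `generate_fn` is a hypothetical stand-in for whatever call produces n new tokens with a given model (for example a thin wrapper around the model's generate call), and both helper names are made up for illustration.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(
    generate_fn: Callable[[int], Sequence[int]],  # hypothetical: returns the new tokens
    n_tokens: int,
    repeats: int = 3,
) -> float:
    """Return the best observed throughput (new tokens per second) over `repeats` runs."""
    best = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        new_tokens = generate_fn(n_tokens)
        elapsed = time.perf_counter() - start
        if elapsed > 0:
            # Count the tokens actually returned, in case generation stopped early.
            best = max(best, len(new_tokens) / elapsed)
    return best


def slowdown(fast_tps: float, slow_tps: float) -> float:
    """Return how many times faster the first model is than the second."""
    if slow_tps <= 0:
        raise ValueError("slow_tps must be positive")
    return fast_tps / slow_tps
```

Measuring both models with the same prompt, the same n, the same dtype and device, and the same generation settings should make it easier to tell whether the gap comes from the model itself or from how it is loaded.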