Will the inference source code be available?

#43
by jackkwok - opened

I checked the files under the "Files" tab. The code is simple and just calls a secret API endpoint. I assume this 60B model is split across multiple GPUs in a cluster, and I am curious how that is done behind the scenes. Is that code going to be made available? If not, are there any blog posts on how it is done?

I'm interested in this too, as this Space sometimes produces honestly disturbing answers that I haven't been able to reproduce elsewhere. I'm curious what exactly is going on here.

@ysharma could you please share details about the inference API deployment?

I would be interested to know the answer too.
