How to run on Colab's CPU?
Can someone suggest, or show me through a piece of code, how to run this model (i.e. MPT-30B-CHAT) on Colab's CPU?
Colab has only 12.7 GB of RAM, and the MPT-30B-CHAT files are almost 60 GB, so it's not possible.
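You can confirm the runtime's limits yourself from a notebook cell; here is a quick check using psutil, which comes preinstalled on Colab:

import psutil
# Total RAM of the Colab runtime, in GiB (about 12.7 on the free tier)
print(psutil.virtual_memory().total / 1024**3)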
@beoswindvip Can you suggest which other models I can use?
You can run 7B models (4-bit or 8-bit quantization) on the Colab Free Plan GPU,
such as https://huggingface.co/TheBloke/vicuna-7B-v1.3-GPTQ .
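Not an official recipe, but here is a minimal sketch of loading that GPTQ-quantized 7B model on the free Colab GPU with auto-gptq; the Vicuna prompt format and generation settings below are assumptions, so check the model card for the exact usage:

# pip install auto-gptq transformers
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "TheBloke/vicuna-7B-v1.3-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
# Load the 4-bit quantized weights directly onto the Colab GPU
model = AutoGPTQForCausalLM.from_quantized(model_id, device="cuda:0", use_safetensors=True)

# Vicuna-style prompt (assumed format; see the model card)
prompt = "USER: Write a job description for a Data Scientist.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))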
@swulling Can this or other 7B models also run easily on CPU?
@swulling Firstly, thank you so much, and one last question:
from text_generation import InferenceAPIClient

client = InferenceAPIClient("OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5")
complete_answer = ""
# Stream tokens as the model generates them
for response in client.generate_stream("<|prompter|>Write Job Description for Data Scientist<|endoftext|><|assistant|>"):
    # Skip special tokens such as <|endoftext|>
    if not response.token.special:
        print(response.token.text, end="", flush=True)
        complete_answer += response.token.text
print(complete_answer)
Apart from the OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 model used in the code above, which other models can I use?
I suggest choosing a Chat model with a higher ranking to achieve better results.
Ref: https://chat.lmsys.org/ Leaderboard
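Whatever model you pick has to be hosted on the Hugging Face Inference API with streaming support, and the prompt template differs per model. A rough sketch of swapping the model id in the same client (the id below is just the original one as a placeholder):

from text_generation import InferenceAPIClient

# Replace with the model id you picked from the leaderboard; the
# <|prompter|>/<|assistant|> tags are OpenAssistant-specific, so adjust
# the prompt template to whatever the new model expects.
model_id = "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5"
client = InferenceAPIClient(model_id)
response = client.generate("<|prompter|>Write Job Description for Data Scientist<|endoftext|><|assistant|>")
print(response.generated_text)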