runtime error
40 --top_p=1.0 --share=False

python generate.py --base_model='t5-large' --prompt_type='simple_instruct'
python generate.py --base_model='philschmid/bart-large-cnn-samsum'
python generate.py --base_model='philschmid/flan-t5-base-samsum'
python generate.py --base_model='facebook/mbart-large-50-many-to-many-mmt'
python generate.py --base_model='togethercomputer/GPT-NeoXT-Chat-Base-20B' --prompt_type='human_bot' --lora_weights='GPT-NeoXT-Chat-Base-20B.merged.json.8_epochs.57b2892c53df5b8cefac45f84d019cace803ef26.28'

# Must have 4*48GB GPUs and run without 8-bit in order for sharding to work with infer_devices=False.
# Can also pass --prompt_type='human_bot'; the model can somewhat handle instructions without being instruct-tuned.
python generate.py --base_model=decapoda-research/llama-65b-hf --load_8bit=False --infer_devices=False --prompt_type='human_bot'

python generate.py --base_model=h2oai/h2ogpt-oig-oasst1-512-6.9b

No model defined yet
Get OpenAssistant/reward-model-deberta-v3-large-v2 model
Traceback (most recent call last):
  File "app.py", line 1986, in <module>
    fire.Fire(main)
  File "/home/user/.local/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/user/.local/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/user/.local/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "app.py", line 294, in main
    go_gradio(**locals())
  File "app.py", line 545, in go_gradio
    smodel, stokenizer, sdevice = get_score_model(**all_kwargs)
  File "app.py", line 527, in get_score_model
    smodel, stokenizer, sdevice = get_model(**score_all_kwargs)
  File "app.py", line 407, in get_model
    device = get_device()
  File "app.py", line 301, in get_device
    raise RuntimeError("only cuda supported")
RuntimeError: only cuda supported
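The traceback shows app.py's get_device() raising RuntimeError("only cuda supported") when no CUDA device is present (as on a CPU-only Space). A minimal sketch of a more forgiving device check is below; this is a hypothetical replacement, not the project's actual code, and it assumes PyTorch's standard torch.cuda.is_available() / torch.backends.mps.is_available() APIs while degrading to "cpu" if torch itself is missing.

```python
def get_device():
    """Hypothetical fallback device probe: prefer CUDA, then Apple MPS,
    otherwise CPU, instead of raising when CUDA is unavailable."""
    try:
        import torch  # may be absent in a minimal environment
        if torch.cuda.is_available():
            return "cuda"
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            return "mps"
    except ImportError:
        pass
    return "cpu"

print(get_device())
```

With a fallback like this, the scoring model would load on CPU (slowly) rather than crashing the whole app at startup.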