extreme slowdown and weird output.
#4 opened by abhimortal6
Tried it in the oobabooga web UI; it's not usable in my case.
3060ti 8GB VRAM, 24GB RAM
The output is weird, and it never returns the code.
Output generated in 27.06 seconds (0.30 tokens/s, 8 tokens, context 63, seed 1191894163)
Output generated in 66.12 seconds (0.47 tokens/s, 31 tokens, context 80, seed 1706855517)
Output generated in 386.01 seconds (0.04 tokens/s, 16 tokens, context 131, seed 1791131008)
Output generated in 50.16 seconds (0.48 tokens/s, 24 tokens, context 118, seed 1161001351)
Output generated in 23.89 seconds (0.04 tokens/s, 1 tokens, context 150, seed 1752912455)
Output generated in 202.32 seconds (0.05 tokens/s, 10 tokens, context 169, seed 1726966570)
This is a quantized 15B model. Also, how did you get it to run?
Sure, the title says so; quality is only marginally decreased, though. To run it in the webui, use the configs from:
https://huggingface.co/ShipItMind/starcoder-gptq-4bit-128g
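(For anyone following along: in text-generation-webui builds of that era, GPTQ checkpoints were usually loaded with explicit bit-width and group-size flags. A minimal sketch, assuming the checkpoint has been placed under models/starcoder-gptq-4bit-128g; the folder name and exact flags are assumptions, not a confirmed recipe from the linked repo.)

# Hypothetical launch line; --wbits/--groupsize mirror the 4-bit, group-size-128
# quantization of the linked checkpoint.
python server.py --model starcoder-gptq-4bit-128g --wbits 4 --groupsize 128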
Hey, I just pushed some new fixes.
Can you give those a try?
Sorry, but where are the updated files? The current repo shows it was last updated a month ago.
The fixes are in the repo: https://github.com/mayank31398/GPTQ-for-SantaCoder
The weights are the same.
OOM
3060ti 8GB VRAM, 24GB RAM
python -m santacoder_inference bigcode/starcoder --wbits 4 --groupsize 128 --load starcoder-GPTQ-4bit-128g/model.pt
Traceback (most recent call last):
File "/home/abhi/miniconda3/envs/gptq/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/abhi/miniconda3/envs/gptq/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/abhi/Documents/starcoder/GPTQ-for-SantaCoder/santacoder_inference.py", line 96, in <module>
main()
File "/home/abhi/Documents/starcoder/GPTQ-for-SantaCoder/santacoder_inference.py", line 86, in main
model = get_santacoder(args.model, args.load, args.wbits, args.groupsize)
File "/home/abhi/Documents/starcoder/GPTQ-for-SantaCoder/santacoder_inference.py", line 49, in get_santacoder
model = model.cuda()
File "/home/abhi/miniconda3/envs/gptq/lib/python3.9/site-packages/torch/nn/modules/module.py", line 905, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/abhi/miniconda3/envs/gptq/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/abhi/miniconda3/envs/gptq/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/abhi/miniconda3/envs/gptq/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
[Previous line repeated 2 more times]
File "/home/abhi/miniconda3/envs/gptq/lib/python3.9/site-packages/torch/nn/modules/module.py", line 844, in _apply
self._buffers[key] = fn(buf)
File "/home/abhi/miniconda3/envs/gptq/lib/python3.9/site-packages/torch/nn/modules/module.py", line 905, in <lambda>
return self._apply(lambda t: t.cuda(device))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 72.00 MiB (GPU 0; 7.78 GiB total capacity; 6.68 GiB already allocated; 75.31 MiB free; 6.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
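(Back-of-the-envelope arithmetic, not from the thread: StarCoder has roughly 15.5B parameters, so the 4-bit weights alone come to about 15.5e9 × 0.5 bytes ≈ 7.2 GiB, before quantization scales, activations, and CUDA overhead, which already nearly fills the 7.78 GiB the card reports. The max_split_size_mb hint in the error only mitigates fragmentation; a sketch of trying it anyway, reusing the command above:)

# Rough size of the 4-bit weights alone (ignores scales/zeros and activations).
python -c "print(15.5e9 * 0.5 / 2**30, 'GiB')"   # ~7.2 GiB on a 7.78 GiB card
# Allocator hint from the error message; it reduces fragmentation but cannot
# free up the memory the weights themselves need.
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 python -m santacoder_inference bigcode/starcoder --wbits 4 --groupsize 128 --load starcoder-GPTQ-4bit-128g/model.pt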
Yeah, it's not supposed to work with a 3060 Ti.
Alright, closing.
abhimortal6 changed discussion status to closed