inference error on v100

#2
by dlutsniper - opened

I tried it on Baidu AI Studio, but an error occurred.

pip install 'lmdeploy>=0.0.9'

Convert the model's layout and store it in the default path, ./workspace.

python3 -m lmdeploy.serve.turbomind.deploy \
    --model-name internlm-chat-20b \
    --model-path /home/aistudio/data/data240556 \
    --model-format awq \
    --group-size 128

inference

python3 -m lmdeploy.turbomind.chat ./workspace

error:
WARNING: Can not find tokenizer.json. It may take long time to initialize the tokenizer.

[WARNING] gemm_config.in is not found; using default GEMM algo
Aborted (core dumped)

tree -L 5 workspace/

workspace/
├── model_repository
│   ├── postprocessing -> ../triton_models/postprocessing
│   ├── preprocessing -> ../triton_models/preprocessing
│   └── turbomind -> ../triton_models/interactive
├── service_docker_up.sh
└── triton_models
    ├── interactive
    │   ├── 1
    │   │   ├── placeholder
    │   │   └── weights -> ../../weights
    │   └── config.pbtxt
    ├── postprocessing
    │   ├── 1
    │   │   ├── __pycache__
    │   │   │   └── model.cpython-310.pyc
    │   │   ├── model.py
    │   │   └── tokenizer -> ../../tokenizer
    │   └── config.pbtxt
    ├── preprocessing
    │   ├── 1
    │   │   ├── __pycache__
    │   │   │   └── model.cpython-310.pyc
    │   │   ├── model.py
    │   │   └── tokenizer -> ../../tokenizer
    │   └── config.pbtxt
    ├── tokenizer
    │   ├── config.json
    │   ├── configuration_internlm.py
    │   ├── generation_config.json
    │   ├── modeling_internlm.py
    │   ├── placeholder
    │   ├── pytorch_model.bin.index.json
    │   ├── special_tokens_map.json
    │   ├── tokenization_internlm.py
    │   ├── tokenizer.model
    │   ├── tokenizer.py
    │   └── tokenizer_config.json
    └── weights
        ├── config.ini
        ├── layers.0.attention.w_qkv.0.qweight
        ├── layers.0.attention.w_qkv.0.scales_zeros
        ├── layers.0.attention.wo.0.qweight
        ├── layers.0.attention.wo.0.scales_zeros
        ├── layers.0.attention_norm.weight
        ├── layers.0.feed_forward.w13.0.qweight
        ├── layers.0.feed_forward.w13.0.scales_zeros
        ├── layers.0.feed_forward.w2.0.qweight
        ├── layers.0.feed_forward.w2.0.scales_zeros
        ├── layers.0.ffn_norm.weight
        ├── layers.1.attention.w_qkv.0.qweight
        ├── layers.1.attention.w_qkv.0.scales_zeros
        ├── layers.1.attention.wo.0.qweight
        ├── layers.1.attention.wo.0.scales_zeros
        ├── layers.1.attention_norm.weight
        ├── layers.1.feed_forward.w13.0.qweight
        ├── layers.1.feed_forward.w13.0.scales_zeros
        ├── layers.1.feed_forward.w2.0.qweight
        ├── layers.1.feed_forward.w2.0.scales_zeros
        ├── layers.1.ffn_norm.weight
        ├── layers.10.attention.w_qkv.0.qweight
        ├── layers.10.attention.w_qkv.0.scales_zeros
        ├── layers.10.attention.wo.0.qweight
        ├── layers.10.attention.wo.0.scales_zeros
        ├── layers.10.attention_norm.weight
        ├── layers.10.feed_forward.w13.0.qweight
        ├── layers.10.feed_forward.w13.0.scales_zeros
        ├── layers.10.feed_forward.w2.0.qweight
        ├── layers.10.feed_forward.w2.0.scales_zeros
        ├── layers.10.ffn_norm.weight
        ├── layers.11.attention.w_qkv.0.qweight
        ├── layers.11.attention.w_qkv.0.scales_zeros
        ├── layers.11.attention.wo.0.qweight
        ├── layers.11.attention.wo.0.scales_zeros
        ├── layers.11.attention_norm.weight
        ├── layers.11.feed_forward.w13.0.qweight
        ├── layers.11.feed_forward.w13.0.scales_zeros
        ├── layers.11.feed_forward.w2.0.qweight
        ├── layers.11.feed_forward.w2.0.scales_zeros
        ├── layers.11.ffn_norm.weight
        ├── layers.12.attention.w_qkv.0.qweight
        ├── layers.12.attention.w_qkv.0.scales_zeros
        ├── layers.12.attention.wo.0.qweight
        ├── layers.12.attention.wo.0.scales_zeros
        ├── layers.12.attention_norm.weight
        ├── layers.12.feed_forward.w13.0.qweight
        ├── layers.12.feed_forward.w13.0.scales_zeros
        ├── layers.12.feed_forward.w2.0.qweight
        ├── layers.12.feed_forward.w2.0.scales_zeros
        ├── layers.12.ffn_norm.weight
        ├── layers.13.attention.w_qkv.0.qweight
        ├── layers.13.attention.w_qkv.0.scales_zeros
        ├── layers.13.attention.wo.0.qweight
        ├── layers.13.attention.wo.0.scales_zeros
        ├── layers.13.attention_norm.weight
        ├── layers.13.feed_forward.w13.0.qweight
        ├── layers.13.feed_forward.w13.0.scales_zeros
        ├── layers.13.feed_forward.w2.0.qweight
        ├── layers.13.feed_forward.w2.0.scales_zeros
        ├── layers.13.ffn_norm.weight
        ├── layers.14.attention.w_qkv.0.qweight
        ├── layers.14.attention.w_qkv.0.scales_zeros
        ├── layers.14.attention.wo.0.qweight
        ├── layers.14.attention.wo.0.scales_zeros
        ├── layers.14.attention_norm.weight
        ├── layers.14.feed_forward.w13.0.qweight
        ├── layers.14.feed_forward.w13.0.scales_zeros
        ├── layers.14.feed_forward.w2.0.qweight
        ├── layers.14.feed_forward.w2.0.scales_zeros
        ├── layers.14.ffn_norm.weight
        ├── layers.15.attention.w_qkv.0.qweight
        ├── layers.15.attention.w_qkv.0.scales_zeros
        ├── layers.15.attention.wo.0.qweight
        ├── layers.15.attention.wo.0.scales_zeros
        ├── layers.15.attention_norm.weight
        ├── layers.15.feed_forward.w13.0.qweight
        ├── layers.15.feed_forward.w13.0.scales_zeros
        ├── layers.15.feed_forward.w2.0.qweight
        ├── layers.15.feed_forward.w2.0.scales_zeros
        ├── layers.15.ffn_norm.weight
        ├── layers.16.attention.w_qkv.0.qweight
        ├── layers.16.attention.w_qkv.0.scales_zeros
        ├── layers.16.attention.wo.0.qweight
        ├── layers.16.attention.wo.0.scales_zeros
        ├── layers.16.attention_norm.weight
        ├── layers.16.feed_forward.w13.0.qweight
        ├── layers.16.feed_forward.w13.0.scales_zeros
        ├── layers.16.feed_forward.w2.0.qweight
        ├── layers.16.feed_forward.w2.0.scales_zeros
        ├── layers.16.ffn_norm.weight
        ├── layers.17.attention.w_qkv.0.qweight
        ├── layers.17.attention.w_qkv.0.scales_zeros
        ├── layers.17.attention.wo.0.qweight
        ├── layers.17.attention.wo.0.scales_zeros
        ├── layers.17.attention_norm.weight
        ├── layers.17.feed_forward.w13.0.qweight
        ├── layers.17.feed_forward.w13.0.scales_zeros
        ├── layers.17.feed_forward.w2.0.qweight
        ├── layers.17.feed_forward.w2.0.scales_zeros
        ├── layers.17.ffn_norm.weight
        ├── layers.18.attention.w_qkv.0.qweight
        ├── layers.18.attention.w_qkv.0.scales_zeros
        ├── layers.18.attention.wo.0.qweight
        ├── layers.18.attention.wo.0.scales_zeros
        ├── layers.18.attention_norm.weight
        ├── layers.18.feed_forward.w13.0.qweight
        ├── layers.18.feed_forward.w13.0.scales_zeros
        ├── layers.18.feed_forward.w2.0.qweight
        ├── layers.18.feed_forward.w2.0.scales_zeros
        ├── layers.18.ffn_norm.weight
        ├── layers.19.attention.w_qkv.0.qweight
        ├── layers.19.attention.w_qkv.0.scales_zeros
        ├── layers.19.attention.wo.0.qweight
        ├── layers.19.attention.wo.0.scales_zeros
        ├── layers.19.attention_norm.weight
        ├── layers.19.feed_forward.w13.0.qweight
        ├── layers.19.feed_forward.w13.0.scales_zeros
        ├── layers.19.feed_forward.w2.0.qweight
        ├── layers.19.feed_forward.w2.0.scales_zeros
        ├── layers.19.ffn_norm.weight
        ├── layers.2.attention.w_qkv.0.qweight
        ├── layers.2.attention.w_qkv.0.scales_zeros
        ├── layers.2.attention.wo.0.qweight
        ├── layers.2.attention.wo.0.scales_zeros
        ├── layers.2.attention_norm.weight
        ├── layers.2.feed_forward.w13.0.qweight
        ├── layers.2.feed_forward.w13.0.scales_zeros
        ├── layers.2.feed_forward.w2.0.qweight
        ├── layers.2.feed_forward.w2.0.scales_zeros
        ├── layers.2.ffn_norm.weight
        ├── layers.20.attention.w_qkv.0.qweight
        ├── layers.20.attention.w_qkv.0.scales_zeros
        ├── layers.20.attention.wo.0.qweight
        ├── layers.20.attention.wo.0.scales_zeros
        ├── layers.20.attention_norm.weight
        ├── layers.20.feed_forward.w13.0.qweight
        ├── layers.20.feed_forward.w13.0.scales_zeros
        ├── layers.20.feed_forward.w2.0.qweight
        ├── layers.20.feed_forward.w2.0.scales_zeros
        ├── layers.20.ffn_norm.weight
        ├── layers.21.attention.w_qkv.0.qweight
        ├── layers.21.attention.w_qkv.0.scales_zeros
        ├── layers.21.attention.wo.0.qweight
        ├── layers.21.attention.wo.0.scales_zeros
        ├── layers.21.attention_norm.weight
        ├── layers.21.feed_forward.w13.0.qweight
        ├── layers.21.feed_forward.w13.0.scales_zeros
        ├── layers.21.feed_forward.w2.0.qweight
        ├── layers.21.feed_forward.w2.0.scales_zeros
        ├── layers.21.ffn_norm.weight
        ├── layers.22.attention.w_qkv.0.qweight
        ├── layers.22.attention.w_qkv.0.scales_zeros
        ├── layers.22.attention.wo.0.qweight
        ├── layers.22.attention.wo.0.scales_zeros
        ├── layers.22.attention_norm.weight
        ├── layers.22.feed_forward.w13.0.qweight
        ├── layers.22.feed_forward.w13.0.scales_zeros
        ├── layers.22.feed_forward.w2.0.qweight
        ├── layers.22.feed_forward.w2.0.scales_zeros
        ├── layers.22.ffn_norm.weight
        ├── layers.23.attention.w_qkv.0.qweight
        ├── layers.23.attention.w_qkv.0.scales_zeros
        ├── layers.23.attention.wo.0.qweight
        ├── layers.23.attention.wo.0.scales_zeros
        ├── layers.23.attention_norm.weight
        ├── layers.23.feed_forward.w13.0.qweight
        ├── layers.23.feed_forward.w13.0.scales_zeros
        ├── layers.23.feed_forward.w2.0.qweight
        ├── layers.23.feed_forward.w2.0.scales_zeros
        ├── layers.23.ffn_norm.weight
        ├── layers.24.attention.w_qkv.0.qweight
        ├── layers.24.attention.w_qkv.0.scales_zeros
        ├── layers.24.attention.wo.0.qweight
        ├── layers.24.attention.wo.0.scales_zeros
        ├── layers.24.attention_norm.weight
        ├── layers.24.feed_forward.w13.0.qweight
        ├── layers.24.feed_forward.w13.0.scales_zeros
        ├── layers.24.feed_forward.w2.0.qweight
        ├── layers.24.feed_forward.w2.0.scales_zeros
        ├── layers.24.ffn_norm.weight
        ├── layers.25.attention.w_qkv.0.qweight
        ├── layers.25.attention.w_qkv.0.scales_zeros
        ├── layers.25.attention.wo.0.qweight
        ├── layers.25.attention.wo.0.scales_zeros
        ├── layers.25.attention_norm.weight
        ├── layers.25.feed_forward.w13.0.qweight
        ├── layers.25.feed_forward.w13.0.scales_zeros
        ├── layers.25.feed_forward.w2.0.qweight
        ├── layers.25.feed_forward.w2.0.scales_zeros
        ├── layers.25.ffn_norm.weight
        ├── layers.26.attention.w_qkv.0.qweight
        ├── layers.26.attention.w_qkv.0.scales_zeros
        ├── layers.26.attention.wo.0.qweight
        ├── layers.26.attention.wo.0.scales_zeros
        ├── layers.26.attention_norm.weight
        ├── layers.26.feed_forward.w13.0.qweight
        ├── layers.26.feed_forward.w13.0.scales_zeros
        ├── layers.26.feed_forward.w2.0.qweight
        ├── layers.26.feed_forward.w2.0.scales_zeros
        ├── layers.26.ffn_norm.weight
        ├── layers.27.attention.w_qkv.0.qweight
        ├── layers.27.attention.w_qkv.0.scales_zeros
        ├── layers.27.attention.wo.0.qweight
        ├── layers.27.attention.wo.0.scales_zeros
        ├── layers.27.attention_norm.weight
        ├── layers.27.feed_forward.w13.0.qweight
        ├── layers.27.feed_forward.w13.0.scales_zeros
        ├── layers.27.feed_forward.w2.0.qweight
        ├── layers.27.feed_forward.w2.0.scales_zeros
        ├── layers.27.ffn_norm.weight
        ├── layers.28.attention.w_qkv.0.qweight
        ├── layers.28.attention.w_qkv.0.scales_zeros
        ├── layers.28.attention.wo.0.qweight
        ├── layers.28.attention.wo.0.scales_zeros
        ├── layers.28.attention_norm.weight
        ├── layers.28.feed_forward.w13.0.qweight
        ├── layers.28.feed_forward.w13.0.scales_zeros
        ├── layers.28.feed_forward.w2.0.qweight
        ├── layers.28.feed_forward.w2.0.scales_zeros
        ├── layers.28.ffn_norm.weight
        ├── layers.29.attention.w_qkv.0.qweight
        ├── layers.29.attention.w_qkv.0.scales_zeros
        ├── layers.29.attention.wo.0.qweight
        ├── layers.29.attention.wo.0.scales_zeros
        ├── layers.29.attention_norm.weight
        ├── layers.29.feed_forward.w13.0.qweight
        ├── layers.29.feed_forward.w13.0.scales_zeros
        ├── layers.29.feed_forward.w2.0.qweight
        ├── layers.29.feed_forward.w2.0.scales_zeros
        ├── layers.29.ffn_norm.weight
        ├── layers.3.attention.w_qkv.0.qweight
        ├── layers.3.attention.w_qkv.0.scales_zeros
        ├── layers.3.attention.wo.0.qweight
        ├── layers.3.attention.wo.0.scales_zeros
        ├── layers.3.attention_norm.weight
        ├── layers.3.feed_forward.w13.0.qweight
        ├── layers.3.feed_forward.w13.0.scales_zeros
        ├── layers.3.feed_forward.w2.0.qweight
        ├── layers.3.feed_forward.w2.0.scales_zeros
        ├── layers.3.ffn_norm.weight
        ├── layers.30.attention.w_qkv.0.qweight
        ├── layers.30.attention.w_qkv.0.scales_zeros
        ├── layers.30.attention.wo.0.qweight
        ├── layers.30.attention.wo.0.scales_zeros
        ├── layers.30.attention_norm.weight
        ├── layers.30.feed_forward.w13.0.qweight
        ├── layers.30.feed_forward.w13.0.scales_zeros
        ├── layers.30.feed_forward.w2.0.qweight
        ├── layers.30.feed_forward.w2.0.scales_zeros
        ├── layers.30.ffn_norm.weight
        ├── layers.31.attention.w_qkv.0.qweight
        ├── layers.31.attention.w_qkv.0.scales_zeros
        ├── layers.31.attention.wo.0.qweight
        ├── layers.31.attention.wo.0.scales_zeros
        ├── layers.31.attention_norm.weight
        ├── layers.31.feed_forward.w13.0.qweight
        ├── layers.31.feed_forward.w13.0.scales_zeros
        ├── layers.31.feed_forward.w2.0.qweight
        ├── layers.31.feed_forward.w2.0.scales_zeros
        ├── layers.31.ffn_norm.weight
        ├── layers.32.attention.w_qkv.0.qweight
        ├── layers.32.attention.w_qkv.0.scales_zeros
        ├── layers.32.attention.wo.0.qweight
        ├── layers.32.attention.wo.0.scales_zeros
        ├── layers.32.attention_norm.weight
        ├── layers.32.feed_forward.w13.0.qweight
        ├── layers.32.feed_forward.w13.0.scales_zeros
        ├── layers.32.feed_forward.w2.0.qweight
        ├── layers.32.feed_forward.w2.0.scales_zeros
        ├── layers.32.ffn_norm.weight
        ├── layers.33.attention.w_qkv.0.qweight
        ├── layers.33.attention.w_qkv.0.scales_zeros
        ├── layers.33.attention.wo.0.qweight
        ├── layers.33.attention.wo.0.scales_zeros
        ├── layers.33.attention_norm.weight
        ├── layers.33.feed_forward.w13.0.qweight
        ├── layers.33.feed_forward.w13.0.scales_zeros
        ├── layers.33.feed_forward.w2.0.qweight
        ├── layers.33.feed_forward.w2.0.scales_zeros
        ├── layers.33.ffn_norm.weight
        ├── layers.34.attention.w_qkv.0.qweight
        ├── layers.34.attention.w_qkv.0.scales_zeros
        ├── layers.34.attention.wo.0.qweight
        ├── layers.34.attention.wo.0.scales_zeros
        ├── layers.34.attention_norm.weight
        ├── layers.34.feed_forward.w13.0.qweight
        ├── layers.34.feed_forward.w13.0.scales_zeros
        ├── layers.34.feed_forward.w2.0.qweight
        ├── layers.34.feed_forward.w2.0.scales_zeros
        ├── layers.34.ffn_norm.weight
        ├── layers.35.attention.w_qkv.0.qweight
        ├── layers.35.attention.w_qkv.0.scales_zeros
        ├── layers.35.attention.wo.0.qweight
        ├── layers.35.attention.wo.0.scales_zeros
        ├── layers.35.attention_norm.weight
        ├── layers.35.feed_forward.w13.0.qweight
        ├── layers.35.feed_forward.w13.0.scales_zeros
        ├── layers.35.feed_forward.w2.0.qweight
        ├── layers.35.feed_forward.w2.0.scales_zeros
        ├── layers.35.ffn_norm.weight
        ├── layers.36.attention.w_qkv.0.qweight
        ├── layers.36.attention.w_qkv.0.scales_zeros
        ├── layers.36.attention.wo.0.qweight
        ├── layers.36.attention.wo.0.scales_zeros
        ├── layers.36.attention_norm.weight
        ├── layers.36.feed_forward.w13.0.qweight
        ├── layers.36.feed_forward.w13.0.scales_zeros
        ├── layers.36.feed_forward.w2.0.qweight
        ├── layers.36.feed_forward.w2.0.scales_zeros
        ├── layers.36.ffn_norm.weight
        ├── layers.37.attention.w_qkv.0.qweight
        ├── layers.37.attention.w_qkv.0.scales_zeros
        ├── layers.37.attention.wo.0.qweight
        ├── layers.37.attention.wo.0.scales_zeros
        ├── layers.37.attention_norm.weight
        ├── layers.37.feed_forward.w13.0.qweight
        ├── layers.37.feed_forward.w13.0.scales_zeros
        ├── layers.37.feed_forward.w2.0.qweight
        ├── layers.37.feed_forward.w2.0.scales_zeros
        ├── layers.37.ffn_norm.weight
        ├── layers.38.attention.w_qkv.0.qweight
        ├── layers.38.attention.w_qkv.0.scales_zeros
        ├── layers.38.attention.wo.0.qweight
        ├── layers.38.attention.wo.0.scales_zeros
        ├── layers.38.attention_norm.weight
        ├── layers.38.feed_forward.w13.0.qweight
        ├── layers.38.feed_forward.w13.0.scales_zeros
        ├── layers.38.feed_forward.w2.0.qweight
        ├── layers.38.feed_forward.w2.0.scales_zeros
        ├── layers.38.ffn_norm.weight
        ├── layers.39.attention.w_qkv.0.qweight
        ├── layers.39.attention.w_qkv.0.scales_zeros
        ├── layers.39.attention.wo.0.qweight
        ├── layers.39.attention.wo.0.scales_zeros
        ├── layers.39.attention_norm.weight
        ├── layers.39.feed_forward.w13.0.qweight
        ├── layers.39.feed_forward.w13.0.scales_zeros
        ├── layers.39.feed_forward.w2.0.qweight
        ├── layers.39.feed_forward.w2.0.scales_zeros
        ├── layers.39.ffn_norm.weight
        ├── layers.4.attention.w_qkv.0.qweight
        ├── layers.4.attention.w_qkv.0.scales_zeros
        ├── layers.4.attention.wo.0.qweight
        ├── layers.4.attention.wo.0.scales_zeros
        ├── layers.4.attention_norm.weight
        ├── layers.4.feed_forward.w13.0.qweight
        ├── layers.4.feed_forward.w13.0.scales_zeros
        ├── layers.4.feed_forward.w2.0.qweight
        ├── layers.4.feed_forward.w2.0.scales_zeros
        ├── layers.4.ffn_norm.weight
        ├── layers.40.attention.w_qkv.0.qweight
        ├── layers.40.attention.w_qkv.0.scales_zeros
        ├── layers.40.attention.wo.0.qweight
        ├── layers.40.attention.wo.0.scales_zeros
        ├── layers.40.attention_norm.weight
        ├── layers.40.feed_forward.w13.0.qweight
        ├── layers.40.feed_forward.w13.0.scales_zeros
        ├── layers.40.feed_forward.w2.0.qweight
        ├── layers.40.feed_forward.w2.0.scales_zeros
        ├── layers.40.ffn_norm.weight
        ├── layers.41.attention.w_qkv.0.qweight
        ├── layers.41.attention.w_qkv.0.scales_zeros
        ├── layers.41.attention.wo.0.qweight
        ├── layers.41.attention.wo.0.scales_zeros
        ├── layers.41.attention_norm.weight
        ├── layers.41.feed_forward.w13.0.qweight
        ├── layers.41.feed_forward.w13.0.scales_zeros
        ├── layers.41.feed_forward.w2.0.qweight
        ├── layers.41.feed_forward.w2.0.scales_zeros
        ├── layers.41.ffn_norm.weight
        ├── layers.42.attention.w_qkv.0.qweight
        ├── layers.42.attention.w_qkv.0.scales_zeros
        ├── layers.42.attention.wo.0.qweight
        ├── layers.42.attention.wo.0.scales_zeros
        ├── layers.42.attention_norm.weight
        ├── layers.42.feed_forward.w13.0.qweight
        ├── layers.42.feed_forward.w13.0.scales_zeros
        ├── layers.42.feed_forward.w2.0.qweight
        ├── layers.42.feed_forward.w2.0.scales_zeros
        ├── layers.42.ffn_norm.weight
        ├── layers.43.attention.w_qkv.0.qweight
        ├── layers.43.attention.w_qkv.0.scales_zeros
        ├── layers.43.attention.wo.0.qweight
        ├── layers.43.attention.wo.0.scales_zeros
        ├── layers.43.attention_norm.weight
        ├── layers.43.feed_forward.w13.0.qweight
        ├── layers.43.feed_forward.w13.0.scales_zeros
        ├── layers.43.feed_forward.w2.0.qweight
        ├── layers.43.feed_forward.w2.0.scales_zeros
        ├── layers.43.ffn_norm.weight
        ├── layers.44.attention.w_qkv.0.qweight
        ├── layers.44.attention.w_qkv.0.scales_zeros
        ├── layers.44.attention.wo.0.qweight
        ├── layers.44.attention.wo.0.scales_zeros
        ├── layers.44.attention_norm.weight
        ├── layers.44.feed_forward.w13.0.qweight
        ├── layers.44.feed_forward.w13.0.scales_zeros
        ├── layers.44.feed_forward.w2.0.qweight
        ├── layers.44.feed_forward.w2.0.scales_zeros
        ├── layers.44.ffn_norm.weight
        ├── layers.45.attention.w_qkv.0.qweight
        ├── layers.45.attention.w_qkv.0.scales_zeros
        ├── layers.45.attention.wo.0.qweight
        ├── layers.45.attention.wo.0.scales_zeros
        ├── layers.45.attention_norm.weight
        ├── layers.45.feed_forward.w13.0.qweight
        ├── layers.45.feed_forward.w13.0.scales_zeros
        ├── layers.45.feed_forward.w2.0.qweight
        ├── layers.45.feed_forward.w2.0.scales_zeros
        ├── layers.45.ffn_norm.weight
        ├── layers.46.attention.w_qkv.0.qweight
        ├── layers.46.attention.w_qkv.0.scales_zeros
        ├── layers.46.attention.wo.0.qweight
        ├── layers.46.attention.wo.0.scales_zeros
        ├── layers.46.attention_norm.weight
        ├── layers.46.feed_forward.w13.0.qweight
        ├── layers.46.feed_forward.w13.0.scales_zeros
        ├── layers.46.feed_forward.w2.0.qweight
        ├── layers.46.feed_forward.w2.0.scales_zeros
        ├── layers.46.ffn_norm.weight
        ├── layers.47.attention.w_qkv.0.qweight
        ├── layers.47.attention.w_qkv.0.scales_zeros
        ├── layers.47.attention.wo.0.qweight
        ├── layers.47.attention.wo.0.scales_zeros
        ├── layers.47.attention_norm.weight
        ├── layers.47.feed_forward.w13.0.qweight
        ├── layers.47.feed_forward.w13.0.scales_zeros
        ├── layers.47.feed_forward.w2.0.qweight
        ├── layers.47.feed_forward.w2.0.scales_zeros
        ├── layers.47.ffn_norm.weight
        ├── layers.48.attention.w_qkv.0.qweight
        ├── layers.48.attention.w_qkv.0.scales_zeros
        ├── layers.48.attention.wo.0.qweight
        ├── layers.48.attention.wo.0.scales_zeros
        ├── layers.48.attention_norm.weight
        ├── layers.48.feed_forward.w13.0.qweight
        ├── layers.48.feed_forward.w13.0.scales_zeros
        ├── layers.48.feed_forward.w2.0.qweight
        ├── layers.48.feed_forward.w2.0.scales_zeros
        ├── layers.48.ffn_norm.weight
        ├── layers.49.attention.w_qkv.0.qweight
        ├── layers.49.attention.w_qkv.0.scales_zeros
        ├── layers.49.attention.wo.0.qweight
        ├── layers.49.attention.wo.0.scales_zeros
        ├── layers.49.attention_norm.weight
        ├── layers.49.feed_forward.w13.0.qweight
        ├── layers.49.feed_forward.w13.0.scales_zeros
        ├── layers.49.feed_forward.w2.0.qweight
        ├── layers.49.feed_forward.w2.0.scales_zeros
        ├── layers.49.ffn_norm.weight
        ├── layers.5.attention.w_qkv.0.qweight
        ├── layers.5.attention.w_qkv.0.scales_zeros
        ├── layers.5.attention.wo.0.qweight
        ├── layers.5.attention.wo.0.scales_zeros
        ├── layers.5.attention_norm.weight
        ├── layers.5.feed_forward.w13.0.qweight
        ├── layers.5.feed_forward.w13.0.scales_zeros
        ├── layers.5.feed_forward.w2.0.qweight
        ├── layers.5.feed_forward.w2.0.scales_zeros
        ├── layers.5.ffn_norm.weight
        ├── layers.50.attention.w_qkv.0.qweight
        ├── layers.50.attention.w_qkv.0.scales_zeros
        ├── layers.50.attention.wo.0.qweight
        ├── layers.50.attention.wo.0.scales_zeros
        ├── layers.50.attention_norm.weight
        ├── layers.50.feed_forward.w13.0.qweight
        ├── layers.50.feed_forward.w13.0.scales_zeros
        ├── layers.50.feed_forward.w2.0.qweight
        ├── layers.50.feed_forward.w2.0.scales_zeros
        ├── layers.50.ffn_norm.weight
        ├── layers.51.attention.w_qkv.0.qweight
        ├── layers.51.attention.w_qkv.0.scales_zeros
        ├── layers.51.attention.wo.0.qweight
        ├── layers.51.attention.wo.0.scales_zeros
        ├── layers.51.attention_norm.weight
        ├── layers.51.feed_forward.w13.0.qweight
        ├── layers.51.feed_forward.w13.0.scales_zeros
        ├── layers.51.feed_forward.w2.0.qweight
        ├── layers.51.feed_forward.w2.0.scales_zeros
        ├── layers.51.ffn_norm.weight
        ├── layers.52.attention.w_qkv.0.qweight
        ├── layers.52.attention.w_qkv.0.scales_zeros
        ├── layers.52.attention.wo.0.qweight
        ├── layers.52.attention.wo.0.scales_zeros
        ├── layers.52.attention_norm.weight
        ├── layers.52.feed_forward.w13.0.qweight
        ├── layers.52.feed_forward.w13.0.scales_zeros
        ├── layers.52.feed_forward.w2.0.qweight
        ├── layers.52.feed_forward.w2.0.scales_zeros
        ├── layers.52.ffn_norm.weight
        ├── layers.53.attention.w_qkv.0.qweight
        ├── layers.53.attention.w_qkv.0.scales_zeros
        ├── layers.53.attention.wo.0.qweight
        ├── layers.53.attention.wo.0.scales_zeros
        ├── layers.53.attention_norm.weight
        ├── layers.53.feed_forward.w13.0.qweight
        ├── layers.53.feed_forward.w13.0.scales_zeros
        ├── layers.53.feed_forward.w2.0.qweight
        ├── layers.53.feed_forward.w2.0.scales_zeros
        ├── layers.53.ffn_norm.weight
        ├── layers.54.attention.w_qkv.0.qweight
        ├── layers.54.attention.w_qkv.0.scales_zeros
        ├── layers.54.attention.wo.0.qweight
        ├── layers.54.attention.wo.0.scales_zeros
        ├── layers.54.attention_norm.weight
        ├── layers.54.feed_forward.w13.0.qweight
        ├── layers.54.feed_forward.w13.0.scales_zeros
        ├── layers.54.feed_forward.w2.0.qweight
        ├── layers.54.feed_forward.w2.0.scales_zeros
        ├── layers.54.ffn_norm.weight
        ├── layers.55.attention.w_qkv.0.qweight
        ├── layers.55.attention.w_qkv.0.scales_zeros
        ├── layers.55.attention.wo.0.qweight
        ├── layers.55.attention.wo.0.scales_zeros
        ├── layers.55.attention_norm.weight
        ├── layers.55.feed_forward.w13.0.qweight
        ├── layers.55.feed_forward.w13.0.scales_zeros
        ├── layers.55.feed_forward.w2.0.qweight
        ├── layers.55.feed_forward.w2.0.scales_zeros
        ├── layers.55.ffn_norm.weight
        ├── layers.56.attention.w_qkv.0.qweight
        ├── layers.56.attention.w_qkv.0.scales_zeros
        ├── layers.56.attention.wo.0.qweight
        ├── layers.56.attention.wo.0.scales_zeros
        ├── layers.56.attention_norm.weight
        ├── layers.56.feed_forward.w13.0.qweight
        ├── layers.56.feed_forward.w13.0.scales_zeros
        ├── layers.56.feed_forward.w2.0.qweight
        ├── layers.56.feed_forward.w2.0.scales_zeros
        ├── layers.56.ffn_norm.weight
        ├── layers.57.attention.w_qkv.0.qweight
        ├── layers.57.attention.w_qkv.0.scales_zeros
        ├── layers.57.attention.wo.0.qweight
        ├── layers.57.attention.wo.0.scales_zeros
        ├── layers.57.attention_norm.weight
        ├── layers.57.feed_forward.w13.0.qweight
        ├── layers.57.feed_forward.w13.0.scales_zeros
        ├── layers.57.feed_forward.w2.0.qweight
        ├── layers.57.feed_forward.w2.0.scales_zeros
        ├── layers.57.ffn_norm.weight
        ├── layers.58.attention.w_qkv.0.qweight
        ├── layers.58.attention.w_qkv.0.scales_zeros
        ├── layers.58.attention.wo.0.qweight
        ├── layers.58.attention.wo.0.scales_zeros
        ├── layers.58.attention_norm.weight
        ├── layers.58.feed_forward.w13.0.qweight
        ├── layers.58.feed_forward.w13.0.scales_zeros
        ├── layers.58.feed_forward.w2.0.qweight
        ├── layers.58.feed_forward.w2.0.scales_zeros
        ├── layers.58.ffn_norm.weight
        ├── layers.59.attention.w_qkv.0.qweight
        ├── layers.59.attention.w_qkv.0.scales_zeros
        ├── layers.59.attention.wo.0.qweight
        ├── layers.59.attention.wo.0.scales_zeros
        ├── layers.59.attention_norm.weight
        ├── layers.59.feed_forward.w13.0.qweight
        ├── layers.59.feed_forward.w13.0.scales_zeros
        ├── layers.59.feed_forward.w2.0.qweight
        ├── layers.59.feed_forward.w2.0.scales_zeros
        ├── layers.59.ffn_norm.weight
        ├── layers.6.attention.w_qkv.0.qweight
        ├── layers.6.attention.w_qkv.0.scales_zeros
        ├── layers.6.attention.wo.0.qweight
        ├── layers.6.attention.wo.0.scales_zeros
        ├── layers.6.attention_norm.weight
        ├── layers.6.feed_forward.w13.0.qweight
        ├── layers.6.feed_forward.w13.0.scales_zeros
        ├── layers.6.feed_forward.w2.0.qweight
        ├── layers.6.feed_forward.w2.0.scales_zeros
        ├── layers.6.ffn_norm.weight
        ├── layers.7.attention.w_qkv.0.qweight
        ├── layers.7.attention.w_qkv.0.scales_zeros
        ├── layers.7.attention.wo.0.qweight
        ├── layers.7.attention.wo.0.scales_zeros
        ├── layers.7.attention_norm.weight
        ├── layers.7.feed_forward.w13.0.qweight
        ├── layers.7.feed_forward.w13.0.scales_zeros
        ├── layers.7.feed_forward.w2.0.qweight
        ├── layers.7.feed_forward.w2.0.scales_zeros
        ├── layers.7.ffn_norm.weight
        ├── layers.8.attention.w_qkv.0.qweight
        ├── layers.8.attention.w_qkv.0.scales_zeros
        ├── layers.8.attention.wo.0.qweight
        ├── layers.8.attention.wo.0.scales_zeros
        ├── layers.8.attention_norm.weight
        ├── layers.8.feed_forward.w13.0.qweight
        ├── layers.8.feed_forward.w13.0.scales_zeros
        ├── layers.8.feed_forward.w2.0.qweight
        ├── layers.8.feed_forward.w2.0.scales_zeros
        ├── layers.8.ffn_norm.weight
        ├── layers.9.attention.w_qkv.0.qweight
        ├── layers.9.attention.w_qkv.0.scales_zeros
        ├── layers.9.attention.wo.0.qweight
        ├── layers.9.attention.wo.0.scales_zeros
        ├── layers.9.attention_norm.weight
        ├── layers.9.feed_forward.w13.0.qweight
        ├── layers.9.feed_forward.w13.0.scales_zeros
        ├── layers.9.feed_forward.w2.0.qweight
        ├── layers.9.feed_forward.w2.0.scales_zeros
        ├── layers.9.ffn_norm.weight
        ├── norm.weight
        ├── output.weight
        └── tok_embeddings.weight

18 directories, 624 files
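
A quick way to check that the converted weights directory is complete is to compare it against the file pattern visible in the listing: ten files per layer for layers 0 to 59, plus four top-level files, which gives 604 entries. The sketch below is only a sanity check built from that listing. The layer count, the suffix list, and the helper names `expected_weight_files` and `missing_weight_files` are assumptions for illustration and are not part of lmdeploy.

```python
import os

# Assumption: 60 layers (indices 0..59), as seen in the listing above.
NUM_LAYERS = 60

# Per-layer file suffixes, copied from the listing.
PER_LAYER_SUFFIXES = [
    "attention.w_qkv.0.qweight",
    "attention.w_qkv.0.scales_zeros",
    "attention.wo.0.qweight",
    "attention.wo.0.scales_zeros",
    "attention_norm.weight",
    "feed_forward.w13.0.qweight",
    "feed_forward.w13.0.scales_zeros",
    "feed_forward.w2.0.qweight",
    "feed_forward.w2.0.scales_zeros",
    "ffn_norm.weight",
]

# Files that appear once, outside the per-layer pattern.
TOP_LEVEL_FILES = ["config.ini", "norm.weight", "output.weight", "tok_embeddings.weight"]


def expected_weight_files(num_layers=NUM_LAYERS):
    """Build the set of file names the listing suggests should exist."""
    names = set(TOP_LEVEL_FILES)
    for layer in range(num_layers):
        for suffix in PER_LAYER_SUFFIXES:
            names.add(f"layers.{layer}.{suffix}")
    return names


def missing_weight_files(weights_dir, num_layers=NUM_LAYERS):
    """Return the sorted expected names that are absent from weights_dir."""
    present = set(os.listdir(weights_dir))
    return sorted(expected_weight_files(num_layers) - present)
```

Calling `missing_weight_files("workspace/triton_models/weights")` and getting an empty list would suggest the conversion wrote every file in the pattern, so a crash at load time would have to come from somewhere else.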

I tried it on a Tesla V100 32GB. It starts up, but something is strange.
(attached screenshot: 图片.png)

lmdeploy only supports GPUs with sm>=80, such as A10, A100, and GeForce 30/40 series.

Tesla V100 is sm_70, so lmdeploy doesn't support it.

I ran into the same problem as dlutsniper. The GPU I'm using is a 4090 and CUDA is 11.8.

lmdeploy doesn't support 4-bit inference on Tesla V100 32GB.

As presented in https://github.com/InternLM/lmdeploy/blob/main/docs/en/w4a16.md:

"LMDeploy supports LLM model inference of 4-bit weight, with the minimum requirement for NVIDIA graphics cards being sm80, such as A10, A100, Geforce 30/40 series."
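
The sm80 requirement quoted above can be turned into a small pre-flight check. This is a minimal sketch, not something lmdeploy provides: the names `meets_sm80`, `local_gpu_capability`, and the lookup table are hypothetical, and the table holds only the well-known compute capabilities of GPUs mentioned in this thread. On a live machine, PyTorch's `torch.cuda.get_device_capability()` returns the same kind of `(major, minor)` tuple.

```python
# Compute capabilities (major, minor) for GPUs named in this thread.
KNOWN_COMPUTE_CAPABILITY = {
    "Tesla V100": (7, 0),
    "A100": (8, 0),
    "A10": (8, 6),
    "GeForce RTX 4090": (8, 9),
}

# Minimum stated in the quoted w4a16 documentation: sm80.
MIN_W4A16_CAPABILITY = (8, 0)


def meets_sm80(capability):
    """True if a (major, minor) compute capability is at least sm80."""
    return tuple(capability) >= MIN_W4A16_CAPABILITY


def local_gpu_capability():
    """Return (major, minor) for GPU 0 via PyTorch, or None if unavailable."""
    try:
        import torch  # optional dependency, imported only when this is called
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    for name, cap in KNOWN_COMPUTE_CAPABILITY.items():
        status = "ok" if meets_sm80(cap) else "below sm80"
        print(f"{name}: sm_{cap[0]}{cap[1]} -> {status}")
```

By this check a V100 fails and a 4090 passes, so the 4090 report earlier in the thread would likely need a different explanation than the hardware minimum.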

dlutsniper changed discussion title from inference but error to inference error on v100
dlutsniper changed discussion status to closed
