Inference error on V100
I tried it on Baidu AI Studio, but an error occurred.
pip install 'lmdeploy>=0.0.9'
Convert the model's layout and store it in the default path, ./workspace.
python3 -m lmdeploy.serve.turbomind.deploy \
    --model-name internlm-chat-20b \
    --model-path /home/aistudio/data/data240556 \
    --model-format awq \
    --group-size 128
Run inference:
python3 -m lmdeploy.turbomind.chat ./workspace
error:
WARNING: Can not find tokenizer.json. It may take long time to initialize the tokenizer.
[WARNING] gemm_config.in is not found; using default GEMM algo
Aborted (core dumped)
tree -L 5 workspace/
workspace/
├── model_repository
│   ├── postprocessing -> ../triton_models/postprocessing
│   ├── preprocessing -> ../triton_models/preprocessing
│   └── turbomind -> ../triton_models/interactive
├── service_docker_up.sh
└── triton_models
    ├── interactive
    │   ├── 1
    │   │   ├── placeholder
    │   │   └── weights -> ../../weights
    │   └── config.pbtxt
    ├── postprocessing
    │   ├── 1
    │   │   ├── __pycache__
    │   │   │   └── model.cpython-310.pyc
    │   │   ├── model.py
    │   │   └── tokenizer -> ../../tokenizer
    │   └── config.pbtxt
    ├── preprocessing
    │   ├── 1
    │   │   ├── __pycache__
    │   │   │   └── model.cpython-310.pyc
    │   │   ├── model.py
    │   │   └── tokenizer -> ../../tokenizer
    │   └── config.pbtxt
    ├── tokenizer
    │   ├── config.json
    │   ├── configuration_internlm.py
    │   ├── generation_config.json
    │   ├── modeling_internlm.py
    │   ├── placeholder
    │   ├── pytorch_model.bin.index.json
    │   ├── special_tokens_map.json
    │   ├── tokenization_internlm.py
    │   ├── tokenizer.model
    │   ├── tokenizer.py
    │   └── tokenizer_config.json
    └── weights
        ├── config.ini
        ├── layers.0.attention.w_qkv.0.qweight
        ├── layers.0.attention.w_qkv.0.scales_zeros
        ├── layers.0.attention.wo.0.qweight
        ├── layers.0.attention.wo.0.scales_zeros
        ├── layers.0.attention_norm.weight
        ├── layers.0.feed_forward.w13.0.qweight
        ├── layers.0.feed_forward.w13.0.scales_zeros
        ├── layers.0.feed_forward.w2.0.qweight
        ├── layers.0.feed_forward.w2.0.scales_zeros
        ├── layers.0.ffn_norm.weight
        ├── layers.1.attention.w_qkv.0.qweight
        ├── layers.1.attention.w_qkv.0.scales_zeros
        ├── layers.1.attention.wo.0.qweight
        ├── layers.1.attention.wo.0.scales_zeros
        ├── layers.1.attention_norm.weight
        ├── layers.1.feed_forward.w13.0.qweight
        ├── layers.1.feed_forward.w13.0.scales_zeros
        ├── layers.1.feed_forward.w2.0.qweight
        ├── layers.1.feed_forward.w2.0.scales_zeros
        ├── layers.1.ffn_norm.weight
        ├── layers.10.attention.w_qkv.0.qweight
        ├── layers.10.attention.w_qkv.0.scales_zeros
        ├── layers.10.attention.wo.0.qweight
        ├── layers.10.attention.wo.0.scales_zeros
        ├── layers.10.attention_norm.weight
        ├── layers.10.feed_forward.w13.0.qweight
        ├── layers.10.feed_forward.w13.0.scales_zeros
        ├── layers.10.feed_forward.w2.0.qweight
        ├── layers.10.feed_forward.w2.0.scales_zeros
        ├── layers.10.ffn_norm.weight
        ├── layers.11.attention.w_qkv.0.qweight
        ├── layers.11.attention.w_qkv.0.scales_zeros
        ├── layers.11.attention.wo.0.qweight
        ├── layers.11.attention.wo.0.scales_zeros
        ├── layers.11.attention_norm.weight
        ├── layers.11.feed_forward.w13.0.qweight
        ├── layers.11.feed_forward.w13.0.scales_zeros
        ├── layers.11.feed_forward.w2.0.qweight
        ├── layers.11.feed_forward.w2.0.scales_zeros
        ├── layers.11.ffn_norm.weight
        ├── layers.12.attention.w_qkv.0.qweight
        ├── layers.12.attention.w_qkv.0.scales_zeros
        ├── layers.12.attention.wo.0.qweight
        ├── layers.12.attention.wo.0.scales_zeros
        ├── layers.12.attention_norm.weight
        ├── layers.12.feed_forward.w13.0.qweight
        ├── layers.12.feed_forward.w13.0.scales_zeros
        ├── layers.12.feed_forward.w2.0.qweight
        ├── layers.12.feed_forward.w2.0.scales_zeros
        ├── layers.12.ffn_norm.weight
        ├── layers.13.attention.w_qkv.0.qweight
        ├── layers.13.attention.w_qkv.0.scales_zeros
        ├── layers.13.attention.wo.0.qweight
        ├── layers.13.attention.wo.0.scales_zeros
        ├── layers.13.attention_norm.weight
        ├── layers.13.feed_forward.w13.0.qweight
        ├── layers.13.feed_forward.w13.0.scales_zeros
        ├── layers.13.feed_forward.w2.0.qweight
        ├── layers.13.feed_forward.w2.0.scales_zeros
        ├── layers.13.ffn_norm.weight
        ├── layers.14.attention.w_qkv.0.qweight
        ├── layers.14.attention.w_qkv.0.scales_zeros
        ├── layers.14.attention.wo.0.qweight
        ├── layers.14.attention.wo.0.scales_zeros
        ├── layers.14.attention_norm.weight
        ├── layers.14.feed_forward.w13.0.qweight
        ├── layers.14.feed_forward.w13.0.scales_zeros
        ├── layers.14.feed_forward.w2.0.qweight
        ├── layers.14.feed_forward.w2.0.scales_zeros
        ├── layers.14.ffn_norm.weight
        ├── layers.15.attention.w_qkv.0.qweight
        ├── layers.15.attention.w_qkv.0.scales_zeros
        ├── layers.15.attention.wo.0.qweight
        ├── layers.15.attention.wo.0.scales_zeros
        ├── layers.15.attention_norm.weight
        ├── layers.15.feed_forward.w13.0.qweight
        ├── layers.15.feed_forward.w13.0.scales_zeros
        ├── layers.15.feed_forward.w2.0.qweight
        ├── layers.15.feed_forward.w2.0.scales_zeros
        ├── layers.15.ffn_norm.weight
        ├── layers.16.attention.w_qkv.0.qweight
        ├── layers.16.attention.w_qkv.0.scales_zeros
        ├── layers.16.attention.wo.0.qweight
        ├── layers.16.attention.wo.0.scales_zeros
        ├── layers.16.attention_norm.weight
        ├── layers.16.feed_forward.w13.0.qweight
        ├── layers.16.feed_forward.w13.0.scales_zeros
        ├── layers.16.feed_forward.w2.0.qweight
        ├── layers.16.feed_forward.w2.0.scales_zeros
        ├── layers.16.ffn_norm.weight
        ├── layers.17.attention.w_qkv.0.qweight
        ├── layers.17.attention.w_qkv.0.scales_zeros
        ├── layers.17.attention.wo.0.qweight
        ├── layers.17.attention.wo.0.scales_zeros
        ├── layers.17.attention_norm.weight
        ├── layers.17.feed_forward.w13.0.qweight
        ├── layers.17.feed_forward.w13.0.scales_zeros
        ├── layers.17.feed_forward.w2.0.qweight
        ├── layers.17.feed_forward.w2.0.scales_zeros
        ├── layers.17.ffn_norm.weight
        ├── layers.18.attention.w_qkv.0.qweight
        ├── layers.18.attention.w_qkv.0.scales_zeros
        ├── layers.18.attention.wo.0.qweight
        ├── layers.18.attention.wo.0.scales_zeros
        ├── layers.18.attention_norm.weight
        ├── layers.18.feed_forward.w13.0.qweight
        ├── layers.18.feed_forward.w13.0.scales_zeros
        ├── layers.18.feed_forward.w2.0.qweight
        ├── layers.18.feed_forward.w2.0.scales_zeros
        ├── layers.18.ffn_norm.weight
        ├── layers.19.attention.w_qkv.0.qweight
        ├── layers.19.attention.w_qkv.0.scales_zeros
        ├── layers.19.attention.wo.0.qweight
        ├── layers.19.attention.wo.0.scales_zeros
        ├── layers.19.attention_norm.weight
        ├── layers.19.feed_forward.w13.0.qweight
        ├── layers.19.feed_forward.w13.0.scales_zeros
        ├── layers.19.feed_forward.w2.0.qweight
        ├── layers.19.feed_forward.w2.0.scales_zeros
        ├── layers.19.ffn_norm.weight
        ├── layers.2.attention.w_qkv.0.qweight
        ├── layers.2.attention.w_qkv.0.scales_zeros
        ├── layers.2.attention.wo.0.qweight
        ├── layers.2.attention.wo.0.scales_zeros
        ├── layers.2.attention_norm.weight
        ├── layers.2.feed_forward.w13.0.qweight
        ├── layers.2.feed_forward.w13.0.scales_zeros
        ├── layers.2.feed_forward.w2.0.qweight
        ├── layers.2.feed_forward.w2.0.scales_zeros
        ├── layers.2.ffn_norm.weight
        ├── layers.20.attention.w_qkv.0.qweight
        ├── layers.20.attention.w_qkv.0.scales_zeros
        ├── layers.20.attention.wo.0.qweight
        ├── layers.20.attention.wo.0.scales_zeros
        ├── layers.20.attention_norm.weight
        ├── layers.20.feed_forward.w13.0.qweight
        ├── layers.20.feed_forward.w13.0.scales_zeros
        ├── layers.20.feed_forward.w2.0.qweight
        ├── layers.20.feed_forward.w2.0.scales_zeros
        ├── layers.20.ffn_norm.weight
        ├── layers.21.attention.w_qkv.0.qweight
        ├── layers.21.attention.w_qkv.0.scales_zeros
        ├── layers.21.attention.wo.0.qweight
        ├── layers.21.attention.wo.0.scales_zeros
        ├── layers.21.attention_norm.weight
        ├── layers.21.feed_forward.w13.0.qweight
        ├── layers.21.feed_forward.w13.0.scales_zeros
        ├── layers.21.feed_forward.w2.0.qweight
        ├── layers.21.feed_forward.w2.0.scales_zeros
        ├── layers.21.ffn_norm.weight
        ├── layers.22.attention.w_qkv.0.qweight
        ├── layers.22.attention.w_qkv.0.scales_zeros
        ├── layers.22.attention.wo.0.qweight
        ├── layers.22.attention.wo.0.scales_zeros
        ├── layers.22.attention_norm.weight
        ├── layers.22.feed_forward.w13.0.qweight
        ├── layers.22.feed_forward.w13.0.scales_zeros
        ├── layers.22.feed_forward.w2.0.qweight
        ├── layers.22.feed_forward.w2.0.scales_zeros
        ├── layers.22.ffn_norm.weight
        ├── layers.23.attention.w_qkv.0.qweight
        ├── layers.23.attention.w_qkv.0.scales_zeros
        ├── layers.23.attention.wo.0.qweight
        ├── layers.23.attention.wo.0.scales_zeros
        ├── layers.23.attention_norm.weight
        ├── layers.23.feed_forward.w13.0.qweight
        ├── layers.23.feed_forward.w13.0.scales_zeros
        ├── layers.23.feed_forward.w2.0.qweight
        ├── layers.23.feed_forward.w2.0.scales_zeros
        ├── layers.23.ffn_norm.weight
        ├── layers.24.attention.w_qkv.0.qweight
        ├── layers.24.attention.w_qkv.0.scales_zeros
        ├── layers.24.attention.wo.0.qweight
        ├── layers.24.attention.wo.0.scales_zeros
        ├── layers.24.attention_norm.weight
        ├── layers.24.feed_forward.w13.0.qweight
        ├── layers.24.feed_forward.w13.0.scales_zeros
        ├── layers.24.feed_forward.w2.0.qweight
        ├── layers.24.feed_forward.w2.0.scales_zeros
        ├── layers.24.ffn_norm.weight
        ├── layers.25.attention.w_qkv.0.qweight
        ├── layers.25.attention.w_qkv.0.scales_zeros
        ├── layers.25.attention.wo.0.qweight
        ├── layers.25.attention.wo.0.scales_zeros
        ├── layers.25.attention_norm.weight
        ├── layers.25.feed_forward.w13.0.qweight
        ├── layers.25.feed_forward.w13.0.scales_zeros
        ├── layers.25.feed_forward.w2.0.qweight
        ├── layers.25.feed_forward.w2.0.scales_zeros
        ├── layers.25.ffn_norm.weight
        ├── layers.26.attention.w_qkv.0.qweight
        ├── layers.26.attention.w_qkv.0.scales_zeros
        ├── layers.26.attention.wo.0.qweight
        ├── layers.26.attention.wo.0.scales_zeros
        ├── layers.26.attention_norm.weight
        ├── layers.26.feed_forward.w13.0.qweight
        ├── layers.26.feed_forward.w13.0.scales_zeros
        ├── layers.26.feed_forward.w2.0.qweight
        ├── layers.26.feed_forward.w2.0.scales_zeros
        ├── layers.26.ffn_norm.weight
        ├── layers.27.attention.w_qkv.0.qweight
        ├── layers.27.attention.w_qkv.0.scales_zeros
        ├── layers.27.attention.wo.0.qweight
        ├── layers.27.attention.wo.0.scales_zeros
        ├── layers.27.attention_norm.weight
        ├── layers.27.feed_forward.w13.0.qweight
        ├── layers.27.feed_forward.w13.0.scales_zeros
        ├── layers.27.feed_forward.w2.0.qweight
        ├── layers.27.feed_forward.w2.0.scales_zeros
        ├── layers.27.ffn_norm.weight
        ├── layers.28.attention.w_qkv.0.qweight
        ├── layers.28.attention.w_qkv.0.scales_zeros
        ├── layers.28.attention.wo.0.qweight
        ├── layers.28.attention.wo.0.scales_zeros
        ├── layers.28.attention_norm.weight
        ├── layers.28.feed_forward.w13.0.qweight
        ├── layers.28.feed_forward.w13.0.scales_zeros
        ├── layers.28.feed_forward.w2.0.qweight
        ├── layers.28.feed_forward.w2.0.scales_zeros
        ├── layers.28.ffn_norm.weight
        ├── layers.29.attention.w_qkv.0.qweight
        ├── layers.29.attention.w_qkv.0.scales_zeros
        ├── layers.29.attention.wo.0.qweight
        ├── layers.29.attention.wo.0.scales_zeros
        ├── layers.29.attention_norm.weight
        ├── layers.29.feed_forward.w13.0.qweight
        ├── layers.29.feed_forward.w13.0.scales_zeros
        ├── layers.29.feed_forward.w2.0.qweight
        ├── layers.29.feed_forward.w2.0.scales_zeros
        ├── layers.29.ffn_norm.weight
        ├── layers.3.attention.w_qkv.0.qweight
        ├── layers.3.attention.w_qkv.0.scales_zeros
        ├── layers.3.attention.wo.0.qweight
        ├── layers.3.attention.wo.0.scales_zeros
        ├── layers.3.attention_norm.weight
        ├── layers.3.feed_forward.w13.0.qweight
        ├── layers.3.feed_forward.w13.0.scales_zeros
        ├── layers.3.feed_forward.w2.0.qweight
        ├── layers.3.feed_forward.w2.0.scales_zeros
        ├── layers.3.ffn_norm.weight
        ├── layers.30.attention.w_qkv.0.qweight
        ├── layers.30.attention.w_qkv.0.scales_zeros
        ├── layers.30.attention.wo.0.qweight
        ├── layers.30.attention.wo.0.scales_zeros
        ├── layers.30.attention_norm.weight
        ├── layers.30.feed_forward.w13.0.qweight
        ├── layers.30.feed_forward.w13.0.scales_zeros
        ├── layers.30.feed_forward.w2.0.qweight
        ├── layers.30.feed_forward.w2.0.scales_zeros
        ├── layers.30.ffn_norm.weight
        ├── layers.31.attention.w_qkv.0.qweight
        ├── layers.31.attention.w_qkv.0.scales_zeros
        ├── layers.31.attention.wo.0.qweight
        ├── layers.31.attention.wo.0.scales_zeros
        ├── layers.31.attention_norm.weight
        ├── layers.31.feed_forward.w13.0.qweight
        ├── layers.31.feed_forward.w13.0.scales_zeros
        ├── layers.31.feed_forward.w2.0.qweight
        ├── layers.31.feed_forward.w2.0.scales_zeros
        ├── layers.31.ffn_norm.weight
        ├── layers.32.attention.w_qkv.0.qweight
        ├── layers.32.attention.w_qkv.0.scales_zeros
        ├── layers.32.attention.wo.0.qweight
        ├── layers.32.attention.wo.0.scales_zeros
        ├── layers.32.attention_norm.weight
        ├── layers.32.feed_forward.w13.0.qweight
        ├── layers.32.feed_forward.w13.0.scales_zeros
        ├── layers.32.feed_forward.w2.0.qweight
        ├── layers.32.feed_forward.w2.0.scales_zeros
        ├── layers.32.ffn_norm.weight
        ├── layers.33.attention.w_qkv.0.qweight
        ├── layers.33.attention.w_qkv.0.scales_zeros
        ├── layers.33.attention.wo.0.qweight
        ├── layers.33.attention.wo.0.scales_zeros
        ├── layers.33.attention_norm.weight
        ├── layers.33.feed_forward.w13.0.qweight
        ├── layers.33.feed_forward.w13.0.scales_zeros
        ├── layers.33.feed_forward.w2.0.qweight
        ├── layers.33.feed_forward.w2.0.scales_zeros
        ├── layers.33.ffn_norm.weight
        ├── layers.34.attention.w_qkv.0.qweight
        ├── layers.34.attention.w_qkv.0.scales_zeros
        ├── layers.34.attention.wo.0.qweight
        ├── layers.34.attention.wo.0.scales_zeros
        ├── layers.34.attention_norm.weight
        ├── layers.34.feed_forward.w13.0.qweight
        ├── layers.34.feed_forward.w13.0.scales_zeros
        ├── layers.34.feed_forward.w2.0.qweight
        ├── layers.34.feed_forward.w2.0.scales_zeros
        ├── layers.34.ffn_norm.weight
        ├── layers.35.attention.w_qkv.0.qweight
        ├── layers.35.attention.w_qkv.0.scales_zeros
        ├── layers.35.attention.wo.0.qweight
        ├── layers.35.attention.wo.0.scales_zeros
        ├── layers.35.attention_norm.weight
        ├── layers.35.feed_forward.w13.0.qweight
        ├── layers.35.feed_forward.w13.0.scales_zeros
        ├── layers.35.feed_forward.w2.0.qweight
        ├── layers.35.feed_forward.w2.0.scales_zeros
        ├── layers.35.ffn_norm.weight
        ├── layers.36.attention.w_qkv.0.qweight
        ├── layers.36.attention.w_qkv.0.scales_zeros
        ├── layers.36.attention.wo.0.qweight
        ├── layers.36.attention.wo.0.scales_zeros
        ├── layers.36.attention_norm.weight
        ├── layers.36.feed_forward.w13.0.qweight
        ├── layers.36.feed_forward.w13.0.scales_zeros
        ├── layers.36.feed_forward.w2.0.qweight
        ├── layers.36.feed_forward.w2.0.scales_zeros
        ├── layers.36.ffn_norm.weight
        ├── layers.37.attention.w_qkv.0.qweight
        ├── layers.37.attention.w_qkv.0.scales_zeros
        ├── layers.37.attention.wo.0.qweight
        ├── layers.37.attention.wo.0.scales_zeros
        ├── layers.37.attention_norm.weight
        ├── layers.37.feed_forward.w13.0.qweight
        ├── layers.37.feed_forward.w13.0.scales_zeros
        ├── layers.37.feed_forward.w2.0.qweight
        ├── layers.37.feed_forward.w2.0.scales_zeros
        ├── layers.37.ffn_norm.weight
        ├── layers.38.attention.w_qkv.0.qweight
        ├── layers.38.attention.w_qkv.0.scales_zeros
        ├── layers.38.attention.wo.0.qweight
        ├── layers.38.attention.wo.0.scales_zeros
        ├── layers.38.attention_norm.weight
        ├── layers.38.feed_forward.w13.0.qweight
        ├── layers.38.feed_forward.w13.0.scales_zeros
        ├── layers.38.feed_forward.w2.0.qweight
        ├── layers.38.feed_forward.w2.0.scales_zeros
        ├── layers.38.ffn_norm.weight
        ├── layers.39.attention.w_qkv.0.qweight
        ├── layers.39.attention.w_qkv.0.scales_zeros
        ├── layers.39.attention.wo.0.qweight
        ├── layers.39.attention.wo.0.scales_zeros
        ├── layers.39.attention_norm.weight
        ├── layers.39.feed_forward.w13.0.qweight
        ├── layers.39.feed_forward.w13.0.scales_zeros
        ├── layers.39.feed_forward.w2.0.qweight
        ├── layers.39.feed_forward.w2.0.scales_zeros
        ├── layers.39.ffn_norm.weight
        ├── layers.4.attention.w_qkv.0.qweight
        ├── layers.4.attention.w_qkv.0.scales_zeros
        ├── layers.4.attention.wo.0.qweight
        ├── layers.4.attention.wo.0.scales_zeros
        ├── layers.4.attention_norm.weight
        ├── layers.4.feed_forward.w13.0.qweight
        ├── layers.4.feed_forward.w13.0.scales_zeros
        ├── layers.4.feed_forward.w2.0.qweight
        ├── layers.4.feed_forward.w2.0.scales_zeros
        ├── layers.4.ffn_norm.weight
        ├── layers.40.attention.w_qkv.0.qweight
        ├── layers.40.attention.w_qkv.0.scales_zeros
        ├── layers.40.attention.wo.0.qweight
        ├── layers.40.attention.wo.0.scales_zeros
        ├── layers.40.attention_norm.weight
        ├── layers.40.feed_forward.w13.0.qweight
        ├── layers.40.feed_forward.w13.0.scales_zeros
        ├── layers.40.feed_forward.w2.0.qweight
        ├── layers.40.feed_forward.w2.0.scales_zeros
        ├── layers.40.ffn_norm.weight
        ├── layers.41.attention.w_qkv.0.qweight
        ├── layers.41.attention.w_qkv.0.scales_zeros
        ├── layers.41.attention.wo.0.qweight
        ├── layers.41.attention.wo.0.scales_zeros
        ├── layers.41.attention_norm.weight
        ├── layers.41.feed_forward.w13.0.qweight
        ├── layers.41.feed_forward.w13.0.scales_zeros
        ├── layers.41.feed_forward.w2.0.qweight
        ├── layers.41.feed_forward.w2.0.scales_zeros
        ├── layers.41.ffn_norm.weight
        ├── layers.42.attention.w_qkv.0.qweight
        ├── layers.42.attention.w_qkv.0.scales_zeros
        ├── layers.42.attention.wo.0.qweight
        ├── layers.42.attention.wo.0.scales_zeros
        ├── layers.42.attention_norm.weight
        ├── layers.42.feed_forward.w13.0.qweight
        ├── layers.42.feed_forward.w13.0.scales_zeros
        ├── layers.42.feed_forward.w2.0.qweight
        ├── layers.42.feed_forward.w2.0.scales_zeros
        ├── layers.42.ffn_norm.weight
        ├── layers.43.attention.w_qkv.0.qweight
        ├── layers.43.attention.w_qkv.0.scales_zeros
        ├── layers.43.attention.wo.0.qweight
        ├── layers.43.attention.wo.0.scales_zeros
        ├── layers.43.attention_norm.weight
        ├── layers.43.feed_forward.w13.0.qweight
        ├── layers.43.feed_forward.w13.0.scales_zeros
        ├── layers.43.feed_forward.w2.0.qweight
        ├── layers.43.feed_forward.w2.0.scales_zeros
        ├── layers.43.ffn_norm.weight
        ├── layers.44.attention.w_qkv.0.qweight
        ├── layers.44.attention.w_qkv.0.scales_zeros
        ├── layers.44.attention.wo.0.qweight
        ├── layers.44.attention.wo.0.scales_zeros
        ├── layers.44.attention_norm.weight
        ├── layers.44.feed_forward.w13.0.qweight
        ├── layers.44.feed_forward.w13.0.scales_zeros
        ├── layers.44.feed_forward.w2.0.qweight
        ├── layers.44.feed_forward.w2.0.scales_zeros
        ├── layers.44.ffn_norm.weight
        ├── layers.45.attention.w_qkv.0.qweight
        ├── layers.45.attention.w_qkv.0.scales_zeros
        ├── layers.45.attention.wo.0.qweight
        ├── layers.45.attention.wo.0.scales_zeros
        ├── layers.45.attention_norm.weight
        ├── layers.45.feed_forward.w13.0.qweight
        ├── layers.45.feed_forward.w13.0.scales_zeros
        ├── layers.45.feed_forward.w2.0.qweight
        ├── layers.45.feed_forward.w2.0.scales_zeros
        ├── layers.45.ffn_norm.weight
        ├── layers.46.attention.w_qkv.0.qweight
        ├── layers.46.attention.w_qkv.0.scales_zeros
        ├── layers.46.attention.wo.0.qweight
        ├── layers.46.attention.wo.0.scales_zeros
        ├── layers.46.attention_norm.weight
        ├── layers.46.feed_forward.w13.0.qweight
        ├── layers.46.feed_forward.w13.0.scales_zeros
        ├── layers.46.feed_forward.w2.0.qweight
        ├── layers.46.feed_forward.w2.0.scales_zeros
        ├── layers.46.ffn_norm.weight
        ├── layers.47.attention.w_qkv.0.qweight
        ├── layers.47.attention.w_qkv.0.scales_zeros
        ├── layers.47.attention.wo.0.qweight
        ├── layers.47.attention.wo.0.scales_zeros
        ├── layers.47.attention_norm.weight
        ├── layers.47.feed_forward.w13.0.qweight
        ├── layers.47.feed_forward.w13.0.scales_zeros
        ├── layers.47.feed_forward.w2.0.qweight
        ├── layers.47.feed_forward.w2.0.scales_zeros
        ├── layers.47.ffn_norm.weight
        ├── layers.48.attention.w_qkv.0.qweight
        ├── layers.48.attention.w_qkv.0.scales_zeros
        ├── layers.48.attention.wo.0.qweight
        ├── layers.48.attention.wo.0.scales_zeros
        ├── layers.48.attention_norm.weight
        ├── layers.48.feed_forward.w13.0.qweight
        ├── layers.48.feed_forward.w13.0.scales_zeros
        ├── layers.48.feed_forward.w2.0.qweight
        ├── layers.48.feed_forward.w2.0.scales_zeros
        ├── layers.48.ffn_norm.weight
        ├── layers.49.attention.w_qkv.0.qweight
        ├── layers.49.attention.w_qkv.0.scales_zeros
        ├── layers.49.attention.wo.0.qweight
        ├── layers.49.attention.wo.0.scales_zeros
        ├── layers.49.attention_norm.weight
        ├── layers.49.feed_forward.w13.0.qweight
        ├── layers.49.feed_forward.w13.0.scales_zeros
        ├── layers.49.feed_forward.w2.0.qweight
        ├── layers.49.feed_forward.w2.0.scales_zeros
        ├── layers.49.ffn_norm.weight
        ├── layers.5.attention.w_qkv.0.qweight
        ├── layers.5.attention.w_qkv.0.scales_zeros
        ├── layers.5.attention.wo.0.qweight
        ├── layers.5.attention.wo.0.scales_zeros
        ├── layers.5.attention_norm.weight
        ├── layers.5.feed_forward.w13.0.qweight
        ├── layers.5.feed_forward.w13.0.scales_zeros
        ├── layers.5.feed_forward.w2.0.qweight
        ├── layers.5.feed_forward.w2.0.scales_zeros
        ├── layers.5.ffn_norm.weight
        ├── layers.50.attention.w_qkv.0.qweight
        ├── layers.50.attention.w_qkv.0.scales_zeros
        ├── layers.50.attention.wo.0.qweight
        ├── layers.50.attention.wo.0.scales_zeros
        ├── layers.50.attention_norm.weight
        ├── layers.50.feed_forward.w13.0.qweight
        ├── layers.50.feed_forward.w13.0.scales_zeros
        ├── layers.50.feed_forward.w2.0.qweight
        ├── layers.50.feed_forward.w2.0.scales_zeros
        ├── layers.50.ffn_norm.weight
        ├── layers.51.attention.w_qkv.0.qweight
        ├── layers.51.attention.w_qkv.0.scales_zeros
        ├── layers.51.attention.wo.0.qweight
        ├── layers.51.attention.wo.0.scales_zeros
        ├── layers.51.attention_norm.weight
        ├── layers.51.feed_forward.w13.0.qweight
        ├── layers.51.feed_forward.w13.0.scales_zeros
        ├── layers.51.feed_forward.w2.0.qweight
        ├── layers.51.feed_forward.w2.0.scales_zeros
        ├── layers.51.ffn_norm.weight
        ├── layers.52.attention.w_qkv.0.qweight
        ├── layers.52.attention.w_qkv.0.scales_zeros
        ├── layers.52.attention.wo.0.qweight
        ├── layers.52.attention.wo.0.scales_zeros
        ├── layers.52.attention_norm.weight
        ├── layers.52.feed_forward.w13.0.qweight
        ├── layers.52.feed_forward.w13.0.scales_zeros
        ├── layers.52.feed_forward.w2.0.qweight
        ├── layers.52.feed_forward.w2.0.scales_zeros
        ├── layers.52.ffn_norm.weight
        ├── layers.53.attention.w_qkv.0.qweight
        ├── layers.53.attention.w_qkv.0.scales_zeros
        ├── layers.53.attention.wo.0.qweight
        ├── layers.53.attention.wo.0.scales_zeros
        ├── layers.53.attention_norm.weight
        ├── layers.53.feed_forward.w13.0.qweight
        ├── layers.53.feed_forward.w13.0.scales_zeros
        ├── layers.53.feed_forward.w2.0.qweight
        ├── layers.53.feed_forward.w2.0.scales_zeros
        ├── layers.53.ffn_norm.weight
        ├── layers.54.attention.w_qkv.0.qweight
        ├── layers.54.attention.w_qkv.0.scales_zeros
        ├── layers.54.attention.wo.0.qweight
        ├── layers.54.attention.wo.0.scales_zeros
        ├── layers.54.attention_norm.weight
        ├── layers.54.feed_forward.w13.0.qweight
        ├── layers.54.feed_forward.w13.0.scales_zeros
        ├── layers.54.feed_forward.w2.0.qweight
        ├── layers.54.feed_forward.w2.0.scales_zeros
        ├── layers.54.ffn_norm.weight
        ├── layers.55.attention.w_qkv.0.qweight
        ├── layers.55.attention.w_qkv.0.scales_zeros
        ├── layers.55.attention.wo.0.qweight
        ├── layers.55.attention.wo.0.scales_zeros
        ├── layers.55.attention_norm.weight
        ├── layers.55.feed_forward.w13.0.qweight
        ├── layers.55.feed_forward.w13.0.scales_zeros
        ├── layers.55.feed_forward.w2.0.qweight
        ├── layers.55.feed_forward.w2.0.scales_zeros
        ├── layers.55.ffn_norm.weight
        ├── layers.56.attention.w_qkv.0.qweight
        ├── layers.56.attention.w_qkv.0.scales_zeros
        ├── layers.56.attention.wo.0.qweight
        ├── layers.56.attention.wo.0.scales_zeros
        ├── layers.56.attention_norm.weight
        ├── layers.56.feed_forward.w13.0.qweight
        ├── layers.56.feed_forward.w13.0.scales_zeros
        ├── layers.56.feed_forward.w2.0.qweight
        ├── layers.56.feed_forward.w2.0.scales_zeros
        ├── layers.56.ffn_norm.weight
        ├── layers.57.attention.w_qkv.0.qweight
        ├── layers.57.attention.w_qkv.0.scales_zeros
        ├── layers.57.attention.wo.0.qweight
        ├── layers.57.attention.wo.0.scales_zeros
        ├── layers.57.attention_norm.weight
        ├── layers.57.feed_forward.w13.0.qweight
        ├── layers.57.feed_forward.w13.0.scales_zeros
        ├── layers.57.feed_forward.w2.0.qweight
        ├── layers.57.feed_forward.w2.0.scales_zeros
        ├── layers.57.ffn_norm.weight
        ├── layers.58.attention.w_qkv.0.qweight
        ├── layers.58.attention.w_qkv.0.scales_zeros
        ├── layers.58.attention.wo.0.qweight
        ├── layers.58.attention.wo.0.scales_zeros
        ├── layers.58.attention_norm.weight
        ├── layers.58.feed_forward.w13.0.qweight
        ├── layers.58.feed_forward.w13.0.scales_zeros
        ├── layers.58.feed_forward.w2.0.qweight
        ├── layers.58.feed_forward.w2.0.scales_zeros
        ├── layers.58.ffn_norm.weight
        ├── layers.59.attention.w_qkv.0.qweight
        ├── layers.59.attention.w_qkv.0.scales_zeros
        ├── layers.59.attention.wo.0.qweight
        ├── layers.59.attention.wo.0.scales_zeros
        ├── layers.59.attention_norm.weight
        ├── layers.59.feed_forward.w13.0.qweight
        ├── layers.59.feed_forward.w13.0.scales_zeros
        ├── layers.59.feed_forward.w2.0.qweight
        ├── layers.59.feed_forward.w2.0.scales_zeros
        ├── layers.59.ffn_norm.weight
        ├── layers.6.attention.w_qkv.0.qweight
        ├── layers.6.attention.w_qkv.0.scales_zeros
        ├── layers.6.attention.wo.0.qweight
        ├── layers.6.attention.wo.0.scales_zeros
        ├── layers.6.attention_norm.weight
        ├── layers.6.feed_forward.w13.0.qweight
        ├── layers.6.feed_forward.w13.0.scales_zeros
        ├── layers.6.feed_forward.w2.0.qweight
        ├── layers.6.feed_forward.w2.0.scales_zeros
        ├── layers.6.ffn_norm.weight
        ├── layers.7.attention.w_qkv.0.qweight
        ├── layers.7.attention.w_qkv.0.scales_zeros
        ├── layers.7.attention.wo.0.qweight
        ├── layers.7.attention.wo.0.scales_zeros
        ├── layers.7.attention_norm.weight
        ├── layers.7.feed_forward.w13.0.qweight
        ├── layers.7.feed_forward.w13.0.scales_zeros
        ├── layers.7.feed_forward.w2.0.qweight
        ├── layers.7.feed_forward.w2.0.scales_zeros
        ├── layers.7.ffn_norm.weight
        ├── layers.8.attention.w_qkv.0.qweight
        ├── layers.8.attention.w_qkv.0.scales_zeros
        ├── layers.8.attention.wo.0.qweight
        ├── layers.8.attention.wo.0.scales_zeros
        ├── layers.8.attention_norm.weight
        ├── layers.8.feed_forward.w13.0.qweight
        ├── layers.8.feed_forward.w13.0.scales_zeros
        ├── layers.8.feed_forward.w2.0.qweight
        ├── layers.8.feed_forward.w2.0.scales_zeros
        ├── layers.8.ffn_norm.weight
        ├── layers.9.attention.w_qkv.0.qweight
        ├── layers.9.attention.w_qkv.0.scales_zeros
        ├── layers.9.attention.wo.0.qweight
        ├── layers.9.attention.wo.0.scales_zeros
        ├── layers.9.attention_norm.weight
        ├── layers.9.feed_forward.w13.0.qweight
        ├── layers.9.feed_forward.w13.0.scales_zeros
        ├── layers.9.feed_forward.w2.0.qweight
        ├── layers.9.feed_forward.w2.0.scales_zeros
        ├── layers.9.ffn_norm.weight
        ├── norm.weight
        ├── output.weight
        └── tok_embeddings.weight
18 directories, 624 files
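A listing this long is easier to check with a script than by eye. The following is a minimal sketch, assuming the weight files follow the `layers.<index>.<suffix>` naming seen above; the helper names `group_by_layer` and `incomplete_layers` are hypothetical and not part of lmdeploy.

```python
import re
from collections import defaultdict

# Matches per-layer weight files such as "layers.12.attention.wo.0.qweight".
LAYER_FILE = re.compile(r"^layers\.(\d+)\.(.+)$")


def group_by_layer(filenames):
    """Map each layer index to the set of file suffixes found for it."""
    layers = defaultdict(set)
    for name in filenames:
        match = LAYER_FILE.match(name)
        if match:
            layers[int(match.group(1))].add(match.group(2))
    return dict(layers)


def incomplete_layers(filenames):
    """Return sorted layer indices that lack a suffix present in some other layer."""
    layers = group_by_layer(filenames)
    if not layers:
        return []
    # Every suffix seen in any layer is expected in all layers.
    expected = set().union(*layers.values())
    return sorted(i for i, files in layers.items() if files != expected)
```

Calling `incomplete_layers(os.listdir("workspace/triton_models/weights"))` (after `import os`) would apply the check to the converted model. An empty result suggests the conversion wrote a consistent set of files for every layer, which points away from a truncated conversion as the cause of the crash.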
LMDeploy's 4-bit inference is only supported on GPUs with sm>=80, such as A10, A100, and GeForce 30/40 series.
I ran into the same problem as dlutsniper. My GPU is a 4090 and CUDA is 11.8.
LMDeploy doesn't support 4-bit inference on the Tesla V100 32GB, whose compute capability is sm70.
As presented in https://github.com/InternLM/lmdeploy/blob/main/docs/en/w4a16.md:
"LMDeploy supports LLM model inference of 4-bit weight, with the minimum requirement for NVIDIA graphics cards being sm80, such as A10, A100, Geforce 30/40 series."
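The sm80 requirement quoted above can be checked before running the chat command. This is a minimal sketch, assuming the compute capability is available as a `(major, minor)` tuple, for example from PyTorch's `torch.cuda.get_device_capability()`. The helper name `supports_w4a16` and the small lookup table are illustrative only and are not part of LMDeploy.

```python
# Minimum compute capability for 4-bit (W4A16) inference, per the quoted docs (sm80).
MIN_CAPABILITY = (8, 0)

# Compute capabilities of the GPUs mentioned in this thread (illustrative table).
KNOWN_GPUS = {
    "Tesla V100": (7, 0),
    "A100": (8, 0),
    "A10": (8, 6),
    "GeForce RTX 4090": (8, 9),
}


def supports_w4a16(capability):
    """Return True if a (major, minor) compute capability meets the sm80 minimum."""
    return tuple(capability) >= MIN_CAPABILITY


if __name__ == "__main__":
    for name, cap in KNOWN_GPUS.items():
        status = "ok" if supports_w4a16(cap) else "not supported"
        print(f"{name}: sm{cap[0]}{cap[1]} -> {status}")
```

On a machine with PyTorch and a CUDA device, `supports_w4a16(torch.cuda.get_device_capability())` would apply the same check to the installed card. Under this check the V100 (sm70) falls below the documented minimum, while a 4090 (sm89) meets it, so the 4090 report above is probably not explained by this requirement alone.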