Commit 5057a02 by twhoool02
1 Parent(s): 35b5b0e

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +1 -26
README.md CHANGED
@@ -18,32 +18,7 @@ pipeline_tag: text-generation
 qunatized_by: twhoool02
 ---
 
-# Model Card for LlamaAWQForCausalLM(
-  (model): LlamaForCausalLM(
-    (model): LlamaLikeModel(
-      (embedding): Embedding(32000, 4096)
-      (blocks): ModuleList(
-        (0-31): 32 x LlamaLikeBlock(
-          (norm_1): FasterTransformerRMSNorm()
-          (attn): QuantAttentionFused(
-            (qkv_proj): WQLinear_GEMM(in_features=4096, out_features=12288, bias=False, w_bit=4, group_size=128)
-            (o_proj): WQLinear_GEMM(in_features=4096, out_features=4096, bias=False, w_bit=4, group_size=128)
-            (rope): RoPE()
-          )
-          (norm_2): FasterTransformerRMSNorm()
-          (mlp): LlamaMLP(
-            (gate_proj): WQLinear_GEMM(in_features=4096, out_features=11008, bias=False, w_bit=4, group_size=128)
-            (up_proj): WQLinear_GEMM(in_features=4096, out_features=11008, bias=False, w_bit=4, group_size=128)
-            (down_proj): WQLinear_GEMM(in_features=11008, out_features=4096, bias=False, w_bit=4, group_size=128)
-            (act_fn): SiLU()
-          )
-        )
-      )
-      (norm): LlamaRMSNorm()
-    )
-    (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
-  )
-)
+# Model Card for Llama-2-7b-hf-AWQ
 
 <!-- Provide a quick summary of what the model is/does. -->
 
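
Note on the removed heading: the module dump that this commit drops from the title lists every quantized layer shape, so the footprint of the 4-bit weights can be estimated from it. The sketch below is illustrative only. The shapes, block count, and `w_bit` are copied from the dump. The names `QUANTIZED_LINEARS`, `NUM_BLOCKS`, and `W_BIT` are hypothetical. It assumes the 4-bit values are packed densely and leaves out the group-wise scales and zero points (`group_size=128`), the embedding, the norms, and the unquantized `lm_head`.

```python
# Shapes copied from the WQLinear_GEMM entries in the removed module dump:
# (in_features, out_features) for a single LlamaLikeBlock.
QUANTIZED_LINEARS = {
    "qkv_proj": (4096, 12288),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 11008),
    "up_proj": (4096, 11008),
    "down_proj": (11008, 4096),
}
NUM_BLOCKS = 32  # "(0-31): 32 x LlamaLikeBlock"
W_BIT = 4        # w_bit=4 on every quantized layer

# Number of quantized weights in one block, then in all blocks.
weights_per_block = sum(i * o for i, o in QUANTIZED_LINEARS.values())
total_weights = NUM_BLOCKS * weights_per_block

# Assumption: dense 4-bit packing; scales and zero points are not counted.
packed_bytes = total_weights * W_BIT // 8

print(f"quantized weights per block: {weights_per_block:,}")
print(f"quantized weights in total:  {total_weights:,}")
print(f"packed size at {W_BIT} bits: {packed_bytes / 2**30:.3f} GiB")
```

Under these assumptions the quantized linear layers hold roughly 6.5 billion weights, which pack into about 3 GiB. The real checkpoint will be larger once the scales, zero points, and unquantized tensors are included.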