# ramgpt-13b-awq-gemm Model Description

## Overview

This document details the "ramgpt-13b-awq-gemm" model, an implementation that leverages Activation-aware Weight Quantization (AWQ) for LLM compression and acceleration. The model is part of the ramgpt series and is designed for high efficiency in large-scale AI tasks.

## Model Specifications

### Core Technology

- **Architecture**: Based on the ramgpt-13b framework.
- **Quantization Technique**: Utilizes Activation-aware Weight Quantization (AWQ) for optimized matrix operations.
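The core AWQ idea referenced above — scaling salient input channels up before rounding so that channels with large activations lose less precision — can be sketched roughly as follows. This is a hypothetical NumPy illustration, not this model's actual quantization code or kernel; the function name, the `alpha` exponent, and the symmetric per-row quantizer are all illustrative assumptions.

```python
import numpy as np

def awq_quantize_sketch(W, act_scale, n_bits=4, alpha=0.5):
    """Illustrative activation-aware weight quantization (not the real kernel).

    W         : (out_features, in_features) weight matrix.
    act_scale : (in_features,) average activation magnitude per input channel.
    """
    # Per-input-channel scale: channels with larger activations get scaled up,
    # so rounding error on them (after unscaling) is proportionally smaller.
    s = np.clip(act_scale, 1e-5, None) ** alpha
    Ws = W * s                                   # emphasize salient channels
    qmax = 2 ** (n_bits - 1) - 1                 # 7 for 4-bit symmetric
    # Per-output-row step size for symmetric rounding.
    step = np.abs(Ws).max(axis=1, keepdims=True) / qmax
    Q = np.clip(np.round(Ws / step), -qmax, qmax).astype(np.int8)
    W_hat = (Q * step) / s                       # dequantized approximation
    return Q, step, s, W_hat
```

At dequantization time the channel scales are folded back out, so the stored integers stay in the 4-bit range while the reconstruction error on high-activation channels shrinks by the factor `s`.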
### Scale and Performance

- **Model Size**: 13 billion parameters, finely tuned for a balance between performance and computational efficiency.
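Rough back-of-envelope arithmetic shows why 4-bit quantization matters at this scale (assuming 4-bit weights against an fp16 baseline; real checkpoint sizes also include group scales, zero points, and unquantized layers, which this sketch ignores):

```python
# Approximate weight-only memory footprint of a 13B-parameter model.
params = 13e9

fp16_gb = params * 2 / 1e9    # 2 bytes per parameter at fp16
awq4_gb = params * 0.5 / 1e9  # 4 bits = 0.5 bytes per parameter

ratio = fp16_gb / awq4_gb     # 4x reduction in weight memory
```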