# ramgpt-13b-awq-gemm Model Description

## Overview

This document details the "ramgpt-13b-awq-gemm" model, an implementation that leverages Activation-aware Weight Quantization (AWQ) for LLM compression and acceleration. The model is part of the ramgpt series and is designed for high efficiency in large-scale AI tasks.

## Model Specifications

### Core Technology

- **Architecture**: Based on the ramgpt-13b framework.
- **Quantization Technique**: Utilizes Activation-aware Weight Quantization (AWQ) for optimized matrix operations.
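The core AWQ idea referenced above — scaling salient input channels up before rounding so that channels with large activations lose less precision — can be sketched roughly as follows. This is a hypothetical NumPy illustration, not this model's actual quantization code or kernel; the function name, the `alpha` exponent, and the symmetric per-row quantizer are all illustrative assumptions.

```python
import numpy as np

def awq_quantize_sketch(W, act_scale, n_bits=4, alpha=0.5):
    """Illustrative activation-aware weight quantization (not the real kernel).

    W         : (out_features, in_features) weight matrix.
    act_scale : (in_features,) average activation magnitude per input channel.
    """
    # Per-input-channel scale: channels with larger activations get scaled up,
    # so rounding error on them (after unscaling) is proportionally smaller.
    s = np.clip(act_scale, 1e-5, None) ** alpha
    Ws = W * s                                   # emphasize salient channels
    qmax = 2 ** (n_bits - 1) - 1                 # 7 for 4-bit symmetric
    # Per-output-row step size for symmetric rounding.
    step = np.abs(Ws).max(axis=1, keepdims=True) / qmax
    Q = np.clip(np.round(Ws / step), -qmax, qmax).astype(np.int8)
    W_hat = (Q * step) / s                       # dequantized approximation
    return Q, step, s, W_hat
```

At dequantization time the channel scales are folded back out, so the stored integers stay in the 4-bit range while the reconstruction error on high-activation channels shrinks by the factor `s`.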
### Scale and Performance

- **Model Size**: 13 billion parameters, finely tuned for a balance between performance and computational efficiency.
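Rough back-of-envelope arithmetic shows why 4-bit quantization matters at this scale (assuming 4-bit weights against an fp16 baseline; real checkpoint sizes also include group scales, zero points, and unquantized layers, which this sketch ignores):

```python
# Approximate weight-only memory footprint of a 13B-parameter model.
params = 13e9

fp16_gb = params * 2 / 1e9    # 2 bytes per parameter at fp16
awq4_gb = params * 0.5 / 1e9  # 4 bits = 0.5 bytes per parameter

ratio = fp16_gb / awq4_gb     # 4x reduction in weight memory
```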