Stephen
smpanaro
AI & ML interests
Apple Neural Engine, Quantization
Organizations
Collections
6
- SqueezeLLM: Dense-and-Sparse Quantization
  Paper • 2306.07629 • Published • 4
- Norm Tweaking: High-performance Low-bit Quantization of Large Language Models
  Paper • 2309.02784 • Published • 1
- Extreme Compression of Large Language Models via Additive Quantization
  Paper • 2401.06118 • Published • 12
- BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
  Paper • 2402.04291 • Published • 48
Models
15
smpanaro/Llama-2-7b-coreml • Updated • 3
smpanaro/coreml-joint-compression-test • Updated
smpanaro/Llama-2-7b-NuGPTQ • Text Generation • Updated • 13 • 1
smpanaro/whisperkit-coreml • Updated • 1
smpanaro/pythia-6.9b-AutoGPTQ-4bit-128g • Text Generation • Updated • 8
smpanaro/pythia-70m-AutoGPTQ-4bit-128g • Text Generation • Updated • 10
smpanaro/pythia-2.8b-AutoGPTQ-4bit-128g • Text Generation • Updated • 9
smpanaro/pythia-1.4b-AutoGPTQ-4bit-128g • Text Generation • Updated • 7
smpanaro/pythia-1b-AutoGPTQ-4bit-128g • Text Generation • Updated • 9
smpanaro/pythia-410m-AutoGPTQ-4bit-128g • Text Generation • Updated • 4
Datasets
None public yet