---
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
pipeline_tag: text-generation
metrics:
- accuracy
datasets:
- allenai/c4
---

# Model Description

Pruned from [`meta-llama/Meta-Llama-3-8B-Instruct`](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) using LLM-Pruner from [`LLM-Pruner: On the Structural Pruning of Large Language Models`](https://arxiv.org/abs/2305.11627).

This was done to test the viability of LLM-Pruner for task-agnostic, low-resource generative AI for commercial and personal use, compared to using out-of-the-box models such as [`meta-llama/Llama-3.2-3B-Instruct`](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct).

Our presentation slides may be found [here](https://drive.google.com/file/d/1SUGGgOAq-mizqwM_KveBQ2pWdyglPVdM/view?usp=sharing).

# To Replicate

1. First, clone the [official implementation](https://github.com/horseee/LLM-Pruner) and run the following to obtain the pruned model:

```
python llama3.py --pruning_ratio 0.25 \
      --device cuda --eval_device cuda \
      --base_model meta-llama/Meta-Llama-3-8B-Instruct \
      --block_wise --block_mlp_layer_start 4 --block_mlp_layer_end 30 \
      --block_attention_layer_start 4 --block_attention_layer_end 30 \
      --save_ckpt_log_name llama3_prune \
      --pruner_type taylor --taylor param_first \
      --max_seq_len 512 \
      --test_after_train --test_before_train --save_model
```

**NOTE**:
- We removed `'ptb'` from the datasets in `llama3.py`, since it requires foreign code to load.
- We changed `get_examples` in `llama3.py` to use `'c4'`, since bookcorpus also requires foreign code to load (a sketch of one such replacement loader appears at the end of this card).

2. Then, to post-train, follow [section 2 of the official implementation](https://github.com/horseee/LLM-Pruner?tab=readme-ov-file#2-post-training-recover-stage).

# Benchmark Results

**Benchmark Evaluation**: Following the original paper's evaluation protocol, we perform zero-shot task classification on five common-sense reasoning datasets that do not require foreign code to load:

| Model                       | BoolQ | HellaSwag | ARC-e | ARC-c | OBQA  | Average Accuracy |
|-----------------------------|-------|-----------|-------|-------|-------|------------------|
| **Llama-3-6.6B-LLM-Pruned** | 70.86 | 67.64     | 73.82 | 44.28 | 37.60 | 58.84            |

# Usage

For usage, follow the official implementation's section [`Pruned Model with Post-Training`](https://github.com/horseee/LLM-Pruner?tab=readme-ov-file#2-post-training-recover-stage).
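
In practice, loading follows the pattern from the official README: LLM-Pruner pickles the whole pruned model together with its tokenizer (the pruned architecture no longer matches the stock Llama-3 config), so the checkpoint is restored with `torch.load` rather than `from_pretrained`. Below is a minimal sketch; the checkpoint path and prompt are placeholders, and it assumes you run it from inside the LLM-Pruner repo so the pickled classes resolve:

```
import torch

# Placeholder path -- point this at your own saved checkpoint.
# weights_only=False because the checkpoint pickles whole objects, not just tensors.
pruned_dict = torch.load(
    "prune_log/llama3_prune/pytorch_model.bin",
    map_location="cpu",
    weights_only=False,
)
tokenizer, model = pruned_dict["tokenizer"], pruned_dict["model"]

model = model.to("cuda").eval()

# Quick generation sanity check on the pruned model.
inputs = tokenizer("The capital of France is", return_tensors="pt").to("cuda")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```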
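
To reproduce the benchmark table above, the original LLM-Pruner evaluation is based on EleutherAI's lm-evaluation-harness. The following is a rough sketch of driving the harness from Python against the in-memory pruned model; the `HFLM` wrapper around a preloaded model instance and the task names (`boolq`, `hellaswag`, `arc_easy`, `arc_challenge`, `openbookqa`) follow the harness's 0.4.x conventions and may differ across versions:

```
import torch
import lm_eval
from lm_eval.models.huggingface import HFLM

# Reload the pruned checkpoint as in the usage sketch above (placeholder path).
pruned_dict = torch.load(
    "prune_log/llama3_prune/pytorch_model.bin", map_location="cpu", weights_only=False
)
tokenizer, model = pruned_dict["tokenizer"], pruned_dict["model"]

# Wrap the already-instantiated model so the harness can drive it directly.
lm = HFLM(pretrained=model.to("cuda"), tokenizer=tokenizer, batch_size=8)

# The five tasks from the table above, zero-shot as in the paper.
results = lm_eval.simple_evaluate(
    model=lm,
    tasks=["boolq", "hellaswag", "arc_easy", "arc_challenge", "openbookqa"],
    num_fewshot=0,
)
print(results["results"])
```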
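
Finally, for the `get_examples` change noted under the replication steps: below is a sketch of a replacement calibration loader that reads from `allenai/c4` via the `datasets` library. The shard filename and the sampling logic are illustrative assumptions, not the exact change we made:

```
import torch
from datasets import load_dataset

def get_c4_examples(tokenizer, n_samples=10, seq_len=128):
    # Pull a single English C4 shard; unlike bookcorpus/ptb,
    # this needs no foreign dataset-loading code.
    data = load_dataset(
        "allenai/c4",
        data_files={"train": "en/c4-train.00000-of-01024.json.gz"},
        split="train",
    )
    samples = []
    i = 0
    while len(samples) < n_samples:
        ids = tokenizer(data[i]["text"], return_tensors="pt").input_ids[0]
        i += 1
        if len(ids) >= seq_len:           # keep only documents long enough
            samples.append(ids[:seq_len])
    return torch.stack(samples)           # (n_samples, seq_len) token ids
```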