/Users/cfruan/miniconda3/envs/mlc-chat-venv/bin/python -m mlc_llm gen_config /Users/Shared/models/Meta-Llama-3.1-70B-Instruct --quantization q0f16 --conv-template llama-3_1 --output local_dir/Llama-3.1-70B-Instruct-q0f16-MLC
[2024-07-23 17:43:51] INFO auto_config.py:116: Found model configuration: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/config.json
[2024-07-23 17:43:51] INFO auto_config.py:154: Found model type: llama. Use `--model-type` to override.
[2024-07-23 17:43:51] INFO llama_model.py:62: context_window_size not found in config.json. Falling back to max_position_embeddings (131072)
[2024-07-23 17:43:51] INFO llama_model.py:82: prefill_chunk_size defaults to 2048
[2024-07-23 17:43:51] INFO config.py:107: Overriding max_batch_size from 1 to 80
[2024-07-23 17:43:51] INFO gen_config.py:144: [generation_config.json] Setting bos_token_id: 128000
[2024-07-23 17:43:51] INFO gen_config.py:144: [generation_config.json] Setting eos_token_id: [128001, 128008, 128009]
[2024-07-23 17:43:51] INFO gen_config.py:144: [generation_config.json] Setting temperature: 0.6
[2024-07-23 17:43:51] INFO gen_config.py:144: [generation_config.json] Setting top_p: 0.9
[2024-07-23 17:43:51] INFO gen_config.py:158: Not found tokenizer config: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/tokenizer.model
[2024-07-23 17:43:51] INFO gen_config.py:156: Found tokenizer config: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/tokenizer.json. Copying to local_dir/Llama-3.1-70B-Instruct-q0f16-MLC/tokenizer.json
[2024-07-23 17:43:51] INFO gen_config.py:158: Not found tokenizer config: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/vocab.json
[2024-07-23 17:43:51] INFO gen_config.py:158: Not found tokenizer config: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/merges.txt
[2024-07-23 17:43:51] INFO gen_config.py:158: Not found tokenizer config: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/added_tokens.json
[2024-07-23 17:43:51] INFO gen_config.py:156: Found tokenizer config: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/tokenizer_config.json. Copying to local_dir/Llama-3.1-70B-Instruct-q0f16-MLC/tokenizer_config.json
[2024-07-23 17:43:51] INFO gen_config.py:217: Detected tokenizer info: {'token_postproc_method': 'byte_level', 'prepend_space_in_encode': False, 'strip_space_in_decode': False}
[2024-07-23 17:43:51] INFO gen_config.py:32: [System default] Setting pad_token_id: 0
[2024-07-23 17:43:51] INFO gen_config.py:32: [System default] Setting presence_penalty: 0.0
[2024-07-23 17:43:51] INFO gen_config.py:32: [System default] Setting frequency_penalty: 0.0
[2024-07-23 17:43:51] INFO gen_config.py:32: [System default] Setting repetition_penalty: 1.0
[2024-07-23 17:43:51] INFO gen_config.py:245: Dumping configuration file to: local_dir/Llama-3.1-70B-Instruct-q0f16-MLC/mlc-chat-config.json
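For reference, the dumped mlc-chat-config.json can be sanity-checked with a few lines of Python. A minimal sketch (not produced by the run above), assuming the top-level field names mirror the values gen_config reports setting:

import json

with open("local_dir/Llama-3.1-70B-Instruct-q0f16-MLC/mlc-chat-config.json") as f:
    cfg = json.load(f)

# Values reported in the log: context window 131072, temperature 0.6,
# top_p 0.9 (field names assumed from the log messages).
for key in ("context_window_size", "temperature", "top_p"):
    print(key, "=", cfg.get(key))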
/Users/cfruan/miniconda3/envs/mlc-chat-venv/bin/python -m mlc_llm convert_weight /Users/Shared/models/Meta-Llama-3.1-70B-Instruct --quantization q0f16 --output local_dir/Llama-3.1-70B-Instruct-q0f16-MLC
[2024-07-23 17:43:52] INFO auto_config.py:116: Found model configuration: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/config.json
[2024-07-23 17:43:52] INFO auto_device.py:88: Not found device: cuda:0
[2024-07-23 17:43:53] INFO auto_device.py:88: Not found device: rocm:0
[2024-07-23 17:43:54] INFO auto_device.py:79: Found device: metal:0
[2024-07-23 17:43:55] INFO auto_device.py:88: Not found device: vulkan:0
[2024-07-23 17:43:55] INFO auto_device.py:88: Not found device: opencl:0
[2024-07-23 17:43:55] INFO auto_device.py:35: Using device: metal:0
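auto_device probes CUDA, ROCm, Metal, Vulkan, and OpenCL in that order and settles on the first device found, Metal here. A minimal sketch of the same probe, assuming a TVM runtime install (the runtime mlc_llm builds on):

import tvm

# Walk the same device list auto_device checks; Device.exist reports
# whether the runtime can actually reach the device.
for name, dev in [("cuda", tvm.cuda(0)), ("rocm", tvm.rocm(0)),
                  ("metal", tvm.metal(0)), ("vulkan", tvm.vulkan(0)),
                  ("opencl", tvm.opencl(0))]:
    print(f"{name}:0 ->", "found" if dev.exist else "not found")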
[2024-07-23 17:43:55] INFO auto_weight.py:71: Finding weights in: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct
[2024-07-23 17:43:55] INFO auto_weight.py:137: Not found Huggingface PyTorch
[2024-07-23 17:43:55] INFO auto_weight.py:144: Found source weight format: huggingface-safetensor. Source configuration: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model.safetensors.index.json
[2024-07-23 17:43:55] INFO auto_weight.py:107: Using source weight configuration: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model.safetensors.index.json. Use `--source` to override.
[2024-07-23 17:43:55] INFO auto_weight.py:111: Using source weight format: huggingface-safetensor. Use `--source-format` to override.
[2024-07-23 17:43:55] INFO auto_config.py:154: Found model type: llama. Use `--model-type` to override.
[2024-07-23 17:43:55] INFO llama_model.py:62: context_window_size not found in config.json. Falling back to max_position_embeddings (131072)
[2024-07-23 17:43:55] INFO llama_model.py:82: prefill_chunk_size defaults to 2048
Weight conversion with arguments:
--config /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/config.json
--quantization NoQuantize(name='q0f16', kind='no-quant', model_dtype='float16')
--model-type llama
--device metal:0
--source /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model.safetensors.index.json
--source-format huggingface-safetensor
--output local_dir/Llama-3.1-70B-Instruct-q0f16-MLC
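Before the conversion loop below starts, the shard layout named in --source can be enumerated from Python. A sketch assuming the standard Hugging Face safetensors index format, whose "weight_map" maps each tensor name to the shard file that stores it:

import json
from collections import Counter

index_path = "/Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model.safetensors.index.json"
with open(index_path) as f:
    index = json.load(f)

# Count how many tensors live in each of the 30 shards that the loader
# loads and unloads during conversion.
for shard, n_tensors in sorted(Counter(index["weight_map"].values()).items()):
    print(shard, n_tensors)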
Start storing to cache local_dir/Llama-3.1-70B-Instruct-q0f16-MLC
[2024-07-23 17:44:00] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00030-of-00030.safetensors
[2024-07-23 17:44:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "lm_head.weight", shape: (128256, 8192), dtype: float16
[2024-07-23 17:44:09] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00030-of-00030.safetensors
[2024-07-23 17:44:09] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00001-of-00030.safetensors
[2024-07-23 17:44:15] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.embed_tokens.weight", shape: (128256, 8192), dtype: float16
[2024-07-23 17:44:20] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.0.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:21] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.0.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:44:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.0.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:44:25] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.0.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:26] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.0.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:44:26] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.0.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:44:26] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00002-of-00030.safetensors
[2024-07-23 17:44:32] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.1.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:44:35] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.1.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:44:35] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.1.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:44:35] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.1.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:36] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.1.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:44:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.1.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.2.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:38] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.2.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:44:40] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.2.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:44:42] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.2.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.2.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:44:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.2.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:44:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.3.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.3.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:44:46] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.3.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:44:48] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.3.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.3.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:44:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.3.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:44:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.4.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:44:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.4.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:44:50] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00001-of-00030.safetensors
[2024-07-23 17:44:50] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00002-of-00030.safetensors
[2024-07-23 17:44:51] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00005-of-00030.safetensors
[2024-07-23 17:44:53] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.10.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:53] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.10.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:44:56] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.10.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:44:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.10.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.10.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:44:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.10.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:44:59] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.11.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:44:59] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.11.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:45:02] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.11.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:45:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.11.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.11.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:45:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.11.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:45:06] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.12.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:45:08] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.12.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:45:09] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.12.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:45:09] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00005-of-00030.safetensors
[2024-07-23 17:45:09] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00006-of-00030.safetensors
[2024-07-23 17:45:11] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.12.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:12] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.12.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:45:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.12.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.13.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.13.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:45:16] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.13.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:45:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.13.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.13.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:45:19] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.13.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:45:19] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.14.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:20] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.14.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:45:22] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.14.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:45:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.14.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.14.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:45:25] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.14.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:45:25] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00007-of-00030.safetensors
[2024-07-23 17:45:28] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.15.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:45:31] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.15.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:45:31] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.15.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:45:31] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.15.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:32] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.15.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:45:33] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.15.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:33] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.16.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:34] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.16.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:45:36] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.16.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:45:38] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.16.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:38] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.16.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:45:39] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.16.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:45:39] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.17.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:40] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.17.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:45:42] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.17.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:45:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.17.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.17.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:45:45] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.17.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:45:45] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.18.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:45:46] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.18.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:45:46] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00006-of-00030.safetensors
[2024-07-23 17:45:46] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00007-of-00030.safetensors
[2024-07-23 17:45:46] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00008-of-00030.safetensors
[2024-07-23 17:45:48] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.18.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.18.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:45:51] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.18.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:45:53] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.18.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:53] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.19.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:54] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.19.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:45:56] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.19.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:45:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.19.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:45:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.19.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:45:59] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.19.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:45:59] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.20.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:00] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.20.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:46:02] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.20.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:46:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.20.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:05] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.20.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:46:05] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.20.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:46:05] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.21.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:46:06] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00008-of-00030.safetensors
[2024-07-23 17:46:06] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00009-of-00030.safetensors
[2024-07-23 17:46:08] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.21.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:09] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.21.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:46:12] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.21.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:46:14] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.21.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:14] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.21.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:46:14] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.22.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:15] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.22.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:46:17] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.22.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:46:19] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.22.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:20] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.22.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:46:20] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.22.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:46:20] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.23.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:21] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.23.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:46:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.23.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:46:25] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.23.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:26] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.23.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:46:26] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.23.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:46:26] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00009-of-00030.safetensors
[2024-07-23 17:46:27] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00010-of-00030.safetensors
[2024-07-23 17:46:29] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.24.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:29] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.24.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:46:32] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.24.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:46:34] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.24.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:34] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.24.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:46:35] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.24.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:46:35] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.25.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:36] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.25.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:46:38] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.25.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:46:40] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.25.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:40] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.25.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:46:41] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.25.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:46:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.26.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:46:45] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.26.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:46:45] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.26.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:46:46] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00010-of-00030.safetensors
[2024-07-23 17:46:46] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00011-of-00030.safetensors
[2024-07-23 17:46:48] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.26.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.26.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:46:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.26.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.27.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.27.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:46:53] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.27.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:46:55] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.27.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:55] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.27.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:46:56] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.27.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:46:56] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.28.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:46:57] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.28.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:46:59] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.28.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:47:01] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.28.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:01] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.28.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:47:02] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.28.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:47:02] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00012-of-00030.safetensors
[2024-07-23 17:47:07] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.29.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:47:09] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.29.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:47:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.29.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:47:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.29.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:11] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.29.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:47:12] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.29.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:12] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.30.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:12] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.30.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:47:14] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.30.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:47:16] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.30.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:17] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.30.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:47:17] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.30.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:47:17] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.31.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.31.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:47:20] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.31.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:47:22] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.31.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:22] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.31.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:47:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.31.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:47:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.32.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:47:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.32.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:47:24] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00012-of-00030.safetensors
[2024-07-23 17:47:24] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00011-of-00030.safetensors
[2024-07-23 17:47:24] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00013-of-00030.safetensors
[2024-07-23 17:47:26] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.32.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:27] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.32.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:47:29] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.32.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:47:31] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.32.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:31] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.33.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:32] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.33.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:47:34] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.33.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:47:36] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.33.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:36] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.33.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:47:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.33.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:47:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.34.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.34.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:47:40] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.34.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:47:41] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.34.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:42] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.34.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:47:42] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.34.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:47:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.35.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:47:43] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00013-of-00030.safetensors
[2024-07-23 17:47:43] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00014-of-00030.safetensors
[2024-07-23 17:47:46] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.35.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:46] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.35.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:47:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.35.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:47:51] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.35.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:51] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.35.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:47:51] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.36.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:52] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.36.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:47:54] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.36.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:47:56] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.36.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:56] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.36.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:47:57] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.36.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:47:57] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.37.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:47:57] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.37.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:48:00] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.37.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:48:01] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.37.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:48:02] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.37.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:48:02] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.37.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:48:02] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00014-of-00030.safetensors
[2024-07-23 17:48:03] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00015-of-00030.safetensors
[2024-07-23 17:48:05] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.38.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:48:05] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.38.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:48:07] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.38.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:48:09] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.38.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:48:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.38.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:48:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.38.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:48:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.39.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:48:11] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.39.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:48:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.39.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 204/483 [04:12<03:45, 1.24it/s] 42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 205/483 [04:14<06:28, 1.40s/it] [2024-07-23 17:48:15] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.39.post_attention_layernorm.weight", shape: (8192,), dtype: float16
42%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 205/483 [04:14<06:28, 1.40s/it] 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 206/483 [04:14<04:53, 1.06s/it] [2024-07-23 17:48:15] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.39.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 206/483 [04:15<04:53, 1.06s/it] 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 207/483 [04:15<04:12, 1.09it/s] [2024-07-23 17:48:16] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.39.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 207/483 [04:15<04:12, 1.09it/s] 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 208/483 [04:15<03:31, 1.30it/s] [2024-07-23 17:48:17] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.40.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 208/483 [04:17<03:31, 1.30it/s] 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 209/483 [04:19<06:40, 1.46s/it] [2024-07-23 17:48:19] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.40.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 209/483 [04:19<06:40, 1.46s/it] 43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 210/483 [04:19<05:33, 1.22s/it] [2024-07-23 17:48:20] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.40.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
43%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 210/483 [04:19<05:33, 1.22s/it] 44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 211/483 [04:20<04:26, 1.02it/s] [2024-07-23 17:48:20] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00015-of-00030.safetensors
44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 211/483 [04:20<04:26, 1.02it/s] [2024-07-23 17:48:20] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00003-of-00030.safetensors
44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 211/483 [04:20<04:26, 1.02it/s] [2024-07-23 17:48:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.4.input_layernorm.weight", shape: (8192,), dtype: float16
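Note the jump from layer 39 back to layer 4 here: the converter walks parameters in its own traversal order, opens whichever shard holds the next tensor (the tensors for layers 4-7 live in model-00003-of-00030), and unloads fully consumed shards to bound peak memory; occasionally two shards stay resident at once when a layer's tensors straddle a shard boundary, as the overlapping Loading/Unloading lines for model-00021 and model-00022 show further down. A minimal sketch of this streaming pattern, assuming the standard sharded-checkpoint layout where model.safetensors.index.json maps each parameter to its shard (huggingface_loader.py's actual internals may differ):

    import json
    from safetensors import safe_open

    # Illustrative shard-by-shard streaming, not the actual mlc_llm code.
    def stream_parameters(model_dir, param_order):
        with open(f"{model_dir}/model.safetensors.index.json") as f:
            weight_map = json.load(f)["weight_map"]   # param name -> shard file
        current_shard, handle = None, None
        for name in param_order:
            shard = weight_map[name]
            if shard != current_shard:                # the Loading/Unloading lines
                handle = safe_open(f"{model_dir}/{shard}", framework="np")
                current_shard = shard                 # old handle is dropped here
            yield name, handle.get_tensor(name)       # convert, then move on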
44%|β–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 211/483 [04:23<04:26, 1.02it/s] 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 212/483 [04:23<07:11, 1.59s/it] [2024-07-23 17:48:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.4.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 212/483 [04:23<07:11, 1.59s/it] 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 213/483 [04:24<06:59, 1.55s/it] [2024-07-23 17:48:26] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.4.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 213/483 [04:25<06:59, 1.55s/it] 44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 214/483 [04:27<09:09, 2.04s/it] [2024-07-23 17:48:28] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.4.post_attention_layernorm.weight", shape: (8192,), dtype: float16
44%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 214/483 [04:27<09:09, 2.04s/it] 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 215/483 [04:27<06:32, 1.47s/it] [2024-07-23 17:48:28] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.5.input_layernorm.weight", shape: (8192,), dtype: float16
45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 215/483 [04:27<06:32, 1.47s/it] [2024-07-23 17:48:29] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.5.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 215/483 [04:28<06:32, 1.47s/it] 45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 217/483 [04:29<04:58, 1.12s/it] [2024-07-23 17:48:31] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.5.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
45%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 217/483 [04:30<04:58, 1.12s/it] 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 218/483 [04:32<07:12, 1.63s/it] [2024-07-23 17:48:33] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.5.post_attention_layernorm.weight", shape: (8192,), dtype: float16
45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 218/483 [04:32<07:12, 1.63s/it] 45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 219/483 [04:32<05:25, 1.23s/it] [2024-07-23 17:48:33] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.5.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
45%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 219/483 [04:32<05:25, 1.23s/it] 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 220/483 [04:33<04:33, 1.04s/it] [2024-07-23 17:48:33] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.5.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 220/483 [04:33<04:33, 1.04s/it] 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 221/483 [04:33<03:45, 1.16it/s] [2024-07-23 17:48:34] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.6.input_layernorm.weight", shape: (8192,), dtype: float16
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 221/483 [04:33<03:45, 1.16it/s] [2024-07-23 17:48:34] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.6.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 221/483 [04:34<03:45, 1.16it/s] 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 223/483 [04:34<03:27, 1.25it/s] [2024-07-23 17:48:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.6.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 223/483 [04:36<03:27, 1.25it/s] 46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 224/483 [04:38<05:55, 1.37s/it] [2024-07-23 17:48:38] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.6.post_attention_layernorm.weight", shape: (8192,), dtype: float16
46%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 224/483 [04:38<05:55, 1.37s/it] 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 225/483 [04:38<04:30, 1.05s/it] [2024-07-23 17:48:39] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.6.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 225/483 [04:38<04:30, 1.05s/it] 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 226/483 [04:38<03:53, 1.10it/s] [2024-07-23 17:48:39] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.6.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 226/483 [04:38<03:53, 1.10it/s] 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 227/483 [04:39<03:16, 1.30it/s] [2024-07-23 17:48:40] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.7.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 227/483 [04:39<03:16, 1.30it/s] 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 228/483 [04:39<02:58, 1.43it/s] [2024-07-23 17:48:40] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00003-of-00030.safetensors
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 228/483 [04:39<02:58, 1.43it/s] [2024-07-23 17:48:40] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00016-of-00030.safetensors
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 228/483 [04:39<02:58, 1.43it/s] [2024-07-23 17:48:42] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.40.input_layernorm.weight", shape: (8192,), dtype: float16
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 228/483 [04:42<02:58, 1.43it/s] 47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 229/483 [04:42<05:11, 1.23s/it] [2024-07-23 17:48:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.40.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
47%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 229/483 [04:42<05:11, 1.23s/it] 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 230/483 [04:43<05:28, 1.30s/it] [2024-07-23 17:48:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.40.post_attention_layernorm.weight", shape: (8192,), dtype: float16
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 230/483 [04:43<05:28, 1.30s/it] [2024-07-23 17:48:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.41.input_layernorm.weight", shape: (8192,), dtype: float16
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 230/483 [04:43<05:28, 1.30s/it] [2024-07-23 17:48:45] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.41.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 230/483 [04:44<05:28, 1.30s/it] 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 233/483 [04:45<03:32, 1.18it/s] [2024-07-23 17:48:47] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.41.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 233/483 [04:46<03:32, 1.18it/s] 48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 234/483 [04:48<05:33, 1.34s/it] [2024-07-23 17:48:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.41.post_attention_layernorm.weight", shape: (8192,), dtype: float16
48%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 234/483 [04:48<05:33, 1.34s/it] 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 235/483 [04:48<04:22, 1.06s/it] [2024-07-23 17:48:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.41.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š | 235/483 [04:48<04:22, 1.06s/it] 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 236/483 [04:49<03:49, 1.07it/s] [2024-07-23 17:48:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.41.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 236/483 [04:49<03:49, 1.07it/s] 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 237/483 [04:49<03:14, 1.27it/s] [2024-07-23 17:48:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.42.input_layernorm.weight", shape: (8192,), dtype: float16
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 237/483 [04:49<03:14, 1.27it/s] [2024-07-23 17:48:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.42.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 237/483 [04:50<03:14, 1.27it/s] 49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 239/483 [04:50<03:05, 1.31it/s] [2024-07-23 17:48:52] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.42.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
49%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 239/483 [04:52<03:05, 1.31it/s] 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 240/483 [04:54<05:24, 1.33s/it] [2024-07-23 17:48:54] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.42.post_attention_layernorm.weight", shape: (8192,), dtype: float16
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 240/483 [04:54<05:24, 1.33s/it] 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 241/483 [04:54<04:08, 1.03s/it] [2024-07-23 17:48:55] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.42.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 241/483 [04:54<04:08, 1.03s/it] 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 242/483 [04:54<03:35, 1.12it/s] [2024-07-23 17:48:55] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.42.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 242/483 [04:54<03:35, 1.12it/s] 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 243/483 [04:55<03:02, 1.32it/s] [2024-07-23 17:48:55] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00017-of-00030.safetensors
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 243/483 [04:55<03:02, 1.32it/s] [2024-07-23 17:49:00] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.43.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 243/483 [05:00<03:02, 1.32it/s] 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 244/483 [05:02<09:59, 2.51s/it] [2024-07-23 17:49:03] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.43.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 244/483 [05:02<09:59, 2.51s/it] 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 245/483 [05:02<07:49, 1.97s/it] [2024-07-23 17:49:03] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.43.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 245/483 [05:02<07:49, 1.97s/it] 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 246/483 [05:03<05:59, 1.52s/it] [2024-07-23 17:49:03] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.43.input_layernorm.weight", shape: (8192,), dtype: float16
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 246/483 [05:03<05:59, 1.52s/it] [2024-07-23 17:49:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.43.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 246/483 [05:03<05:59, 1.52s/it] 51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 248/483 [05:04<04:31, 1.15s/it] [2024-07-23 17:49:05] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.43.post_attention_layernorm.weight", shape: (8192,), dtype: float16
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 248/483 [05:04<04:31, 1.15s/it] [2024-07-23 17:49:05] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.44.input_layernorm.weight", shape: (8192,), dtype: float16
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 248/483 [05:04<04:31, 1.15s/it] [2024-07-23 17:49:05] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.44.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
51%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 248/483 [05:05<04:31, 1.15s/it] 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 251/483 [05:06<03:11, 1.21it/s] [2024-07-23 17:49:08] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.44.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 251/483 [05:07<03:11, 1.21it/s] 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 252/483 [05:09<04:53, 1.27s/it] [2024-07-23 17:49:09] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.44.post_attention_layernorm.weight", shape: (8192,), dtype: float16
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 252/483 [05:09<04:53, 1.27s/it] 52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 253/483 [05:09<03:55, 1.02s/it] [2024-07-23 17:49:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.44.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
52%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 253/483 [05:09<03:55, 1.02s/it] 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 254/483 [05:09<03:28, 1.10it/s] [2024-07-23 17:49:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.44.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 254/483 [05:10<03:28, 1.10it/s] 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 255/483 [05:10<02:58, 1.28it/s] [2024-07-23 17:49:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.45.input_layernorm.weight", shape: (8192,), dtype: float16
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 255/483 [05:10<02:58, 1.28it/s] [2024-07-23 17:49:11] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.45.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 255/483 [05:10<02:58, 1.28it/s] 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 257/483 [05:11<02:50, 1.33it/s] [2024-07-23 17:49:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.45.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 257/483 [05:13<02:50, 1.33it/s] 53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 258/483 [05:14<04:55, 1.31s/it] [2024-07-23 17:49:15] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.45.post_attention_layernorm.weight", shape: (8192,), dtype: float16
53%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 258/483 [05:14<04:55, 1.31s/it] 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 259/483 [05:14<03:47, 1.01s/it] [2024-07-23 17:49:15] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.45.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 259/483 [05:15<03:47, 1.01s/it] 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 260/483 [05:15<03:17, 1.13it/s] [2024-07-23 17:49:16] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.45.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 260/483 [05:15<03:17, 1.13it/s] 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 261/483 [05:15<02:47, 1.32it/s] [2024-07-23 17:49:16] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.46.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 261/483 [05:16<02:47, 1.32it/s] 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 262/483 [05:16<02:32, 1.45it/s] [2024-07-23 17:49:17] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.46.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 262/483 [05:16<02:32, 1.45it/s] 54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 263/483 [05:16<02:13, 1.64it/s] [2024-07-23 17:49:17] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00016-of-00030.safetensors
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 263/483 [05:16<02:13, 1.64it/s] [2024-07-23 17:49:17] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00017-of-00030.safetensors
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 263/483 [05:16<02:13, 1.64it/s] [2024-07-23 17:49:17] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00018-of-00030.safetensors
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 263/483 [05:17<02:13, 1.64it/s] [2024-07-23 17:49:20] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.46.input_layernorm.weight", shape: (8192,), dtype: float16
54%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 263/483 [05:19<02:13, 1.64it/s] 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 264/483 [05:19<04:12, 1.15s/it] [2024-07-23 17:49:20] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.46.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 264/483 [05:19<04:12, 1.15s/it] 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 265/483 [05:20<04:30, 1.24s/it] [2024-07-23 17:49:22] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.46.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 265/483 [05:22<04:30, 1.24s/it] 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 266/483 [05:23<06:34, 1.82s/it] [2024-07-23 17:49:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.46.post_attention_layernorm.weight", shape: (8192,), dtype: float16
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 266/483 [05:24<06:34, 1.82s/it] 55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 267/483 [05:24<04:42, 1.31s/it] [2024-07-23 17:49:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.47.input_layernorm.weight", shape: (8192,), dtype: float16
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 267/483 [05:24<04:42, 1.31s/it] [2024-07-23 17:49:25] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.47.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
55%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 267/483 [05:24<04:42, 1.31s/it] 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 269/483 [05:25<03:41, 1.04s/it] [2024-07-23 17:49:27] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.47.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 269/483 [05:26<03:41, 1.04s/it] 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 270/483 [05:28<05:33, 1.57s/it] [2024-07-23 17:49:29] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.47.post_attention_layernorm.weight", shape: (8192,), dtype: float16
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 270/483 [05:28<05:33, 1.57s/it] 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 271/483 [05:28<04:11, 1.19s/it] [2024-07-23 17:49:29] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.47.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 271/483 [05:29<04:11, 1.19s/it] 56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 272/483 [05:29<03:33, 1.01s/it] [2024-07-23 17:49:30] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.47.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
56%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 272/483 [05:29<03:33, 1.01s/it] 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 273/483 [05:29<02:57, 1.18it/s] [2024-07-23 17:49:30] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.48.input_layernorm.weight", shape: (8192,), dtype: float16
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 273/483 [05:29<02:57, 1.18it/s] [2024-07-23 17:49:31] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.48.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 273/483 [05:30<02:57, 1.18it/s] 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 275/483 [05:31<02:44, 1.26it/s] [2024-07-23 17:49:33] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.48.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 275/483 [05:32<02:44, 1.26it/s] 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 276/483 [05:34<04:44, 1.37s/it] [2024-07-23 17:49:35] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.48.post_attention_layernorm.weight", shape: (8192,), dtype: float16
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 276/483 [05:34<04:44, 1.37s/it] 57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 277/483 [05:34<03:35, 1.05s/it] [2024-07-23 17:49:35] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.48.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
57%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 277/483 [05:34<03:35, 1.05s/it] 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 278/483 [05:35<03:07, 1.10it/s] [2024-07-23 17:49:35] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.48.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 278/483 [05:35<03:07, 1.10it/s] 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 279/483 [05:35<02:38, 1.29it/s] [2024-07-23 17:49:36] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.49.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 279/483 [05:35<02:38, 1.29it/s] 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 280/483 [05:36<02:24, 1.41it/s] [2024-07-23 17:49:36] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00018-of-00030.safetensors
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 280/483 [05:36<02:24, 1.41it/s] [2024-07-23 17:49:36] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00019-of-00030.safetensors
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 280/483 [05:36<02:24, 1.41it/s] [2024-07-23 17:49:39] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.49.input_layernorm.weight", shape: (8192,), dtype: float16
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 280/483 [05:38<02:24, 1.41it/s] 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 281/483 [05:38<04:31, 1.34s/it] [2024-07-23 17:49:40] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.49.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 281/483 [05:39<04:31, 1.34s/it] 58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 282/483 [05:40<04:36, 1.38s/it] [2024-07-23 17:49:42] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.49.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
58%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 282/483 [05:41<04:36, 1.38s/it] 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 283/483 [05:43<06:23, 1.92s/it] [2024-07-23 17:49:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.49.post_attention_layernorm.weight", shape: (8192,), dtype: float16
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 283/483 [05:43<06:23, 1.92s/it] 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 284/483 [05:43<04:34, 1.38s/it] [2024-07-23 17:49:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.49.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 284/483 [05:43<04:34, 1.38s/it] 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 285/483 [05:44<03:35, 1.09s/it] [2024-07-23 17:49:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.50.input_layernorm.weight", shape: (8192,), dtype: float16
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 285/483 [05:44<03:35, 1.09s/it] [2024-07-23 17:49:45] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.50.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 285/483 [05:44<03:35, 1.09s/it] 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 287/483 [05:45<03:00, 1.09it/s] [2024-07-23 17:49:47] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.50.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 287/483 [05:46<03:00, 1.09it/s] 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 288/483 [05:48<04:50, 1.49s/it] [2024-07-23 17:49:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.50.post_attention_layernorm.weight", shape: (8192,), dtype: float16
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 288/483 [05:48<04:50, 1.49s/it] 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 289/483 [05:48<03:38, 1.13s/it] [2024-07-23 17:49:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.50.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 289/483 [05:49<03:38, 1.13s/it] 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 290/483 [05:49<03:06, 1.03it/s] [2024-07-23 17:49:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.50.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 290/483 [05:49<03:06, 1.03it/s] 60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 291/483 [05:49<02:34, 1.24it/s] [2024-07-23 17:49:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.51.input_layernorm.weight", shape: (8192,), dtype: float16
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 291/483 [05:49<02:34, 1.24it/s] [2024-07-23 17:49:51] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.51.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
60%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 291/483 [05:50<02:34, 1.24it/s] 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 293/483 [05:51<02:26, 1.30it/s] [2024-07-23 17:49:53] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.51.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 293/483 [05:52<02:26, 1.30it/s] 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 294/483 [05:54<04:14, 1.35s/it] [2024-07-23 17:49:55] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.51.post_attention_layernorm.weight", shape: (8192,), dtype: float16
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 294/483 [05:54<04:14, 1.35s/it] 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 295/483 [05:54<03:13, 1.03s/it] [2024-07-23 17:49:55] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.51.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 295/483 [05:54<03:13, 1.03s/it] 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 296/483 [05:55<02:47, 1.12it/s] [2024-07-23 17:49:55] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.51.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 296/483 [05:55<02:47, 1.12it/s] 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 297/483 [05:55<02:20, 1.32it/s] [2024-07-23 17:49:56] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00019-of-00030.safetensors
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 297/483 [05:55<02:20, 1.32it/s] [2024-07-23 17:49:56] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00020-of-00030.safetensors
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 297/483 [05:55<02:20, 1.32it/s] [2024-07-23 17:49:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.52.input_layernorm.weight", shape: (8192,), dtype: float16
61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 297/483 [05:57<02:20, 1.32it/s] 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 298/483 [05:57<03:49, 1.24s/it] [2024-07-23 17:49:59] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.52.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 298/483 [05:58<03:49, 1.24s/it] 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 299/483 [05:59<03:59, 1.30s/it] [2024-07-23 17:50:01] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.52.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 299/483 [06:00<03:59, 1.30s/it] 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 300/483 [06:02<05:38, 1.85s/it] [2024-07-23 17:50:03] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.52.post_attention_layernorm.weight", shape: (8192,), dtype: float16
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 300/483 [06:02<05:38, 1.85s/it] 62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 301/483 [06:02<04:03, 1.34s/it] [2024-07-23 17:50:03] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.52.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
62%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 301/483 [06:02<04:03, 1.34s/it] 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 302/483 [06:03<03:18, 1.10s/it] [2024-07-23 17:50:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.52.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 302/483 [06:03<03:18, 1.10s/it] 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 303/483 [06:03<02:40, 1.12it/s] [2024-07-23 17:50:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.53.input_layernorm.weight", shape: (8192,), dtype: float16
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 303/483 [06:03<02:40, 1.12it/s] [2024-07-23 17:50:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.53.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 303/483 [06:04<02:40, 1.12it/s] 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 305/483 [06:05<02:24, 1.23it/s] [2024-07-23 17:50:07] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.53.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 305/483 [06:06<02:24, 1.23it/s] 63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 306/483 [06:08<04:07, 1.40s/it] [2024-07-23 17:50:08] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.53.post_attention_layernorm.weight", shape: (8192,), dtype: float16
63%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 306/483 [06:08<04:07, 1.40s/it] 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 307/483 [06:08<03:06, 1.06s/it] [2024-07-23 17:50:09] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.53.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 307/483 [06:08<03:06, 1.06s/it] 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 308/483 [06:08<02:40, 1.09it/s] [2024-07-23 17:50:09] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.53.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 308/483 [06:09<02:40, 1.09it/s] 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 309/483 [06:09<02:14, 1.29it/s] [2024-07-23 17:50:11] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.54.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 309/483 [06:10<02:14, 1.29it/s] 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 310/483 [06:12<04:10, 1.45s/it] [2024-07-23 17:50:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.54.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 310/483 [06:12<04:10, 1.45s/it] 64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 311/483 [06:13<03:28, 1.21s/it] [2024-07-23 17:50:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.54.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
64%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 311/483 [06:13<03:28, 1.21s/it] 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 312/483 [06:13<02:46, 1.03it/s] [2024-07-23 17:50:14] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00020-of-00030.safetensors
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 312/483 [06:13<02:46, 1.03it/s] [2024-07-23 17:50:14] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00021-of-00030.safetensors
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 312/483 [06:13<02:46, 1.03it/s] [2024-07-23 17:50:16] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.54.input_layernorm.weight", shape: (8192,), dtype: float16
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 312/483 [06:16<02:46, 1.03it/s] 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 313/483 [06:16<04:06, 1.45s/it] [2024-07-23 17:50:17] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.54.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 313/483 [06:16<04:06, 1.45s/it] 65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 314/483 [06:17<04:05, 1.45s/it] [2024-07-23 17:50:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.54.post_attention_layernorm.weight", shape: (8192,), dtype: float16
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 314/483 [06:17<04:05, 1.45s/it] [2024-07-23 17:50:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.55.input_layernorm.weight", shape: (8192,), dtype: float16
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 314/483 [06:17<04:05, 1.45s/it] [2024-07-23 17:50:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.55.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
65%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 314/483 [06:18<04:05, 1.45s/it] 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 317/483 [06:18<02:31, 1.10it/s] [2024-07-23 17:50:20] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.55.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 317/483 [06:20<02:31, 1.10it/s] 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 318/483 [06:22<03:49, 1.39s/it] [2024-07-23 17:50:22] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.55.post_attention_layernorm.weight", shape: (8192,), dtype: float16
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 318/483 [06:22<03:49, 1.39s/it] 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 319/483 [06:22<02:59, 1.09s/it] [2024-07-23 17:50:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.55.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 319/483 [06:22<02:59, 1.09s/it] 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 320/483 [06:22<02:35, 1.05it/s] [2024-07-23 17:50:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.55.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 320/483 [06:22<02:35, 1.05it/s] 66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 321/483 [06:23<02:10, 1.24it/s] [2024-07-23 17:50:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.56.input_layernorm.weight", shape: (8192,), dtype: float16
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 321/483 [06:23<02:10, 1.24it/s] [2024-07-23 17:50:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.56.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
66%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 321/483 [06:23<02:10, 1.24it/s] 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 323/483 [06:24<02:03, 1.30it/s] [2024-07-23 17:50:26] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.56.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 323/483 [06:25<02:03, 1.30it/s] 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 324/483 [06:27<03:32, 1.34s/it] [2024-07-23 17:50:28] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.56.post_attention_layernorm.weight", shape: (8192,), dtype: float16
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 324/483 [06:27<03:32, 1.34s/it] 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 325/483 [06:27<02:42, 1.03s/it] [2024-07-23 17:50:28] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.56.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 325/483 [06:28<02:42, 1.03s/it] 67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 326/483 [06:28<02:20, 1.12it/s] [2024-07-23 17:50:29] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.56.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
67%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 326/483 [06:28<02:20, 1.12it/s] 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 327/483 [06:28<01:58, 1.32it/s] [2024-07-23 17:50:29] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00022-of-00030.safetensors
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 327/483 [06:28<01:58, 1.32it/s] [2024-07-23 17:50:34] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.57.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 327/483 [06:33<01:58, 1.32it/s] 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 328/483 [06:35<06:17, 2.44s/it] [2024-07-23 17:50:36] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.57.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 328/483 [06:35<06:17, 2.44s/it] 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 329/483 [06:36<04:55, 1.92s/it] [2024-07-23 17:50:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.57.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 329/483 [06:36<04:55, 1.92s/it] 68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 330/483 [06:36<03:46, 1.48s/it] [2024-07-23 17:50:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.57.input_layernorm.weight", shape: (8192,), dtype: float16
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 330/483 [06:36<03:46, 1.48s/it] [2024-07-23 17:50:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.57.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
68%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 330/483 [06:37<03:46, 1.48s/it] 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 332/483 [06:38<02:51, 1.13s/it] [2024-07-23 17:50:38] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.57.post_attention_layernorm.weight", shape: (8192,), dtype: float16
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 332/483 [06:38<02:51, 1.13s/it] [2024-07-23 17:50:38] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.58.input_layernorm.weight", shape: (8192,), dtype: float16
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 332/483 [06:38<02:51, 1.13s/it] [2024-07-23 17:50:39] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.58.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 332/483 [06:38<02:51, 1.13s/it] 69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 335/483 [06:39<02:00, 1.23it/s] [2024-07-23 17:50:41] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.58.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
69%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 335/483 [06:40<02:00, 1.23it/s] 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 336/483 [06:42<03:06, 1.27s/it] [2024-07-23 17:50:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.58.post_attention_layernorm.weight", shape: (8192,), dtype: float16
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 336/483 [06:42<03:06, 1.27s/it] 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 337/483 [06:42<02:28, 1.02s/it] [2024-07-23 17:50:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.58.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 337/483 [06:42<02:28, 1.02s/it] 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 338/483 [06:43<02:10, 1.11it/s] [2024-07-23 17:50:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.58.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 338/483 [06:43<02:10, 1.11it/s] 70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 339/483 [06:43<01:51, 1.29it/s] [2024-07-23 17:50:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.59.input_layernorm.weight", shape: (8192,), dtype: float16
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 339/483 [06:43<01:51, 1.29it/s] [2024-07-23 17:50:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.59.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
70%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 339/483 [06:44<01:51, 1.29it/s] 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 341/483 [06:45<01:46, 1.33it/s] [2024-07-23 17:50:47] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.59.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 341/483 [06:46<01:46, 1.33it/s] 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 342/483 [06:48<03:04, 1.31s/it] [2024-07-23 17:50:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.59.post_attention_layernorm.weight", shape: (8192,), dtype: float16
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 342/483 [06:48<03:04, 1.31s/it] 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 343/483 [06:48<02:21, 1.01s/it] [2024-07-23 17:50:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.59.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 343/483 [06:48<02:21, 1.01s/it] 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 344/483 [06:48<02:03, 1.13it/s] [2024-07-23 17:50:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.59.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 344/483 [06:49<02:03, 1.13it/s] 71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 345/483 [06:49<01:44, 1.33it/s] [2024-07-23 17:50:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.60.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
71%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 345/483 [06:49<01:44, 1.33it/s] 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 346/483 [06:49<01:34, 1.45it/s] [2024-07-23 17:50:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.60.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 346/483 [06:50<01:34, 1.45it/s] 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 347/483 [06:50<01:22, 1.65it/s] [2024-07-23 17:50:50] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00021-of-00030.safetensors
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 347/483 [06:50<01:22, 1.65it/s] [2024-07-23 17:50:51] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00022-of-00030.safetensors
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 347/483 [06:50<01:22, 1.65it/s] [2024-07-23 17:50:51] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00023-of-00030.safetensors
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 347/483 [06:50<01:22, 1.65it/s] [2024-07-23 17:50:53] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.60.input_layernorm.weight", shape: (8192,), dtype: float16
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 347/483 [06:52<01:22, 1.65it/s] 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 348/483 [06:52<02:41, 1.20s/it] [2024-07-23 17:50:54] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.60.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 348/483 [06:53<02:41, 1.20s/it] 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 349/483 [06:54<02:50, 1.27s/it] [2024-07-23 17:50:56] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.60.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 349/483 [06:55<02:50, 1.27s/it] 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 350/483 [06:57<04:04, 1.84s/it] [2024-07-23 17:50:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.60.post_attention_layernorm.weight", shape: (8192,), dtype: float16
72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 350/483 [06:57<04:04, 1.84s/it] 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 351/483 [06:57<02:54, 1.32s/it] [2024-07-23 17:50:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.61.input_layernorm.weight", shape: (8192,), dtype: float16
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 351/483 [06:57<02:54, 1.32s/it] [2024-07-23 17:50:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.61.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 351/483 [06:58<02:54, 1.32s/it] 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 353/483 [06:59<02:15, 1.04s/it] [2024-07-23 17:51:01] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.61.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 353/483 [07:00<02:15, 1.04s/it] 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 354/483 [07:02<03:22, 1.57s/it] [2024-07-23 17:51:02] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.61.post_attention_layernorm.weight", shape: (8192,), dtype: float16
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 354/483 [07:02<03:22, 1.57s/it] 73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 355/483 [07:02<02:32, 1.19s/it] [2024-07-23 17:51:03] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.61.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
73%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 355/483 [07:02<02:32, 1.19s/it] 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 356/483 [07:02<02:08, 1.01s/it] [2024-07-23 17:51:03] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.61.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 356/483 [07:03<02:08, 1.01s/it] 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 357/483 [07:03<01:46, 1.18it/s] [2024-07-23 17:51:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.62.input_layernorm.weight", shape: (8192,), dtype: float16
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 357/483 [07:03<01:46, 1.18it/s] [2024-07-23 17:51:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.62.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 357/483 [07:03<01:46, 1.18it/s] 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 359/483 [07:04<01:37, 1.27it/s] [2024-07-23 17:51:06] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.62.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 359/483 [07:06<01:37, 1.27it/s] 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 360/483 [07:07<02:48, 1.37s/it] [2024-07-23 17:51:08] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.62.post_attention_layernorm.weight", shape: (8192,), dtype: float16
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 360/483 [07:08<02:48, 1.37s/it] 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 361/483 [07:08<02:07, 1.04s/it] [2024-07-23 17:51:08] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.62.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 361/483 [07:08<02:07, 1.04s/it] 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 362/483 [07:08<01:49, 1.10it/s] [2024-07-23 17:51:09] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.62.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 362/483 [07:08<01:49, 1.10it/s] 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 363/483 [07:09<01:32, 1.29it/s] [2024-07-23 17:51:09] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.63.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 363/483 [07:09<01:32, 1.29it/s] 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 364/483 [07:09<01:24, 1.41it/s] [2024-07-23 17:51:10] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00023-of-00030.safetensors
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 364/483 [07:09<01:24, 1.41it/s] [2024-07-23 17:51:10] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00024-of-00030.safetensors
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 364/483 [07:09<01:24, 1.41it/s] [2024-07-23 17:51:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.63.input_layernorm.weight", shape: (8192,), dtype: float16
75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 364/483 [07:12<01:24, 1.41it/s] 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 365/483 [07:12<02:39, 1.35s/it] [2024-07-23 17:51:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.63.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 365/483 [07:13<02:39, 1.35s/it] 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 366/483 [07:13<02:41, 1.38s/it] [2024-07-23 17:51:15] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.63.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 366/483 [07:15<02:41, 1.38s/it] 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 367/483 [07:17<03:42, 1.92s/it] [2024-07-23 17:51:17] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.63.post_attention_layernorm.weight", shape: (8192,), dtype: float16
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 367/483 [07:17<03:42, 1.92s/it] 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 368/483 [07:17<02:38, 1.38s/it] [2024-07-23 17:51:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.63.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 368/483 [07:17<02:38, 1.38s/it] 76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 369/483 [07:17<02:04, 1.09s/it] [2024-07-23 17:51:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.64.input_layernorm.weight", shape: (8192,), dtype: float16
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 369/483 [07:17<02:04, 1.09s/it] [2024-07-23 17:51:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.64.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
76%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 369/483 [07:18<02:04, 1.09s/it] 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 371/483 [07:19<01:42, 1.09it/s] [2024-07-23 17:51:21] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.64.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 371/483 [07:20<01:42, 1.09it/s] 77%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 372/483 [07:22<02:44, 1.48s/it] [2024-07-23 17:51:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.64.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.64.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:51:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.64.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:51:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.65.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.65.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:51:26] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.65.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:51:28] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.65.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:28] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.65.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:51:29] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.65.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:51:29] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00024-of-00030.safetensors
[2024-07-23 17:51:29] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00025-of-00030.safetensors
[2024-07-23 17:51:31] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.66.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:32] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.66.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:51:34] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.66.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:51:36] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.66.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:36] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.66.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:51:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.66.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:51:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.67.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:38] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.67.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:51:40] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.67.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:51:42] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.67.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:42] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.67.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:51:42] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.67.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:51:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.68.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:51:46] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.68.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:51:47] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.68.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:51:47] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00025-of-00030.safetensors
[2024-07-23 17:51:47] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00026-of-00030.safetensors
[2024-07-23 17:51:49] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.68.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.68.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:51:51] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.68.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:51] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.69.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:51] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.69.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:51:54] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.69.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:51:56] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.69.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:56] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.69.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:51:56] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.69.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:51:57] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.70.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:51:57] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.70.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:51:59] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.70.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:52:01] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.70.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:02] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.70.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:52:02] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.70.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:52:02] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00027-of-00030.safetensors
[2024-07-23 17:52:08] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.71.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:52:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.71.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:52:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.71.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:52:10] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00026-of-00030.safetensors
[2024-07-23 17:52:11] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00027-of-00030.safetensors
[2024-07-23 17:52:11] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00004-of-00030.safetensors
[2024-07-23 17:52:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.7.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:14] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.7.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:52:16] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.7.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:52:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.7.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.7.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:52:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.8.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:19] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.8.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:52:21] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.8.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:52:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.8.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.8.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:52:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.8.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:52:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.9.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.9.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:52:27] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.9.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:52:28] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.9.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:29] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.9.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:52:29] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.9.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:52:29] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00004-of-00030.safetensors
[2024-07-23 17:52:30] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00027-of-00030.safetensors
[2024-07-23 17:52:31] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.71.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:31] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.71.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:52:32] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.71.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:32] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.72.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:33] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.72.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:52:35] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.72.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:52:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.72.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:37] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.72.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:52:38] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.72.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:52:38] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.73.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:39] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.73.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:52:41] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.73.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:52:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.73.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.73.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:52:43] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.73.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:52:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.74.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:52:44] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.74.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:52:45] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00027-of-00030.safetensors
[2024-07-23 17:52:45] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00028-of-00030.safetensors
[2024-07-23 17:52:47] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.74.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:48] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.74.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:52:50] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.74.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:52:52] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.74.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:52] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.75.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:53] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.75.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:52:55] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.75.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:52:57] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.75.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:57] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.75.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:52:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.75.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:52:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.76.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:52:58] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.76.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:53:01] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.76.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:53:03] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.76.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:53:03] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.76.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:53:03] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.76.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:53:04] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.77.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:53:04] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00028-of-00030.safetensors
[2024-07-23 17:53:04] INFO huggingface_loader.py:185: Loading HF parameters from: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00029-of-00030.safetensors
[2024-07-23 17:53:07] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.77.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:53:08] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.77.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:53:10] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.77.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:53:12] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.77.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:53:12] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.77.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:53:12] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.78.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:53:13] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.78.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:53:15] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.78.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:53:17] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.78.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:53:17] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.78.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:53:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.78.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:53:18] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.79.input_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:53:19] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.79.mlp.down_proj.weight", shape: (8192, 28672), dtype: float16
[2024-07-23 17:53:21] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.79.mlp.gate_up_proj.weight", shape: (57344, 8192), dtype: float16
[2024-07-23 17:53:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.79.post_attention_layernorm.weight", shape: (8192,), dtype: float16
[2024-07-23 17:53:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.79.self_attn.qkv_proj.weight", shape: (10240, 8192), dtype: float16
[2024-07-23 17:53:23] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.layers.79.self_attn.o_proj.weight", shape: (8192, 8192), dtype: float16
[2024-07-23 17:53:24] INFO huggingface_loader.py:175: [Not quantized] Parameter: "model.norm.weight", shape: (8192,), dtype: float16
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 483/483 [09:23<00:00, 1.17s/it]
[2024-07-23 17:53:24] INFO huggingface_loader.py:197: Unloading HF weight file: /Users/Shared/models/Meta-Llama-3.1-70B-Instruct/model-00029-of-00030.safetensors
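The alternating "Loading HF parameters from" / "Unloading HF weight file" lines above show the loader streaming one safetensors shard at a time: each shard is opened, its tensors are mapped and emitted, and the file is released before the next one is opened, which is why peak RAM (reported below) stays near a couple of shards' worth rather than the full amount read from disk. A minimal sketch of that pattern using the safetensors Python package; the names stream_shards and emit are illustrative, not MLC's internals:

from safetensors import safe_open

def stream_shards(shard_paths, emit):
    """Visit every tensor while holding at most one shard in memory."""
    for path in shard_paths:
        # Mirrors "Loading HF parameters from: <path>" in the log.
        with safe_open(path, framework="np") as shard:
            for name in shard.keys():
                emit(name, shard.get_tensor(name))
        # Leaving the with-block mirrors "Unloading HF weight file: <path>".

Here emit(name, tensor) would convert and write each parameter to the output cache before the next shard is opened.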
[2024-07-23 17:53:24] INFO stats.py:77: Time usage: HF loading: 82.243 sec; Pre-quantization mapping: 178.396 sec; Quantization: 0.000 sec
[2024-07-23 17:53:24] INFO stats.py:91: RAM usage: Peak RAM: 17.375 GB. Total bytes loaded from disk: 271.521 GB
[2024-07-23 17:53:24] INFO convert_weight.py:155: Parameter size after quantization: 131.417 GB
[2024-07-23 17:53:24] INFO convert_weight.py:160: Total parameters: 72,885,788,672
[2024-07-23 17:53:24] INFO convert_weight.py:161: Bits per parameter: 15.488
[2024-07-23 17:53:24] INFO convert_weight.py:166: Saved to directory: local_dir/Llama-3.1-70B-Instruct-q0f16-MLC
All finished, 323 total shards committed, record saved to local_dir/Llama-3.1-70B-Instruct-q0f16-MLC/ndarray-cache.json
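The three summary figures are internally consistent if the reported sizes are read as GiB (2^30 bytes): 131.417 GiB of float16 weights over 72,885,788,672 parameters gives exactly the logged 15.488 bits per parameter. A quick sanity check (not part of the conversion output, and assuming the GiB reading):

param_size_bytes = 131.417 * 2**30   # "Parameter size after quantization"
total_params = 72_885_788_672        # "Total parameters"
print(round(param_size_bytes * 8 / total_params, 3))  # -> 15.488 bits per parameter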