error message #2
opened by ehartford
I tried to run your exact same yaml file on a different 70b model, and I got this:
File "/Users/eric/git/mergekit/mergekit/options.py", line 58, in wrapper
f(*args, **kwargs)
File "/Users/eric/git/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
run_merge(
File "/Users/eric/git/mergekit/mergekit/merge.py", line 110, in run_merge
exec.run(
File "/Users/eric/git/mergekit/mergekit/graph.py", line 254, in run
writer.save_tensor(ref.key, tensor, clone=clone_tensors)
File "/Users/eric/git/mergekit/mergekit/io/tensor_writer.py", line 51, in save_tensor
self.flush_current_shard()
File "/Users/eric/git/mergekit/mergekit/io/tensor_writer.py", line 68, in flush_current_shard
safetensors.torch.save_file(
File "/Users/eric/miniconda3/envs/mergekit/lib/python3.11/site-packages/safetensors/torch.py", line 281, in save_file
serialize_file(_flatten(tensors), filename, metadata=metadata)
^^^^^^^^^^^^^^^^^
File "/Users/eric/miniconda3/envs/mergekit/lib/python3.11/site-packages/safetensors/torch.py", line 467, in _flatten
raise RuntimeError(
RuntimeError:
Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'model.layers.20.input_layernorm.weight', 'model.layers.10.input_layernorm.weight'}].
A potential way to correctly save your model is to use `save_model`.
More information at https://huggingface.co/docs/safetensors/torch_shared_tensors
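safetensors refuses to serialize a dict in which two names point at the same underlying storage, because the data would be duplicated on disk and the sharing would be lost on reload. The real library inspects tensor storage pointers; the sketch below uses plain Python object identity as a stand-in for that check (`find_shared` is a hypothetical helper, not a safetensors API). Note also that the traceback shows `graph.py` passing `clone=clone_tensors` into `save_tensor`, which suggests mergekit can clone tensors before saving to sidestep exactly this collision.

```python
# Sketch: why saving fails when two output layers are backed by one tensor.
# safetensors checks shared storage; here id() stands in for that check.

def find_shared(tensors):
    """Group tensor names that refer to the same underlying object."""
    by_id = {}
    for name, t in tensors.items():
        by_id.setdefault(id(t), []).append(name)
    return [set(names) for names in by_id.values() if len(names) > 1]

shared = [1.0, 2.0, 3.0]  # stand-in for one source layer's weight tensor
tensors = {
    "model.layers.10.input_layernorm.weight": shared,
    "model.layers.20.input_layernorm.weight": shared,  # same object!
    "model.layers.30.input_layernorm.weight": [4.0, 5.0],
}
print(find_shared(tensors))  # one group of two colliding names
```

The printed group mirrors the pair named in the RuntimeError above: layers 10 and 20 sharing one `input_layernorm.weight`.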
Any idea what I did wrong?
$ cat ~/models/Dolphin-120b/mergekit_config.yml
dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 20]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [10, 30]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [20, 40]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [30, 50]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [40, 60]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [50, 70]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [60, 80]
    model: cognitivecomputations/dolphin-2.2-70b
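The overlapping `layer_range` values in this config are what produce the shared tensors. A small sketch of the output-layer to source-layer mapping (an assumption about how passthrough stacks slices, not mergekit's actual code) shows that output layers 10 and 20 both come from source layer 10, which is exactly the pair the RuntimeError names:

```python
# Map each output layer of the frankenmerge to its source layer.
# Slices taken from the config above: each spans 20 source layers.
slices = [(0, 20), (10, 30), (20, 40), (30, 50),
          (40, 60), (50, 70), (60, 80)]

out_to_src = {}
out = 0
for start, stop in slices:
    for src in range(start, stop):
        out_to_src[out] = src
        out += 1

# Invert: which output layers are fed by each source layer?
src_to_outs = {}
for o, s in out_to_src.items():
    src_to_outs.setdefault(s, []).append(o)

print(src_to_outs[10])  # → [10, 20]
```

Source layer 10 sits in both the `[0, 20]` slice (as output layer 10) and the `[10, 30]` slice (as output layer 20), so both output names end up referencing one loaded tensor.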
That's strange; your YAML file looks fine. Are you running mergekit from the command line or from a Jupyter notebook?
Well, I got past it by making two symlinks to the model directory, so mergekit thought it was pulling in three different models.
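The workaround can be sketched as follows (directory names are hypothetical stand-ins for the real local checkout): point two extra names at the same model directory, then alternate those paths in the `model:` fields of the YAML so no two adjacent slices appear to load from the same source.

```shell
# Sketch of the symlink workaround, in a throwaway directory.
tmp="$(mktemp -d)"
cd "$tmp"
mkdir dolphin-2.2-70b                     # stand-in for the real model dir
ln -s dolphin-2.2-70b dolphin-2.2-70b-a   # alias 1
ln -s dolphin-2.2-70b dolphin-2.2-70b-b   # alias 2
readlink dolphin-2.2-70b-a                # prints: dolphin-2.2-70b
```

With the aliases in place, the config's `model:` entries would cycle through `./dolphin-2.2-70b`, `./dolphin-2.2-70b-a`, and `./dolphin-2.2-70b-b`, so each slice's tensors get loaded separately instead of shared.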
ehartford changed discussion status to closed