error message

#2
by ehartford - opened

I tried to run your exact YAML file on a different 70B model, and I got this:

  File "/Users/eric/git/mergekit/mergekit/options.py", line 58, in wrapper
    f(*args, **kwargs)
  File "/Users/eric/git/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
    run_merge(
  File "/Users/eric/git/mergekit/mergekit/merge.py", line 110, in run_merge
    exec.run(
  File "/Users/eric/git/mergekit/mergekit/graph.py", line 254, in run
    writer.save_tensor(ref.key, tensor, clone=clone_tensors)
  File "/Users/eric/git/mergekit/mergekit/io/tensor_writer.py", line 51, in save_tensor
    self.flush_current_shard()
  File "/Users/eric/git/mergekit/mergekit/io/tensor_writer.py", line 68, in flush_current_shard
    safetensors.torch.save_file(
  File "/Users/eric/miniconda3/envs/mergekit/lib/python3.11/site-packages/safetensors/torch.py", line 281, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
                   ^^^^^^^^^^^^^^^^^
  File "/Users/eric/miniconda3/envs/mergekit/lib/python3.11/site-packages/safetensors/torch.py", line 467, in _flatten
    raise RuntimeError(
RuntimeError:
            Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'model.layers.20.input_layernorm.weight', 'model.layers.10.input_layernorm.weight'}].
            A potential way to correctly save your model is to use `save_model`.
            More information at https://huggingface.co/docs/safetensors/torch_shared_tensors

Any idea what I did wrong?

$ cat ~/models/Dolphin-120b/mergekit_config.yml 
dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 20]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [10, 30]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [20, 40]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [30, 50]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [40, 60]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [50, 70]
    model: cognitivecomputations/dolphin-2.2-70b
- sources:
  - layer_range: [60, 80]
    model: cognitivecomputations/dolphin-2.2-70b
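The overlapping layer_range entries appear to be the trigger: slice [0, 20] emits output layer 10 from source layer 10, and slice [10, 30] emits output layer 20 from that same source layer 10, so the writer ends up holding one tensor object under two shard keys — exactly the pair named in the error. Here is a minimal stdlib sketch of that duplicate-identity check (illustrative only, not the actual safetensors internals); the `clone=clone_tensors` argument visible in the traceback also suggests that cloning tensors before saving is the intended escape hatch in mergekit itself:

```python
from collections import defaultdict

def find_shared(tensors):
    """Group tensor names by the identity of their backing object --
    roughly the aliasing check that makes safetensors raise."""
    by_storage = defaultdict(set)
    for name, tensor in tensors.items():
        by_storage[id(tensor)].add(name)
    return [names for names in by_storage.values() if len(names) > 1]

# Stand-in for the real weights: both output layers resolve to source
# layer 10 of dolphin-2.2-70b, so they are literally the same object.
layer_10_weight = [0.1] * 8
shard = {
    "model.layers.10.input_layernorm.weight": layer_10_weight,
    "model.layers.20.input_layernorm.weight": layer_10_weight,  # same object
    "model.layers.30.input_layernorm.weight": [0.1] * 8,        # distinct object: fine
}
print(find_shared(shard))
```

With distinct objects for every key, the list comes back empty and saving succeeds; with the reuse above, it reports the same {layers.10, layers.20} pair as the RuntimeError.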

That's strange; your YAML file looks fine. Are you running mergekit from the command line or from a Jupyter notebook?

Well, I got past it by creating two symlinks to the model directory, so mergekit thought it was pulling from three different models.
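For anyone hitting the same wall, that workaround can be sketched like this (the paths are hypothetical; the idea is just to give the one local copy three names so each slice loads its tensors through a different path and nothing ends up aliased):

```python
import os
import tempfile

# Hypothetical layout: one real model directory plus two symlinked
# aliases, so a merge config can reference "three" models while only
# one copy of the weights exists on disk.
base = tempfile.mkdtemp()
real = os.path.join(base, "dolphin-2.2-70b")
os.makedirs(real)

for alias in ("dolphin-2.2-70b-a", "dolphin-2.2-70b-b"):
    os.symlink(real, os.path.join(base, alias))

print(sorted(os.listdir(base)))  # the real directory plus the two aliases
```

The slices in the YAML would then alternate between the real path and the two aliased paths instead of repeating one model name.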

ehartford changed discussion status to closed
