Transformers.js 3.0 can't find files
Hi, sorry if I'm missing something. I'm moving this across from a different thread at the suggestion of @BoscoTheDog, but I'm not expecting anyone to go and read that thread. (Thanks for your help so far, Bosco, much appreciated!)
I'm trying to get a simple proof of concept up and running that I can build on, using Phi 3.5 and the CDN-hosted Transformers.js library, but I keep getting 404 errors and I don't fully understand what I'm doing wrong.
I'm importing the library for the pipeline like this:

```js
import { pipeline, env } from "https://cdn.jsdelivr.net/npm/@huggingface/[email protected]";
```
Then I'm loading the model like this:

```js
const languageModel = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web');
```
But I keep getting 404 errors:
```
[email protected]:217
GET https://huggingface.co/onnx-community/Phi-3.5-mini-instruct-onnx-web/resolve/main/onnx/model_quantized.onnx 404 (Not Found)
[email protected]:217 Uncaught Error: Could not locate file: "https://huggingface.co/onnx-community/Phi-3.5-mini-instruct-onnx-web/resolve/main/onnx/model_quantized.onnx".
    at [email protected]:217:5325
    at h ([email protected]:217:5348)
    at async [email protected]:175:15938
    at async [email protected]:175:13612
    at async Promise.all (index 0)
    at async P ([email protected]:175:13530)
    at async Promise.all (index 0)
    at async wr.from_pretrained ([email protected]:175:21979)
    at async Do.from_pretrained ([email protected]:175:57753)
    at async Promise.all (index 1)
```
That suggests to me that the model isn't supported by the library, but I can see from the listing that it is?
I've tried all of the following models; the only one that's worked is "tiny-random-PhiForCausalLM". I quickly saw that it's not what I'm looking for (I think it's essentially a lorem ipsum generator), but at least it shows me that my code can work if I can figure out where to request the model from.
- onnx-community/Phi-3.5-mini-instruct-onnx-web
- Xenova/Phi-3-mini-4k-instruct
- microsoft/Phi-3-mini-4k-instruct-onnx-web
- Xenova/tiny-random-PhiForCausalLM
- Xenova/phi-1_5_dev
- BricksDisplay/phi-1_5
- BricksDisplay/phi-1_5-q4
- BricksDisplay/phi-1_5-bnb4
- Xenova/Phi-3-mini-4k-instruct_fp16
- Xenova/tiny-random-LlavaForConditionalGeneration_phi
I feel like I'm spinning my wheels on this and any help to point me in the right direction would make a huge difference. Thanks in advance!
In this case, you need to specify the correct dtype and device (it's a model specially optimized for WebGPU). The following should work:
```js
import { pipeline, env } from "https://cdn.jsdelivr.net/npm/@huggingface/[email protected]";

const languageModel = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
  dtype: 'q4f16',
  device: 'webgpu',
});
```
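Once it loads, calling the pipeline looks something like this (the prompt and max_new_tokens value here are just illustrative):

```js
// Chat-style prompt for the instruct model (example values only)
const messages = [
  { role: 'user', content: 'Tell me a fun fact about llamas.' },
];

const output = await languageModel(messages, { max_new_tokens: 128 });

// The pipeline returns the whole conversation; the last message is the model's reply
console.log(output[0].generated_text.at(-1).content);
```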
We are in the process of adding default values to the config, which will mean you won't need to do this in future. Hope this helps!
Thanks very much for the response, @Xenova! Appreciate the pointer; it's helped me get further! As an aside, I've been so pleasantly surprised by how willing people are to help here :-)
I've got it loading, but I seem to be hitting this error now:
```
Error: Can't create a session. ERROR_CODE: 1, ERROR_MESSAGE: Deserialize tensor model.layers.4.attn.o_proj.MatMul.weight_Q4 failed.Failed to load external data file ""model_q4f16.onnx_data"", error: Module.MountedFiles is not available.
    at Ve ([email protected]:100:73169)
    at Zu ([email protected]:100:357172)
```
It appears alongside these warnings, but I don't think they prevent the model from running:
1. 2024-10-01 09:28:44.905599 [W:onnxruntime:, session_state.cc:1168 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2. 2024-10-01 09:28:44.906799 [W:onnxruntime:, session_state.cc:1170 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
In case it helps, below is my full code. I'm running on a MacBook Pro in Chrome (the exact same setup I used to try out the Phi demo). I initially tried running the code locally with VS Code Live Server, but in case that was the culprit I also uploaded it to my server and tried loading the page there, and got the same errors.
```js
const status = document.getElementById("status");
status.textContent = "Loading model...";

import { pipeline, env } from "https://cdn.jsdelivr.net/npm/@huggingface/[email protected]";

try {
  console.log("Started model loading");
  const languageModel = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
    dtype: 'q4f16',
    device: 'webgpu',
  });
  console.log("Finished model loading");
  status.textContent = "Ready";
  generateText(languageModel);
} catch (err) {
  console.log(err);
  status.textContent = "Error - failed to load";
}
```
Thanks again for the help!
Made a quick edit to the above - realised I was doing something silly with my error handling (eesh). I have updated with the actual error I'm getting!
Never seen that one before 0_0
Oh whoops - you also need to add `use_external_data_format: true` as an option:
```js
const languageModel = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
  dtype: 'q4f16',
  device: 'webgpu',
  use_external_data_format: true,
});
```
Hi there! Sorry for the delay in response - I wanted to double-check before I came back again. It looks like I'm getting the same kind of odd error.
I think I'm going to end up building what I planned in Angular anyway, for unrelated reasons, so I should be able to use the npm version of this library. That may behave differently, so I'll give it a crack.
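Roughly what I have in mind for the npm route (just a sketch, untested on my side; the options are the ones suggested above):

```js
// Installed with: npm install @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Same model and options as suggested earlier in the thread
const languageModel = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
  dtype: 'q4f16',
  device: 'webgpu',
  use_external_data_format: true,
});
```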
Thanks very much for the time and effort to help! Sorry I can't report more success with this method!
Hi
I'm running into a similar issue too.
Error
```
Error: Error: Can't create a session. ERROR_CODE: 1, ERROR_MESSAGE: Deserialize tensor model.layers.6.mlp.up_proj.MatMul.weight_Q4 failed.Failed to load external data file ""model_q4f16.onnx_data"", error: Module.MountedFiles is not available.
    at Ve ([email protected]:100:73359)
    at ed ([email protected]:100:365982)
```
Full Code
```html
<!DOCTYPE html>
<html>
<head>
  <title>Test Transformers.js</title>
  <script type="module">
    async function testSummarization() {
      try {
        // Load transformers.js
        const { env, AutoTokenizer, AutoModelForCausalLM, pipeline } = await import('https://cdn.jsdelivr.net/npm/@huggingface/[email protected]');
        console.log('Transformers.js loaded'); // Debugging statement

        env.allowLocalModels = false;

        // Load the summarization pipeline
        const summarizationPipeline = await pipeline('text-generation', 'onnx-community/Phi-3.5-mini-instruct-onnx-web', {
          dtype: 'q4f16',
          use_external_data_format: true,
        });
        console.log('Summarization pipeline loaded'); // Debugging statement

        // Run the summarization
        const text = 'The text you want to summarize';
        const result = await summarizationPipeline(text, { max_length: 130, min_length: 30, length_penalty: 2.0, num_beams: 4 });
        console.log('Summarization result:', result); // Debugging statement
        console.log(result[0].summary_text);
      } catch (error) {
        console.error('Error:', error);
      }
    }

    testSummarization();
  </script>
</head>
<body>
  <h1>Test Transformers.js</h1>
</body>
</html>
```
Please help