LLM Inference Web Demo

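# GEMINI NANO EXTRACTED FROM CHROME CANARY VERSION 128.0.6557.0

If you have this version of Chrome Canary (>128) installed, you can find the model here on Windows:

`C:\Users\USERNAME\AppData\Local\Google\Chrome SxS\User Data\OptGuideOnDeviceModel\[version]\weights.bin`

You can sign up for the built-in AI program here: https://developer.chrome.com/docs/ai/built-in

The model seems to be in the TFLite model format, with the following architecture:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6579ab1a95e559712cc09eae/ijTSh0CXcYkp5AqMJkf0V.png)

## Run the model using MediaPipe:

Instructions from here: https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/llm_inference/js

In a directory, make two files:

`index.html` (any page works as long as it has elements with the ids `input`, `output`, and `submit`, and loads `index.js` as a module):

```html
<!doctype html>
<html lang="en">
<head>
    <title>LLM Inference Web Demo</title>
</head>
<body>
    Input:<br />
    <textarea id="input"></textarea><br />
    <input type="button" id="submit" value="Get Response" disabled><br />
    <br />
    Result:<br />
    <textarea id="output"></textarea>
    <script type="module" src="index.js"></script>
</body>
</html>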
```

`index.js` (set `modelFileName` on line 23 to the path to the .bin file):

```js
// Copyright 2024 The MediaPipe Authors.

// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at

//      http://www.apache.org/licenses/LICENSE-2.0

// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// ---------------------------------------------------------------------------------------- //

import {FilesetResolver, LlmInference} from 'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai';

const input = document.getElementById('input');
const output = document.getElementById('output');
const submit = document.getElementById('submit');

const modelFileName = 'weights.bin'; /* PATH TO MODEL .bin */

/**
 * Display newly generated partial results to the output text box.
 */
function displayPartialResults(partialResults, complete) {
  output.textContent += partialResults;

  if (complete) {
    if (!output.textContent) {
      output.textContent = 'Result is empty';
    }
    submit.disabled = false;
  }
}

/**
 * Main function to run LLM Inference.
 */
async function runDemo() {
  const genaiFileset = await FilesetResolver.forGenAiTasks(
      'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm');
  let llmInference;

  // Clear the previous answer and stream the new one into the output box.
  submit.onclick = () => {
    output.textContent = '';
    submit.disabled = true;
    llmInference.generateResponse(input.value, displayPartialResults);
  };

  submit.value = 'Loading the model...';
  LlmInference
      .createFromOptions(genaiFileset, {
        baseOptions: {modelAssetPath: modelFileName},
        // maxTokens: 512,  // The maximum number of tokens (input tokens + output
        //                  // tokens) the model handles.
        // randomSeed: 1,   // The random seed used during text generation.
        // topK: 1,  // The number of tokens the model considers at each step of
        //           // generation. Limits predictions to the top k most-probable
        //           // tokens. Setting randomSeed is required for this to take
        //           // effect.
        // temperature:
        //     1.0,  // The amount of randomness introduced during generation.
        //           // Setting randomSeed is required for this to take effect.
      })
      .then(llm => {
        llmInference = llm;
        submit.disabled = false;
        submit.value = 'Get Response';
      })
      .catch((error) => {
        // Log the underlying error so a failed load can be diagnosed.
        console.error(error);
        alert('Failed to initialize the task.');
      });
}

runDemo();
```

## Run using:

`python3 -m http.server 8000` in the same directory (or `python -m SimpleHTTPServer 8000` for older Python versions).

Finally, open `localhost:8000` in Chrome.

## License

I don't own shit. I just extracted the model weights, which are stored as a .bin file.

https://policies.google.com/terms/generative-ai/use-policy
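
## Optional: copy the model with Node.js

The demo expects `weights.bin` next to `index.html`, so the file has to be copied out of Chrome's profile folder first. The following is a minimal Node.js sketch of that step, not part of the MediaPipe sample: the helper names `findWeights` and `copyWeights` are made up for illustration, and it assumes the `OptGuideOnDeviceModel\[version]\weights.bin` layout quoted near the top still holds on your machine.

```javascript
const fs = require('fs');
const path = require('path');

/**
 * Hypothetical helper: returns every <baseDir>/<version>/weights.bin that
 * exists, sorted by path (so effectively by version folder name).
 */
function findWeights(baseDir) {
  if (!fs.existsSync(baseDir)) return [];
  return fs.readdirSync(baseDir, {withFileTypes: true})
      .filter(entry => entry.isDirectory())
      .map(entry => path.join(baseDir, entry.name, 'weights.bin'))
      .filter(candidate => fs.existsSync(candidate))
      .sort();
}

/**
 * Hypothetical helper: copies the last match (highest folder name in string
 * order) into demoDir as weights.bin. Returns the new path, or null when no
 * model file was found.
 */
function copyWeights(baseDir, demoDir) {
  const matches = findWeights(baseDir);
  if (matches.length === 0) return null;
  const target = path.join(demoDir, 'weights.bin');
  fs.copyFileSync(matches[matches.length - 1], target);
  return target;
}
```

On Windows, a call along the lines of `copyWeights(path.join(process.env.LOCALAPPDATA, 'Google', 'Chrome SxS', 'User Data', 'OptGuideOnDeviceModel'), process.cwd())`, run from the demo directory, should place the file where `modelFileName` expects it. If several version folders exist, check by hand that the one picked is the one you want, since string order is not a real version comparison.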