Xenova committed on
Commit 64b7b9a
1 Parent(s): ef87b2a

Update README.md

Files changed (1): README.md +33 -33
README.md CHANGED
@@ -8,7 +8,7 @@ language:
  - hi
  - es
  - th
- library_name: transformers
+ library_name: transformers.js
  pipeline_tag: text-generation
  tags:
  - facebook
@@ -250,50 +250,50 @@ The Meta Llama 3.2 collection of multilingual large language models (LLMs) is a

  **Out of Scope:** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3.2 Community License. Use in languages beyond those explicitly referenced as supported in this model card.

- ## How to use
-
- This repository contains two versions of Llama-3.2-1B-Instruct, for use with `transformers` and with the original `llama` codebase.
-
- ### Use with transformers
-
- Starting with `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function.
-
- Make sure to update your transformers installation via `pip install --upgrade transformers`.
-
- ```python
- import torch
- from transformers import pipeline
-
- model_id = "meta-llama/Llama-3.2-1B-Instruct"
- pipe = pipeline(
-     "text-generation",
-     model=model_id,
-     torch_dtype=torch.bfloat16,
-     device_map="auto",
- )
- messages = [
-     {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
-     {"role": "user", "content": "Who are you?"},
- ]
- outputs = pipe(
-     messages,
-     max_new_tokens=256,
- )
- print(outputs[0]["generated_text"][-1])
- ```
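
The removed section mentions the Auto classes with `generate()` but never shows them; a minimal sketch of that route, assuming the same model id and chat-template handling as the `pipeline` example above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Same model id as the pipeline example above
model_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
# Render the chat template and move the inputs to the model's device
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```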
-
- Note: You can also find detailed recipes on how to use the model locally, with `torch.compile()`, assisted generation, quantization, and more at [`huggingface-llama-recipes`](https://github.com/huggingface/huggingface-llama-recipes).
-
- ### Use with `llama`
-
- Please follow the instructions in the [repository](https://github.com/meta-llama/llama).
-
- To download the original checkpoints, see the example command below leveraging `huggingface-cli`:
-
- ```
- huggingface-cli download meta-llama/Llama-3.2-1B-Instruct --include "original/*" --local-dir Llama-3.2-1B-Instruct
- ```
 
+ ## Usage (Transformers.js)
+
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ You can then generate text as follows:
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ // Create a text generation pipeline
+ const generator = await pipeline('text-generation', 'onnx-community/Llama-3.2-1B-Instruct-q4f16', {
+   device: 'webgpu', // <- Run on WebGPU
+ });
+
+ // Define the list of messages
+ const messages = [
+   { role: "system", content: "You are a helpful assistant." },
+   { role: "user", content: "What is the capital of France?" },
+ ];
+
+ // Generate a response
+ const output = await generator(messages, { max_new_tokens: 128 });
+ console.log(output[0].generated_text.at(-1).content);
+ ```
+
+ <details>
+ <summary>Example output</summary>
+
+ ```
+ The capital of France is Paris.
+ ```
+
+ </details>
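
The call above only resolves once the full response is ready; for token-by-token output, the library also exports a `TextStreamer`. A minimal sketch, reusing the `generator` pipeline and `messages` from the example above:

```js
import { TextStreamer } from '@huggingface/transformers';

// Print tokens to stdout as they are generated
// (assumes `generator` and `messages` from the example above)
const streamer = new TextStreamer(generator.tokenizer, {
  skip_prompt: true, // don't echo the prompt back
});
const output = await generator(messages, { max_new_tokens: 128, streamer });
```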
+
+ > [!NOTE]
+ > We also support loading the library from a CDN, so you can import it using:
+ >
+ > ```js
+ > import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';
+ > ```
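
For production pages you may want to pin a release, since the unversioned CDN URL tracks the latest publish; jsDelivr accepts an npm `package@version` suffix (the version below is illustrative):

```js
// Pin an explicit release via jsDelivr's package@version syntax
// (the version number is illustrative)
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0';
```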
  ## Hardware and Software

  **Training Factors:** We used custom training libraries, Meta's custom-built GPU cluster, and production infrastructure for pretraining. Fine-tuning, annotation, and evaluation were also performed on production infrastructure.