yunmorning committed on
Commit
01bd3fb
1 Parent(s): 55f5a6d

Update docker run command

Files changed (1)
  1. README.md +10 -20
README.md CHANGED
@@ -49,7 +49,6 @@ This model is compatible with **[Friendli Container](https://friendli.ai/product
  - Before you begin, make sure you have signed up for [Friendli Suite](https://suite.friendli.ai/). **You can use Friendli Containers free of charge for four weeks.**
  - Prepare a Personal Access Token following [this guide](#preparing-personal-access-token).
  - Prepare a Friendli Container Secret following [this guide](#preparing-container-secret).
- - Install Hugging Face CLI with `pip install -U "huggingface_hub[cli]"`

  ### Preparing Personal Access Token

@@ -88,25 +87,16 @@ You should pass the container secret as an environment variable to run the conta
  Once you've prepared the image of Friendli Container, you can launch it to create a serving endpoint.

  ```sh
- export MODEL_DIR=$PWD/FriendliAI--Llama-2-70b-chat-hf-fp8
- export FRIENDLI_CONTAINER_SECRET="YOUR CONTAINER SECRET"
- export FRIENDLI_CONTAINER_IMAGE="registry.friendli.ai/trial"
- export GPU_ENUMERATION='"device=0,1"'
-
- huggingface-cli download FriendliAI/Llama-2-70b-chat-hf-fp8 \
- --local-dir $MODEL_DIR \
- --local-dir-use-symlinks False
-
  docker run \
- --gpus $GPU_ENUMERATION --network=host --ipc=host \
- -v $MODEL_DIR:/model \
- -e FRIENDLI_CONTAINER_SECRET=$FRIENDLI_CONTAINER_SECRET \
- $FRIENDLI_CONTAINER_IMAGE /bin/bash -c \
- "/root/launcher \
- --web-server-port 6000 \
- --num-devices 2 \
- --ckpt-path /model \
- --ckpt-type hf_safetensors"
  ```

  ---
@@ -146,7 +136,7 @@ Meta developed and publicly released the Llama 2 family of large language models

  **License** A custom commercial license is available at: [https://ai.meta.com/resources/models-and-libraries/llama-downloads/](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)

- **Research Paper** ["Llama-2: Open Foundation and Fine-tuned Chat Models"](arxiv.org/abs/2307.09288)

  ## Intended Use
  **Intended Use Cases** Llama 2 is intended for commercial and research use in English. Tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.
 
  - Before you begin, make sure you have signed up for [Friendli Suite](https://suite.friendli.ai/). **You can use Friendli Containers free of charge for four weeks.**
  - Prepare a Personal Access Token following [this guide](#preparing-personal-access-token).
  - Prepare a Friendli Container Secret following [this guide](#preparing-container-secret).

  ### Preparing Personal Access Token

 
  Once you've prepared the image of Friendli Container, you can launch it to create a serving endpoint.

  ```sh
  docker run \
+ --gpus '"device=0,1"' \
+ -p 8000:8000 \
+ -v ~/.cache/huggingface:/root/.cache/huggingface \
+ -e FRIENDLI_CONTAINER_SECRET="YOUR CONTAINER SECRET" \
+ -e HF_TOKEN="YOUR HUGGING FACE TOKEN" \
+ registry.friendli.ai/trial \
+ --web-server-port 8000 \
+ --hf-model-name meta-llama/Llama-2-70b-chat-hf-fp8 \
+ --num-devices 2 # Use tensor parallelism degree 2
  ```

  ---
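
In the updated `docker run` command, two pairs of values have to agree: the published port (`-p 8000:8000`) with `--web-server-port 8000`, and the GPU list (`device=0,1`) with `--num-devices 2`. The following is a minimal sketch of one way to derive both pairs from two variables so they cannot drift apart. `PORT`, `GPU_IDS`, `NUM_DEVICES`, `docker_flags`, and `launcher_flags` are hypothetical names introduced here for illustration; they are not part of the README or of Friendli Container. The script only prints flag strings and does not invoke Docker.

```sh
#!/bin/sh
# Hypothetical helper variables (assumptions, not part of the README):
# set these two and derive the dependent flags from them.
PORT=8000
GPU_IDS="0,1"

# Tensor parallelism degree = number of comma-separated GPU ids.
NUM_DEVICES=$(printf '%s\n' "$GPU_IDS" | tr ',' '\n' | grep -c .)

# Flags consumed by `docker run` itself (they go before the image name).
docker_flags() {
  printf '%s\n' "--gpus '\"device=${GPU_IDS}\"' -p ${PORT}:${PORT}"
}

# Flags passed to the container (they go after the image name).
launcher_flags() {
  printf '%s\n' "--web-server-port ${PORT} --num-devices ${NUM_DEVICES}"
}

docker_flags    # prints: --gpus '"device=0,1"' -p 8000:8000
launcher_flags  # prints: --web-server-port 8000 --num-devices 2
```

The printed strings match the corresponding flags in the command above; changing `PORT` or `GPU_IDS` updates both members of each pair together.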
 

  **License** A custom commercial license is available at: [https://ai.meta.com/resources/models-and-libraries/llama-downloads/](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)

+ **Research Paper** ["Llama-2: Open Foundation and Fine-tuned Chat Models"](https://arxiv.org/abs/2307.09288)

  ## Intended Use
  **Intended Use Cases** Llama 2 is intended for commercial and research use in English. Tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.