Commit 20ff8de by mgoin
1 Parent(s): 0e2b154

Update README.md

Files changed (1): README.md (+1 -1)
@@ -17,7 +17,7 @@ Converted and quantized checkpoint of [nvidia/Nemotron-4-340B-Instruct](https://
 
 You can deploy this model with `vllm>=0.5.4` ([PR#6611](https://github.com/vllm-project/vllm/pull/6611)):
 ```
-vllm serve mgoin/Nemotron-4-340B-Instruct-hf --tensor-parallel-size 16
+vllm serve mgoin/Nemotron-4-340B-Instruct-hf-FP8 --tensor-parallel-size 8
 ```
 
 ### Evaluations