mgoin committed on
Commit
ba6bbe2
1 Parent(s): 9c83fef

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -17,7 +17,7 @@ Converted and quantized checkpoint of [nvidia/Nemotron-4-340B-Base](https://hugg
 
 You can deploy this model with `vllm>=0.5.4` ([PR#6611](https://github.com/vllm-project/vllm/pull/6611)):
 ```
-vllm serve mgoin/Nemotron-4-340B-Instruct-hf --tensor-parallel-size 16
+vllm serve mgoin/Nemotron-4-340B-Base-hf-FP8 --tensor-parallel-size 8
 ```
 
 ### Evaluations
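
Note: once the server from the updated `vllm serve` command is running, it can be queried over vLLM's OpenAI-compatible HTTP API. The sketch below is illustrative only and not part of this commit. It assumes the server is reachable at `localhost` on vLLM's default port 8000, and the helper names `build_completion_request` and `complete` are hypothetical. Because this is a base checkpoint, it uses the plain completions route rather than the chat route.

```python
import json
import urllib.request

# Assumptions (not stated in the README): the server runs locally on vLLM's
# default port 8000 and exposes the OpenAI-compatible /v1/completions route.
BASE_URL = "http://localhost:8000/v1"
MODEL = "mgoin/Nemotron-4-340B-Base-hf-FP8"  # model name from the updated command


def build_completion_request(prompt, max_tokens=32, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) a POST request for a plain text completion."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{base_url}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        return json.load(resp)["choices"][0]["text"]
```

With the server up, `complete("The capital of France is", max_tokens=8)` would return the generated continuation. Building the request separately from sending it keeps the payload easy to inspect without a running server.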