Commit 6a9156d by Pankaj Mathur
1 Parent(s): 7511d16

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -24,7 +24,7 @@ The training configurations are provided in the table below.
 
 Training runs on 8x A100 (80G) GPUs and takes around 15 hours, at a cost of $180 using [Lambda Labs](https://lambdalabs.com).
 
-We used DeepSpeed with Zero-3 approaches for parallel gpu training by writing our own fine tunning scripts plus leveraging some of the model training code provided by amazing [OpenAlpaca repo](https://github.com/yxuansu/OpenAlpaca)
+We used DeepSpeed with fully sharded data parallelism, also known as [ZeRO stage 3](https://engineering.fb.com/2021/07/15/open-source/fsdp/), by writing our own fine-tuning scripts and leveraging some of the model training code provided by the amazing [OpenAlpaca repo](https://github.com/yxuansu/OpenAlpaca).
 
 Here are some of the params used during training:
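
For readers unfamiliar with the setup the new line describes, below is a minimal sketch of what a DeepSpeed ZeRO stage 3 fine-tuning setup looks like. The actual training scripts are not part of this commit, so the model id, batch sizes, and learning rate here are illustrative assumptions, not the authors' values.

```python
# Minimal sketch of a DeepSpeed ZeRO stage 3 fine-tuning setup.
# Illustrative only: the repo's real scripts are not published in this commit,
# and all hyperparameters and the base model id below are assumptions.
import deepspeed
from transformers import AutoModelForCausalLM

ds_config = {
    "train_micro_batch_size_per_gpu": 4,   # per-GPU batch size (assumed)
    "gradient_accumulation_steps": 8,      # effective batch = 4 x 8 x 8 GPUs
    "bf16": {"enabled": True},             # A100s support bfloat16
    "optimizer": {
        "type": "AdamW",
        "params": {"lr": 2e-5},            # illustrative learning rate
    },
    "zero_optimization": {
        "stage": 3,                        # shard params, grads, and optimizer states
        "overlap_comm": True,              # overlap collectives with compute
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,  # consolidate shards on save
    },
}

# Placeholder base model; the commit does not name the exact checkpoint used.
model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")

# deepspeed.initialize wraps the model in a ZeRO-3 engine; launch with
# `deepspeed --num_gpus=8 train.py` to spawn one process per GPU.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

def train_step(batch):
    """One fine-tuning step: forward, backward, and sharded optimizer update."""
    outputs = engine(**batch)
    engine.backward(outputs.loss)
    engine.step()
```

Stage 3 shards parameters, gradients, and optimizer states across all GPUs (rather than replicating them as in plain data parallelism), which is what makes full fine-tuning feasible on the 8x 80 GB cards mentioned above.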