Text Generation
Transformers
PyTorch
mpt
Composer
MosaicML
llm-foundry
custom_code
text-generation-inference

AWS SageMaker deployment is failing

#52
by varuntejay - opened

I followed the code below to write an inference script for this model, but it throws a model load failure error. Can someone please let me know if there is a workaround for SageMaker deployment?

https://medium.com/@manoranjan.rajguru/deploy-mosaicml-mpt-7b-instruct-on-sagemaker-54730f88729b
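For context, the linked walkthrough wraps the model in a custom inference script. Since this repo ships custom modeling code (see the custom_code tag), loading it without trust_remote_code=True is one common cause of the worker dying at model load. Below is a minimal, hypothetical sketch of such a script, assuming the standard SageMaker HuggingFace inference toolkit hooks (model_fn/predict_fn); it is not the exact script from the article:

```python
# inference.py -- hypothetical sketch for a SageMaker HuggingFace endpoint.
# Assumes the inference toolkit calls model_fn() once at startup and
# predict_fn() per request; names and parameters here are illustrative.

def model_fn(model_dir):
    """Load the MPT model and tokenizer from the model directory."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # MPT uses the GPT-NeoX tokenizer per the model card.
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
    model = AutoModelForCausalLM.from_pretrained(
        model_dir,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,  # required: MPT ships custom modeling code
        device_map="auto",       # spread across the g5.12xlarge's 4 GPUs
    )
    model.eval()
    return model, tokenizer


def predict_fn(data, model_and_tokenizer):
    """Generate text for a request like {"inputs": "...", "parameters": {...}}."""
    import torch

    model, tokenizer = model_and_tokenizer
    params = data.get("parameters", {})
    inputs = tokenizer(data["inputs"], return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=params.get("max_new_tokens", 64),
        )
    return {"generated_text": tokenizer.decode(output[0], skip_special_tokens=True)}
```

If the worker still dies, the CloudWatch logs for the endpoint usually contain the underlying Python traceback, which is more informative than the generic "Backend worker process died" line.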

Instance used: ml.g5.12xlarge

Error message:
2023-06-19T12:38:05,607 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Backend worker process died.

@abhi-mosaic can you please help here? Thanks!

@sam-mosaic @abhi-mosaic Would appreciate your response, as we tried both the instruct and chat models and both give the same error. Thanks!

Hi @MohitHugging @varuntejay — I do not have experience with inference on SageMaker. The article you link to is interesting, but it was not put out by Mosaic. I will pass on any resources I find.

sam-mosaic changed discussion status to closed

Appreciate your response. Are there any links or steps you would suggest following?

Hi, I'm using the same instance to deploy the same model and getting the same error. Did you manage to get it working?
