fix integration with huggingface
under construction ⚙️
do not merge yet until I test it
PR related info
All good (I think). I don't have enough compute to test this since I'm on the free tier, so let me know if everything runs the way it's supposed to.
You can test this PR before merging with the following code:
pip install -qU "transformers>=4.39.1" flash_attn
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("Tensoic/Cerule-v0.1", trust_remote_code=True,
revision="refs/pr/2" # the revision parameter is only used to run the code from this pr
)
I also updated the readme to let people know how to use the model.
tips
When working with custom architectures, I recommend using Hugging Face's PyTorchModelHubMixin. I also made a basic template showing how to use it in this GitHub repo, which integrates it with pip.
If you have any more questions or feedback, or if you have any other custom models, do not hesitate to reach out.
Damn THANKS A LOT! will test it out asap
Hey @not-lain, why did you change the model type to phi-msft here?
https://huggingface.co/Tensoic/Cerule-v0.1/commit/4a02b161d5142cd92a2082aae885bc3cc9584aca
@adarshxs you are right, this PR is absolutely useless XD.
The only thing that was broken was my Colab environment.
All I had to do from the beginning was:
!pip install -qU "transformers>=4.39.1" flash_attn
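For anyone hitting the same thing, a quick check along these lines can tell you whether the environment is the problem before you touch the model code. This is only a rough sketch: `parse_version`, `meets_minimum`, and `check_transformers` are helper names I made up, and the parsing ignores pre-release suffixes such as `.dev0` or `rc1`.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple:
    """Turn '4.39.1' or '4.40.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            # Stop at the first non-numeric piece, e.g. 'dev0'.
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "4.39.1") -> bool:
    """Compare two version strings by their numeric parts only."""
    return parse_version(installed) >= parse_version(minimum)


def check_transformers(minimum: str = "4.39.1") -> bool:
    """Return True if transformers is installed and at least `minimum`."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return False
    return meets_minimum(installed, minimum)


if __name__ == "__main__":
    if check_transformers():
        print("transformers looks recent enough")
    else:
        print('run: pip install -qU "transformers>=4.39.1" flash_attn')
```

In Colab, remember to restart the runtime after upgrading so the new version is the one that actually gets imported.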
I'm closing this PR, but I'm keeping pr/3 open since the _name_or_path is essential for cases such as finetuning.