Resolve - 196 [rank0]: triton.runtime.autotuner.OutOfResources: out of resource: shared memory, Required: 180224, Hardware limit: 101376. Reducing block sizes or `num_stages` may help.
#33, opened by moidhassan
This PR sets `num_stages` to 1 in the code, as suggested in https://huggingface.co/microsoft/Phi-3-small-8k-instruct/discussions/15
The changes fix two different issues encountered while fine-tuning the Phi-3-small-128k-instruct model with LoRA.
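To make the size of the gap concrete, here is a minimal sketch that pulls the two numbers out of the error message in the title and compares them. The helper name `parse_shared_memory_error` is hypothetical and not part of Triton; the message text is copied from the title. As general background (not something measured here), Triton's software pipelining tends to keep more tiles resident in shared memory as `num_stages` grows, which is why lowering it is one of the remedies the error itself suggests.

```python
import re

# Error text copied from the PR title (trailing advice sentence omitted).
ERROR = (
    "triton.runtime.autotuner.OutOfResources: out of resource: shared memory, "
    "Required: 180224, Hardware limit: 101376."
)


def parse_shared_memory_error(message):
    """Hypothetical helper: return (required_bytes, limit_bytes) from the message."""
    match = re.search(r"Required: (\d+), Hardware limit: (\d+)", message)
    if match is None:
        raise ValueError("not a shared-memory OutOfResources message")
    return int(match.group(1)), int(match.group(2))


required, limit = parse_shared_memory_error(ERROR)
overshoot = required / limit  # how many times over the limit the kernel is

print(f"required = {required} B ({required // 1024} KiB)")
print(f"limit    = {limit} B ({limit // 1024} KiB)")
print(f"kernel requests {overshoot:.2f}x the available shared memory")
```

The kernel asks for 176 KiB against a 99 KiB limit, so its shared memory footprint has to shrink by a bit less than half before it can launch on this GPU.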