There is currently a bug where manually setting the sentence transformer's device to a non-zero GPU (e.g. `cuda:1`) triggers a "tensors are not on the same device" error. The fix is applied during the initialization of `attention_bias` and introduces no additional risk or inference-time overhead.
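Below is a minimal sketch of the failure mode and the kind of fix described above, assuming a custom attention module that builds an `attention_bias` tensor at construction time. The class and attribute names here are illustrative, not the exact identifiers from this repository.

```python
import torch
import torch.nn as nn


class CustomAttention(nn.Module):
    """Illustrative attention module with a precomputed attention bias."""

    def __init__(self, num_heads: int, max_len: int):
        super().__init__()
        # Before the fix: a plain tensor attribute stays on the default device,
        # so after `model.to("cuda:1")` it can mismatch the hidden states.
        # self.attention_bias = torch.zeros(1, num_heads, max_len, max_len)

        # After the fix: register the bias as a buffer so it follows the module
        # whenever the model is moved to another device.
        self.register_buffer(
            "attention_bias",
            torch.zeros(1, num_heads, max_len, max_len),
            persistent=False,
        )

    def forward(self, scores: torch.Tensor) -> torch.Tensor:
        # `scores` and `attention_bias` are now guaranteed to share a device.
        bias = self.attention_bias[..., : scores.size(-2), : scores.size(-1)]
        return scores + bias


# Usage: moving the model to a non-default GPU no longer raises a device error.
attn = CustomAttention(num_heads=8, max_len=512)
if torch.cuda.device_count() > 1:
    attn = attn.to("cuda:1")
    scores = torch.randn(1, 8, 16, 16, device="cuda:1")
    out = attn(scores)
```

Registering the bias as a buffer (rather than creating it lazily in `forward`) keeps the change confined to initialization, which is why it adds no inference-time overhead.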

infgrad changed pull request status to merged
