Both torch.float16 and torch.float32 have the same generation speed (~7 s per 20 steps on GPU)
torch.float16
torch.float32
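
One way to double-check a comparison like this is to time both dtypes with a warm-up call and explicit GPU synchronization, since CUDA kernels run asynchronously and one-time setup cost can hide the difference. This is only a minimal sketch, assuming PyTorch is installed: `time_fn` is a hypothetical helper, and the matrix multiply is a stand-in workload, not the actual generation pipeline.

```python
import time
import torch

def time_fn(fn, warmup=1, repeats=3):
    """Return the mean wall-clock seconds per call of fn (hypothetical helper)."""
    for _ in range(warmup):
        fn()  # warm-up call so one-time setup cost is not measured
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for queued GPU work before starting the clock
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for the timed GPU work to finish
    return (time.perf_counter() - start) / repeats

device = "cuda" if torch.cuda.is_available() else "cpu"
# float16 is only timed on GPU here; on CPU the sketch falls back to float32 alone.
dtypes = [torch.float32] + ([torch.float16] if device == "cuda" else [])

results = {}
for dtype in dtypes:
    # Stand-in workload (assumption): replace the lambda with the real 20-step generation call.
    x = torch.randn(512, 512, device=device, dtype=dtype)
    results[dtype] = time_fn(lambda: x @ x)
    print(dtype, f"{results[dtype]:.6f} s per call")
```

If the two dtypes still report the same time with the real workload in place of the lambda, it is worth confirming that the model weights and inputs were actually cast to the requested dtype.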