24 GB of Video Memory on a 4090 Graphics Card Is Not Enough

#2
by xldistance - opened

With max_position_embeddings set to 40000 and the cache mode set to FP8, the model takes up 70 GB of video memory, so I can't chat with it properly on a 4090 graphics card.

See how_i_run_34b_models_at_75k_context_on_24gb_fast

I am making exl2 quantizations right now.
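
For a rough sense of why a quantized version matters here, the back-of-the-envelope estimate below splits memory into weights and KV cache. It is only a sketch under stated assumptions: the parameter count (34B), the layer and head shape (60 layers, 8 KV heads, head dimension 128) and the 4.0 bits-per-weight figure are illustrative values not given in this thread, and activations and framework overhead are ignored.

```python
# Rough VRAM estimate: weights + KV cache. All model numbers below are
# ASSUMPTIONS for illustration (a 34B-class model with grouped-query attention);
# check the model's config.json for the real values.

GIB = 1024 ** 3


def weight_bytes(params: int, bits_per_weight: float) -> float:
    """Memory needed for the weights at a given precision."""
    return params * bits_per_weight / 8


def kv_cache_bytes(tokens: int, layers: int, kv_heads: int,
                   head_dim: int, bytes_per_elem: int) -> int:
    """Memory for keys and values (hence the factor 2) over the full context."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * tokens


PARAMS = 34_000_000_000                   # assumed parameter count
LAYERS, KV_HEADS, HEAD_DIM = 60, 8, 128   # assumed architecture
TOKENS = 40_000                           # context length from the post

cache_fp8 = kv_cache_bytes(TOKENS, LAYERS, KV_HEADS, HEAD_DIM, bytes_per_elem=1)
print(f"FP8 KV cache at {TOKENS} tokens: {cache_fp8 / GIB:.1f} GiB")

# 16 bits = unquantized FP16 weights; 4.0 is a hypothetical quantized setting.
for label, bpw in [("FP16 weights", 16), ("4.0 bpw weights", 4.0)]:
    total = weight_bytes(PARAMS, bpw) + cache_fp8
    print(f"{label}: about {total / GIB:.1f} GiB including the cache")
```

Under these assumptions the FP8 cache is only a few GiB, and the weights dominate: unquantized FP16 weights alone would be on the order of the 70 GB reported above, while a quantized copy could plausibly fit in 24 GB.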
