Quick Fix: Rope Scaling or Rope Type Error

#67
by deepaksiloka - opened

Hi Everyone,
Instead of using the default image URI, use 763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.3.0-tgi2.2.0-gpu-py310-cu121-ubuntu22.04-v2.0. If your deployment needs specific ECR access to use this image, you can pull it, create a new image based on it, and push that image to your private ECR repository. Then you can use the URI from your private repository directly.
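For anyone who wants the pull-and-push route spelled out, here is a minimal sketch that only builds the shell commands and does not run them. It assumes plain retagging is enough (no custom Dockerfile), that the AWS CLI and Docker are installed, and that the private repository already exists. The account ID, region, and repository name are placeholders, and `mirror_commands` is a hypothetical helper, not part of any AWS tooling.

```python
# Sketch: build the commands to copy the TGI image into a private ECR repo.
# Nothing is executed here; the function only returns command strings.

SOURCE_IMAGE = (
    "763104351884.dkr.ecr.us-west-2.amazonaws.com/"
    "huggingface-pytorch-tgi-inference:"
    "2.3.0-tgi2.2.0-gpu-py310-cu121-ubuntu22.04-v2.0"
)


def mirror_commands(source_image, account_id, region, repo_name):
    """Return (target_image, commands) for mirroring source_image.

    account_id, region and repo_name describe YOUR private ECR repository;
    the values used in the example below are placeholders.
    """
    source_registry = source_image.split("/", 1)[0]
    # Registry hosts look like <account>.dkr.ecr.<region>.amazonaws.com
    source_region = source_registry.split(".")[3]
    tag = source_image.rsplit(":", 1)[1]

    target_registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    target_image = f"{target_registry}/{repo_name}:{tag}"

    def login(registry, registry_region):
        # Standard ECR login: pipe a temporary password into docker login.
        return (
            f"aws ecr get-login-password --region {registry_region} | "
            f"docker login --username AWS --password-stdin {registry}"
        )

    commands = [
        login(source_registry, source_region),
        f"docker pull {source_image}",
        f"docker tag {source_image} {target_image}",
        login(target_registry, region),
        f"docker push {target_image}",
    ]
    return target_image, commands


if __name__ == "__main__":
    # Placeholder account ID, region and repository name.
    image, cmds = mirror_commands(
        SOURCE_IMAGE, "111122223333", "us-east-1", "tgi-mirror"
    )
    print("\n".join(cmds))
    print("Use this image URI:", image)
```

The returned `target_image` is the URI you would then use in place of the default one, for example as the `image_uri` argument if you are deploying through the SageMaker Python SDK.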

I hope this will solve your problem :)
Thanks

It didn't work for me :(

What error are you getting?

I used a bitsandbytes-quantized version that I fine-tuned (specifically the unsloth version). It said unknown quant method bitsandbytes, then said:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (4145x4096 and 1x12582912)
2024/07/31 08:43:36
2024-07-31T15:43:36.130106Z ERROR warmup{max_input_length=4095 max_prefill_tokens=4145 max_total_tokens=4096 max_batch_size=None}:warmup: text_generation_client: router/client/src/lib.rs:46: Server error: CANCELLED
2024/07/31 08:43:36
Error: WebServer(Warmup(Generation("CANCELLED")))
2024/07/31 08:43:36
2024-07-31T15:43:36.300019Z ERROR text_generation_launcher: Webserver Crashed
2024/07/31 08:43:36
2024-07-31T15:43:36.300043Z  INFO text_generation_launcher: Shutting down shards
2024/07/31 08:43:36
2024-07-31T15:43:36.372684Z  INFO shard-manager: text_generation_launcher: Terminating shard rank=0
2024/07/31 08:43:36
2024-07-31T15:43:36.372816Z  INFO shard-manager: text_generation_launcher: Waiting for shard to gracefully shutdown rank=0
2024/07/31 08:43:36
2024-07-31T15:43:36.473032Z  INFO shard-manager: text_generation_launcher: shard terminated rank=0
2024/07/31 08:43:36
Error: WebserverFailed
and some others. Thoughts?

Now it's saying:
2024/07/31 10:56:08
2024-07-31T17:56:08.516454Z  INFO text_generation_router::server: router/src/server.rs:1599: Using scheduler V3
2024/07/31 10:56:08
2024-07-31T17:56:08.516472Z  INFO text_generation_router::server: router/src/server.rs:1651: Setting max batch total tokens to 442032
2024/07/31 10:56:08
2024-07-31T17:56:08.570658Z  INFO text_generation_router::server: router/src/server.rs:1889: Connected

but it never finishes initializing.

Is it a port issue?
