gradio
pinecone-client
transformers
torch
sentence-transformers==2.2.2
pinecone-text
accelerate
optimum
auto-gptq
llama-cpp-python