notdiamond-4k-0001
notdiamond-4k-0001 supports a 4096-token input sequence length. It is an extension of notdiamond-0001, which originally supported a sequence length of 512.
LSG (Local Sparse Global) attention is used to adapt the existing pre-trained model so it extrapolates efficiently to a 4096-token sequence length with no additional training.
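The sketch below shows how such a conversion is typically done. It assumes the open-source `lsg-converter` package (`pip install lsg-converter`); the package, its arguments, and the `architecture` string are assumptions based on public LSG conversion tooling, not details stated in this card.

```python
# Minimal sketch of converting the 512-token base checkpoint to LSG attention.
# Assumption: the `lsg-converter` package is used; argument values are illustrative.
from lsg_converter import LSGConverter

# Target the extended 4096-token context.
converter = LSGConverter(max_sequence_length=4096)

# Convert the base router checkpoint. The architecture string is an assumption;
# it should match the base model's actual sequence-classification class so the
# classifier head is preserved.
model, tokenizer = converter.convert_from_pretrained(
    "notdiamond/notdiamond-0001",
    architecture="DebertaV2ForSequenceClassification",  # hypothetical value
)

model.save_pretrained("notdiamond-4k-0001")
tokenizer.save_pretrained("notdiamond-4k-0001")
```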
notdiamond-0001 automatically determines whether to send queries to GPT-3.5 or GPT-4, depending on which model is best suited to your task. notdiamond-0001 was trained on hundreds of thousands of data points from robust, cross-domain evaluation benchmarks. The notdiamond-0001 router model is a classifier and will return a label for either GPT-3.5 or GPT-4.
Inference:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Format the query for the router.
query = "Can you write a function that counts from 1 to 10?"
formatted_prompt = f"""Determine whether the following query should be sent to GPT-3.5 or GPT-4.
Query:
{query}"""

tokenizer = AutoTokenizer.from_pretrained("notdiamond/notdiamond-0001")
model = AutoModelForSequenceClassification.from_pretrained("notdiamond/notdiamond-0001")

inputs = tokenizer(formatted_prompt, truncation=True, max_length=4096, return_tensors="pt")
logits = model(**inputs).logits

# The router is a binary classifier: label 0 -> GPT-3.5, label 1 -> GPT-4.
model_id = logits.argmax().item()
id2label = {0: 'gpt-3.5', 1: 'gpt-4'}
model_to_call = id2label[model_id]
```
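Once the router has picked a label, the query can be forwarded to the chosen model. The snippet below is a hypothetical follow-up: it assumes the OpenAI Python client (`openai >= 1.0`) with an `OPENAI_API_KEY` in the environment, and the mapping from router labels to API model names is an assumption.

```python
# Hypothetical follow-up: send the original query to the model the router chose.
# `query` and `model_to_call` come from the snippet above.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Assumption: map the router's labels to OpenAI API model names.
label_to_api_model = {"gpt-3.5": "gpt-3.5-turbo", "gpt-4": "gpt-4"}

response = client.chat.completions.create(
    model=label_to_api_model[model_to_call],
    messages=[{"role": "user", "content": query}],
)
print(response.choices[0].message.content)
```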
You can also access Not Diamond's free API; see the official website and documentation for details.
Model tree for arshadshk/notdiamond-4k-0001
- Base model: notdiamond/notdiamond-0001