Ava small

Training Details

The fine-tuning process for this model involved several key parameters and settings:

Base Model: GPT-2
Dataset: Open Assistant's oasst1 dataset
Learning Rate: 1e-3
Epochs: 10
Hardware: GPU P100

The model was trained on a P100 GPU to speed up training through parallel processing. The learning rate was set to 1e-3 to balance fast convergence against the risk of overshooting.
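The trade-off between convergence speed and overshooting can be illustrated with a toy example. The sketch below is not the training loop used for this model. It runs plain gradient descent on f(x) = x**2, and the function name `gradient_descent` and the learning rates shown are assumptions chosen only for this toy problem; they say nothing about which value suits GPT-2.

```python
def gradient_descent(learning_rate: float, steps: int = 50, start: float = 1.0) -> float:
    """Minimize f(x) = x**2 with plain gradient descent and return the final x."""
    x = start
    for _ in range(steps):
        gradient = 2.0 * x  # derivative of x**2
        x -= learning_rate * gradient
    return x


# A small rate converges slowly, a moderate rate converges quickly,
# and a rate that is too large overshoots the minimum and diverges.
for lr in (0.01, 0.1, 1.1):
    print(f"lr={lr}: final x = {gradient_descent(lr):.3g}")
```

In this toy, each step multiplies x by (1 - 2 * learning_rate), so the iterate shrinks toward 0 only while that factor stays below 1 in absolute value.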

Model Performance

After 10 epochs of training, the model showed improved ability to generate coherent and contextually relevant responses in conversations. Its responses may still contain occasional inaccuracies or inconsistencies.

Custom Tokens and Contextualization

To facilitate structured conversations and improve response generation, the following custom tokens were added:

<startoftext>: Marks the beginning of a conversation prompt.
<endoftext>: Marks the end of a conversation prompt.
<ava>: Denotes the beginning of responses generated by the AI assistant.
</ava>: Denotes the end of AI-generated responses.
<user>: Denotes the beginning of user input in the conversation.
</user>: Denotes the end of user input.

Here is an example prompt:

<startoftext><user>Hello</user><ava>Hello there, how can I assist you today?</ava></endoftext>
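The token layout above can be produced and parsed with a few string helpers. This is a minimal sketch, not part of the released model code: the names `build_prompt`, `format_example`, and `extract_response` are hypothetical. The closing token defaults to `</endoftext>` as written in the example line; because the token list spells it `<endoftext>`, it is left as a parameter.

```python
def build_prompt(user_text: str) -> str:
    """Wrap user text in the conversation tokens, leaving <ava> open for the model."""
    return f"<startoftext><user>{user_text}</user><ava>"


def format_example(user_text: str, ava_text: str, end_token: str = "</endoftext>") -> str:
    """Build a complete single-turn conversation string (end_token is an assumption)."""
    return f"{build_prompt(user_text)}{ava_text}</ava>{end_token}"


def extract_response(decoded: str) -> str:
    """Return the text between the last <ava> and the following </ava>.

    Falls back to the whole string when the tags are missing.
    """
    after_open = decoded.rsplit("<ava>", 1)[-1]
    return after_open.split("</ava>", 1)[0].strip()


print(build_prompt("Hello"))
print(extract_response(format_example("Hello", "Hello there, how can I assist you today?")))
```

Splitting on the last `<ava>` keeps the helper from raising an error when the decoded text contains no tags at all.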

Use Cases and Applications

Given its training on dialogues and conversations, this fine-tuned model is particularly well-suited for the following use cases:

Dynamic and engaging conversations with users in chatbots or virtual assistants.
Providing personalized information and assistance across diverse domains.
Generating contextually relevant and creative responses to user inputs.
Enhancing the user experience and interaction quality.

Inference script

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Run on the GPU when one is available.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def inference(text, model, tokenizer):
    # Wrap the user text in the conversation tokens and leave <ava> open
    # so the model completes the assistant's reply.
    prompt = f'<startoftext><user>{text}</user><ava>'
    input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)

    output = model.generate(
        input_ids=input_ids,
        do_sample=True,  # needed for temperature, top_k and top_p to take effect
        temperature=0.8,
        max_length=100,
        top_k=50,
        top_p=0.95,
        repetition_penalty=1.2,
        num_return_sequences=1,
    )

    # Keep the custom tokens in the decoded text so the reply can be located.
    decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
    # Take the text after the last <ava> and before the next </ava>.
    ava_response = decoded_output.split('<ava>')[-1].split('</ava>')[0]
    # Return only the first sentence of the reply.
    clean_response = ava_response.split('.')[0].strip()

    return clean_response

model_name = 'Kuduxaaa/ava-small'
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model.to(device)

user_input = "What's the weather like today?"
response = inference(user_input, model, tokenizer)

print('Ava: ', response)
