
Model Card for dfe-l3.1-writing-strats

This is a quick experiment in building a model that uses a more "pedagogically grounded" rhetoric when helping students brainstorm.

It was developed in about a day as a proof of concept.

Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

  • Developed by: Ryan Tannenbaum and For.Education
  • Model type: Llama 3.1 8B
  • Language(s) (NLP): English
  • License: MIT
  • Finetuned from model: Llama 3.1 8B

Model Sources

  • Demo: Coming soon

Uses

The model uses the following conversation format:

```
### USER: <What the user says>

### ASSISTANT: <The bot response>

...

### TERMINATE
```

The model is trained to emit "### TERMINATE" when it reaches the end of a conversation, which can be used as a stop signal to end the session.
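A minimal inference sketch, assuming the model is loaded with 🤗 Transformers and that generation is cut off at the "### TERMINATE" marker (the prompt text and generation settings below are illustrative, not part of the original card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ryandt/dfe-l3.1-writing-strats"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a prompt in the format the model was trained on.
prompt = (
    "### USER: Can you help me brainstorm a thesis for my essay "
    "on urban green space?\n\n### ASSISTANT:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the prompt.
text = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

# Treat "### TERMINATE" as an end-of-session marker and trim it off.
reply = text.split("### TERMINATE")[0].strip()
print(reply)
```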

Bias, Risks, and Limitations

This model was fine-tuned on an extremely small dataset to tackle a very specific use case. It is a proof of concept.

Training Data

Essay Dataset

Training Procedure

Trained locally with AutoTrain on a single RTX 4090.

Training Hyperparameters

  • Epochs: 5
  • Learning rate: 2e-5
  • Train batch size: 2
  • Mixed precision: fp16
  • Quantization: int8
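The card does not include the full AutoTrain configuration, so the following is only a sketch of an equivalent fine-tuning run using 🤗 Transformers and PEFT with the hyperparameters above; the base checkpoint id, dataset file, text field, and use of LoRA adapters are assumptions:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "meta-llama/Llama-3.1-8B"  # assumed base checkpoint

tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # Quantization: int8
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(task_type="CAUSAL_LM"))  # LoRA is an assumption

# Assumed: a JSONL file with a "text" field holding full formatted dialogues.
dataset = load_dataset("json", data_files="essays.jsonl", split="train")

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=2048)

dataset = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="dfe-l3.1-writing-strats",
        num_train_epochs=5,             # Epochs: 5
        learning_rate=2e-5,             # Learning rate: 2e-5
        per_device_train_batch_size=2,  # Train batch size: 2
        fp16=True,                      # Mixed precision: fp16
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```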

Model Card Authors

Ryan Tannenbaum (ryandt)

Model Card Contact

Ryan Tannenbaum
