Deberta-FineWebEdu
This model is a fine-tuned version of microsoft/deberta-v3-xsmall on the FineWebSentences dataset. It achieves the following results on the evaluation set:
- Loss: 3.4314
- Accuracy: 0.4905
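For intuition, the reported loss can be converted to a perplexity. This is a minimal sketch, assuming the loss is a mean cross-entropy measured in nats (the usual convention for PyTorch language-modeling losses). The card itself does not state the training objective or report a perplexity, so treat the result as a rough reading aid only.

```python
import math

# Evaluation loss reported above; ASSUMED to be mean cross-entropy in nats.
eval_loss = 3.4314

# Under that assumption, perplexity is the exponential of the loss.
perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.2f}")
```

Under this assumption the loss corresponds to a perplexity of roughly 31.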
Model description
Fine-tuned on sentences extracted from randomly chosen entries of the HuggingFaceFW/fineweb-edu dataset.
Intended uses & limitations
Intended as a starting point for further fine-tuning on downstream tasks involving English sentences.
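A possible way to load the checkpoint for further fine-tuning is sketched below. It assumes the repository id agentlans/deberta-finewebedu shown in the model tree on this page, and it uses sequence classification purely as an illustrative downstream task; the helper name `load_for_classification` and the `num_labels` default are hypothetical and not part of the original card.

```python
# Repository id as listed in the model tree on this page.
MODEL_ID = "agentlans/deberta-finewebedu"


def load_for_classification(num_labels: int = 2):
    """Load the tokenizer and the checkpoint with a new classification head.

    Hypothetical helper for illustration. The classification head is randomly
    initialized, so the returned model must be fine-tuned before use.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_ID, num_labels=num_labels
    )
    return tokenizer, model
```

Calling `load_for_classification(num_labels=3)` would download the weights from the Hub and return a tokenizer and model ready to pass to a training loop or to `Trainer`.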
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
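The settings above could be expressed as `TrainingArguments` roughly as follows. This is a sketch under stated assumptions, not the author's training script: it assumes the reported batch sizes are per-device values, leaves every unlisted option (warmup, weight decay, optimizer implementation) at the library default, and uses a hypothetical helper name and output directory.

```python
# Hyperparameters from the list above, mapped to TrainingArguments keywords.
HPARAMS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,  # ASSUMED per-device (card: train_batch_size)
    "per_device_eval_batch_size": 8,   # ASSUMED per-device (card: eval_batch_size)
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3.0,
}


def build_training_args(output_dir: str = "deberta-finewebedu-out"):
    """Build TrainingArguments from HPARAMS (hypothetical helper)."""
    # Imported lazily so HPARAMS can be inspected without transformers installed.
    from transformers import TrainingArguments

    # Note: the Trainer's default optimizer is AdamW; the card says "Adam".
    return TrainingArguments(output_dir=output_dir, **HPARAMS)
```

The resulting object can be passed to `Trainer(args=build_training_args(), ...)` together with a model and datasets.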
Training results
The evaluation and training losses were similar, which suggests the model did not overfit.
Framework versions
- Transformers 4.39.3
- Pytorch 2.3.0+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
Model tree for agentlans/deberta-finewebedu
Base model
microsoft/deberta-v3-xsmall